Congestion Prediction in Internet of Things Network using Temporal Convolutional Network: A Centralized Approach
Vinesh Kumar Jain#, Arka Prokash Mazumdar$ and Mahesh Chandra Govil%
# Engineering College Ajmer, Ajmer, Rajasthan - 305 025, India
$ Malaviya National Institute of Technology, Jaipur, Rajasthan - 302 017, India
% National Institute of Technology, Sikkim - 737 139, India
E-mail: vineshjain@ecajmer.ac.in
Abstract
The unprecedented ballooning of network traffic, specifically Internet of Things (IoT) network traffic, places considerable congestion stress on today's Internet. Non-recurring surges in network traffic flow may cause temporary disruptions, such as packet drops, poor quality of service, and delay. Hence, network traffic flow estimation is important in IoT networks to predict congestion. As the data in IoT networks is collected from a large number of diversified devices with differing data formats and complex correlations, the generated data is heterogeneous and nonlinear. Conventional machine learning approaches are unable to deal with nonlinear datasets and suffer from the misclassification of real network traffic due to overfitting. Therefore, it also becomes hard for conventional machine learning tools, like shallow neural networks, to predict congestion accurately. The accuracy of congestion prediction algorithms plays an important role in controlling congestion by regulating the send rate of the source.
Various deep learning methods, such as LSTM, CNN, and GRU, are considered in designing network traffic flow
predictors, which have shown promising results. This work proposes a novel congestion predictor for IoT that uses
TCN. Furthermore, we use the Taguchi method to optimize the TCN model which reduces the number of experiment
runs. We compare TCN with the other four deep learning-based models concerning Mean Absolute Error (MAE) and
Mean Relative Error (MRE). The experimental results show that the TCN-based deep learning framework achieves
improved performance with 95.52% accuracy in predicting network congestion. Further, we design the Home IoT
network testbed to capture the real network traffic flows as no standard dataset is available.
Keywords: IoT; TCN; Congestion; Taguchi; Prediction; Classification
Received : 31 August 2021, Revised : 26 October 2022
Accepted : 14 November 2022, Online published : 6 December 2022
1. Introduction
The use of Internet-connected smart devices, otherwise
called the Internet of Things (IoT), is increasing rapidly
in our homes, offices, smart cities, and many other places.
These IoT devices are resource constrained, and a continuous
rise in the volume of network traffic may lead to network
congestion.1
A high traffic volume in IoT networks leads to
extra communication delay, low network throughput, and
waste of computing resources and bandwidth.2
These reasons
ultimately become the cause of congestion. Diagnosing
network congestion and building a framework that can classify
and predict it is one of the most important issues3
as it affects
the decision of the routing path process and optimal bandwidth
utilization. Therefore, for effective communication in the
network, it is essential to classify and predict congestion and
take appropriate preventive measures accordingly. Hence,
research for the classification and prediction of congestion
in the network, especially for IoT networks, is extremely
important.
Typically, state-of-the-art congestion prediction in IoT
networks relies on conventional techniques4–6
as used in
Wireless Sensor Networks (WSN). These research works are
focused only on the computing revolution, including the usage
of high-end computers and advanced algorithms for managing
traffic and performing flow prediction. Few efforts have been
made to forecast short-term traffic flow, including simulation
techniques,7
mathematical models, statistical approaches, and
regression techniques.
The data created by IoT networks are heterogeneous since
it is gathered from numerous diverse devices with different
data formats and complex correlations. As a result, it also
becomes very challenging for traditional machine learning
technologies, such as shallow neural networks, to effectively
forecast congestion. Machine learning methods attain poor
performance when dealing with high dimensional state spaces
in congestion control problems, and their performance does
not improve with increasing data. In particular, it is expensive
to measure the traffic volume when there is high traffic on the
IoT network.
The traditional feature selection techniques of machine
learning mainly consider the port and payload as features.
Reusability and unfixed assignment of port numbers make it
less effective to choose a port as a dominant feature in traffic
classification. In contrast, payload as a feature fails due
to encrypted traffic packet inspection. The problem in IoT
networks, like congestion prediction, is an adversarial problem
due to random burst traffic and heterogeneity in devices and
protocols. In such cases, machine learning techniques that
learn from historical data can make the wrong prediction.
Recently, statistical feature-based machine learning techniques
have received attention, particularly in the field of wired and
wireless networking.3,8
However, treating machine learning as an infallible approach and applying it directly to real-time IoT network traffic is unwise. In contrast, deep learning is a centralized approach that takes raw information as direct input and automatically performs feature extraction.
Deep learning eliminates the need for experts in a specific
area by utilizing hierarchical feature extraction, which makes
information distillation efficient, provides more abstract
correlations in the data, and reduces the pre-processing effort.
Deep learning algorithms can proficiently comprehend the
features of a large amount of data. Parallel computation using
graphics processing units (GPUs) allows deep learning to
draw inferences in milliseconds. This allows for high-accuracy
and timely network analysis and control, eliminating the run-
time constraints of conventional mathematical approaches
(e.g., convex optimization, game theory). The prediction of
congestion in IoT networks has a temporal dependency. The
temporal dependency means predicting congestion or a high
traffic volume at a specific time instance is done based on
past observations. Real-time prediction of network congestion
requires continuous feeding and learning. The number of
time slots may increase rapidly with time, which results in
high computational complexity and also affects prediction
accuracy. Recurrent Neural networks (RNN) and Long Short
Term Memory (LSTM) are the most common deep learning
methods to analyze time-series data. However, RNN models
cannot directly calculate the long sequence of time points.
In contrast, LSTM can process millions of time points, but it
works sequentially, making it inefficient in generating timely
responses. On the contrary, Temporal Convolutional Network
(TCN), a deep learning-based model, has a parallel processing
capability and training and forecasting time lower than
LSTM.9
Therefore, in this work, we use the TCN model to
predict congestion accurately.
Our proposal also leverages the dropout technique of deep
learning to cope with the overlearning problem. The proposed
approach works for the real IoT network traffic for congestion
prediction. In addition to state-of-the-art congestion control
techniques in IoT networks, our proposal is a centralized
approach toward congestion control that reduces the burden of
end devices on congestion prediction.
In summary, the following are the main contributions of
our proposed approach.
• We created a Home IoT network testbed and gathered the
network traffic flows of various IoT applications because,
as far as we are aware, there is no such labeled dataset of
IoT network traffic available.
• The traditional approaches use a brute-force method to tune the hyperparameters and structure of deep learning models. Brute-force approaches take a long time to train the model and are inefficient for achieving high prediction accuracy. The proposed approach uses the design of experiments (DoE) based Taguchi method as a novel way to obtain the optimal structure of TCN for IoT network traffic prediction. Compared with brute-force tuning and training, the Taguchi method decreases the number of trials required to tune the hyperparameters and models, which improves the accuracy of the forecasting results.
• IoT networks have heterogeneous devices and irregular traffic flow. In such cases, the trained model may overfit and fail to predict as accurately as desired. We use the concept of regularization by applying a dropout layer to avoid overfitting.
• We use a deep learning-based imputation technique to
deal with missing values generated during data collection.
This improves overall accuracy by converting incomplete
data to complete data.
• To the best of our knowledge, TCN has not previously
been applied to congestion prediction and detection
problems. Consequently, the work discussed in this paper
is original by its very nature.
The outline of the paper is as follows. Section 2 presents state-of-the-art work in the field of IoT network traffic classification and prediction. Section 3 formulates the IoT network traffic problem and emphasizes the necessity of deep networks for IoT network traffic classification and prediction, along with the proposed methodology. Section 4 presents the structure of TCN and hyper-parameter tuning by the Taguchi method. In Section 5, empirical results of the deep learning models regarding prediction accuracy and the role of activation functions in enhancing real-time prediction accuracy are presented. The conclusion and future scope of this work are presented in Section 6.
2. RELATED WORK
Most of the state-of-the-art in the area of IoT networks is related to IoT congestion control,7,10 classification of network attacks based on traffic,11 IoT device classification,12 and traffic classification13 incorporating machine learning and deep learning. In this section, we introduce and analyze the state-of-the-art.
Heterogeneous vehicular networks may experience
congestion due to increased resource demands for data
transmission as a result of the rise in intelligent vehicle adoption
and the popularity of various safety and comfort applications.
Safety applications, including emergency braking systems,
traffic danger alerts, and collision avoidance signals, require
low latency and high bandwidth. Unfortunately, congestion often reduces network performance and customer satisfaction by degrading the quality of service (QoS). Reduced QoS is
also a result of inadequate mobility models, routing protocols,
and communication mediums. Falahatraftar, et al.14
proposed
a congestion prediction and avoidance approach named
Intelligent Congestion Avoidance Mechanism (ICAM) based
on the Generalized Regression Neural Network (GRNN). The
authors compared the proposed GRNN congestion prediction
model with other widely used models, including Multiple
Linear Regression (MLR), Support Vector Machine (SVM) for
Regression, Decision Tree Regression (DTR), and Multi-layer
Perceptron for Regression (MLPR). The proposed GRNN
congestion prediction model performs than other models in
terms of accuracy, dependability, and stability, according to
numerical data. In addition, simulation findings demonstrate
a significant network performance gain in terms of packet
delivery ratio, average latency, and packet loss ratio when
compared to conventional congestion control strategies.
Demir, et al.15 proposed mlCoCoA, a machine learning-based enhancement of CoCoA,16 the advanced variant of the Constrained Application Protocol (CoAP). CoCoA controls congestion by estimating the Retransmission Timeout (RTO) values, as shown in Eqn. (1). RTT is the round trip time, and RTTVAR gives the difference between the successive RTT and current RTT measurements. Here, x is a variable that denotes the strong or weak RTT. Strong RTT is calculated when an acknowledgment is received without any retransmission. Weak RTT is calculated when an acknowledgment is received after a number of retransmissions. Here α, β, λ, and K are constants, and their values are defined statically.
$$\begin{aligned}
\mathrm{RTTVAR}_x &= (1-\beta)\,\mathrm{RTTVAR}_x + \beta\,\left|\mathrm{RTT}_x - \mathrm{RTT}_{x,\mathrm{new}}\right| \\
\mathrm{RTT}_x &= (1-\alpha)\,\mathrm{RTT}_x + \alpha\,\mathrm{RTT}_{x,\mathrm{new}} \\
\mathrm{RTO}_x &= \mathrm{SRTT}_x + K\,\mathrm{RTTVAR}_x \\
\mathrm{RTO}_{\mathrm{overall}} &= \lambda\,\mathrm{RTO}_x + (1-\lambda)\,\mathrm{RTO}_{\mathrm{overall}}
\end{aligned} \qquad (1)$$
The approach mlCoCoA claims that the dynamic prediction of the values of these constants using machine learning can increase the network throughput of the IoT network. The authors performed a set of experiments by creating the CoCoA network for collecting the ground truth α, β, λ, and K. The SVM predicts the values of these constants. The precise estimation of RTO provided by the accurate prediction of these values reduces unnecessary retransmission.
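For concreteness, the RTO bookkeeping of Eqn. (1) can be sketched in a few lines of Python. The class below only illustrates the update rules; the constant values are placeholders rather than the values standardized for CoCoA, and mlCoCoA's contribution is precisely that an SVM predicts suitable constants instead.

```python
# Minimal sketch of the CoCoA-style RTO update in Eqn. (1).
# alpha, beta, lam, k are illustrative placeholders, not CoCoA's standard values.
class RtoEstimator:
    def __init__(self, alpha=0.25, beta=0.125, lam=0.5, k=4.0, init_rto=2.0):
        self.alpha, self.beta, self.lam, self.k = alpha, beta, lam, k
        self.srtt = None           # smoothed RTT for this estimator (strong or weak)
        self.rttvar = None         # RTT variation estimate
        self.rto_overall = init_rto

    def update(self, rtt_new):
        """Feed one RTT sample (strong or weak) and return the blended RTO."""
        if self.srtt is None:      # first sample initialises the state
            self.srtt, self.rttvar = rtt_new, rtt_new / 2.0
        else:
            self.rttvar = (1 - self.beta) * self.rttvar + self.beta * abs(self.srtt - rtt_new)
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt_new
        rto_x = self.srtt + self.k * self.rttvar
        self.rto_overall = self.lam * rto_x + (1 - self.lam) * self.rto_overall
        return self.rto_overall
```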
Sander,10 et al. proposed DeePCCI to identify congestion based on packet arrival time (a passive quantity) in traffic flow. Being based on passive knowledge only, it also works on encrypted transport headers. The DeePCCI architecture is a combination of CNN and LSTM, which are parts of a Deep Neural Network (DNN). The only feature taken is packet arrival time, whose histogram bins of size 1 ms are generated, fed into the DNN, and extracted through a 2D VGGNet-13.18
The authors used the single host, multiple hosts, and mininet-
based network testbeds by using the standard Linux 4.18
Kernel to generate the network traffic. The authors evaluated
the performance on various vantage points after and before the
bottleneck giving the final accuracy of the chosen network in
BBR, CUBIC, RENO, CUBIC-p, and RENO-p. The authors19 analyzed the performance of DeePCCI and showed that it has limited performance because the training is done using testbed-generated data that lacks the Internet's inherent noise. Further, they observed that DeePCCI needs to be re-trained for different kernels to capture behavioral differences appropriately.
Xiao,7 et al. proposed TCP-Drinc for smartly controlling congestion for the TCP variants. The authors use Deep Reinforcement Learning,20 which learns from experience. Congestion is controlled by adjusting the window size. The authors used a deep convolutional neural network and extracted stable features from abundant but noisy data measurements.
TCP-Drinc is compared with the various versions of TCP, i.e.,
TCP-New Reno,21
TCPCubic,22
TCP-Hybla,23
TCP-Vegas,24
and TCP-Illinois.25
TCP-Drinc yields maximum throughput
and the second lowest Round Trip Time for the entire period
of propagation delay. TCP-Drinc has a higher round trip time
than TCP-Vegas.
Najm,17 et al. proposed a novel machine learning-based model built upon a decision tree algorithm for predicting congestion in 5G IoT networks. The authors determine the optimal congestion window using specific network conditions, including high throughput, high congestion window, large queue size, and low loss. The authors applied C4.5, RepTree, and Random Tree-based decision tree approaches along with clustering and stacking approaches. The results show that the C4.5 decision tree algorithm has the best performance compared to other machine learning algorithms. The authors present a
tree-based graph that suggests the optimal path through which
congestion can be controlled. The authors compared their
model with the original Stream Control Transmission Protocol
(SCTP) and claimed a 14.5 % improvement in performance.
Doshi,11 et al. worked on detecting distributed denial of service (DDoS) attacks using well-known machine learning algorithms in the IoT network. A large number of botnets, like Mirai, leveraged vulnerable IoT devices for performing DDoS attacks in IoT networks. The network includes IoT devices such as home gateway routers and other middleboxes for keeping track of traffic flow between IoT devices on the local network and the rest of the Internet. Binary classification is applied to separate the traffic, labeling DDoS attack and normal traffic as 1 and 0, respectively. Data is collected from devices, such as home cameras
and Bluetooth devices connected to blood pressure monitors,
all connected to the router via Raspberry Pi. Non-IoT traffic is
filtered out from the pcap files.
Sivanathan,12
et al. proposed a robust approach for
classifying IoT devices using traffic characteristics that are
acquired at the network level. A setup with 28 IoT devices and
a few non-IoT devices was created to validate the proposed
approach. OpenWrt was used to monitor the traffic flow for 26
weeks.26
Each of the devices showed some pattern for different
network characteristics. Among all the features, nine features
were selected to perform the classification: a bag of port numbers, a bag of domain names, a bag of cipher suites, flow
volume, DNS interval, flow duration, sleep time, flow rate,
and NTP interval. All the data with these features were fed
into a two-stage machine learning model, which predicted the
confidence for a particular type of IoT device. The first stage,
called Stage 0, implements the Naive Bayes Multinomial
Classifier independently for the features, viz., the bag of port numbers, bag of domain names, and bag of cipher suites. Each
of these features returned moderate accuracy for different
devices. Then the output of all these three classifiers is fed into
the second stage, i.e., Stage 1. In Stage 1, the outputs from
the Naive Bayes Multinomial Classifier and the remaining
six features were fed into the Random Forest Classifier,
which provided the required confidence for each input. This
confidence is then mapped to the IoT device’s classes, and the
authors obtain the desired classification of IoT devices.
Lopez, et al.13
proposed a network traffic classifier (NTC)
to predict the network services among different communication
protocols. The proposed network traffic classifier considers
the flow-based features, e.g., source and destination ports and
bytes transmitted per packet. The authors used the different
deep learning models and their combinations (RNN, CNN,
and CNN+RNN) to infer the network services. NTC uses the
RedIRIS dataset consisting of 266,160 network flows from
108 services to validate the approach. The results show that the
combination of CNN+RNN has the best accuracy in terms of
F1 score, precision, and recall.
Table 1 summarises each of the papers discussed in this
section considering category, classification methods used,
selected features, traffic (simulated or real), and year.
Table 1. State-of-the-art comparison

| Paper | Category | DL Method | Features | Traffic | Year |
| Falahatraftar [14] | Congestion control | Generalized Regression Neural Network | No. of vehicles, data rate, transmission power, and bandwidth | OMNeT++ | 2022 |
| Demir [15] | Congestion control | SVM | RTT, RTO | Cooja network | 2020 |
| Sander [10] | Congestion control | RNN+LSTM | Packet arrival time | Mininet | 2019 |
| Xiao [7] | Congestion control | Deep reinforcement learning | Size of congestion window, RTT, inter-arrival time of ACK | NS-3 | 2019 |
| Najm [17] | Congestion control | Decision tree | CWND, throughput, queue size, packet loss | NS-3 | 2019 |
| Doshi [11] | DDoS attacks | KNN, LSVM, DT, RF | Packet header fields, flow information over very short time windows | Consumer IoT device network | 2018 |
| Sivanathan [12] | IoT device classification | Multi-stage machine learning | Activity cycles, port numbers, signaling patterns, and cipher suites | Real network traffic | 2018 |
| Lopez [13] | Protocol-wise traffic classification | CNN+RNN | Source port, destination port, payload size, TCP window size, inter-arrival time, and flow direction | Real data from RedIRIS | 2017 |
3. PROBLEM STATEMENT AND METHODOLOGY
Real-time traffic volume prediction plays a vital role
in proactive network traffic management. Many forecasting
models have been proposed to address this issue.26-27
However,
most of them suffer from the inability to fully utilize the rich information in traffic data to generate efficient and accurate traffic predictions from longer-term data (e.g., seven-day predictions at a five-minute interval). The accuracy of traffic flow prediction depends on the availability of historical data along with a reliable technique that can filter the intrinsic properties from the dataset to precisely forecast future flows. Let the IoT network have N IoT devices. R is the volume of real network traffic at time t at the gateway node, represented by (R, t). R at (t + n) is the volume of real network traffic at the next time point t + n and is represented by (R, t + n). The traffic volume prediction problem at time t is to find a predictor of (R, t) from a sequence of historical or training traffic data $(R_{t-1}, R_{t-2}, R_{t-3}, \ldots, R_{t-T})$. The key problem is to develop or apply a model that captures the inherent relationships between the historical (training) data and the predicted network traffic so that the network system can accurately predict the traffic volume and control congestion.
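This formulation maps directly onto a sliding-window data preparation step. The NumPy sketch below shows one way to turn the historical sequence into supervised (history, next value) pairs; the window length and variable names are our own illustrative choices, not values fixed by the paper.

```python
import numpy as np

def make_windows(traffic, history=12):
    """Turn a 1-D series of traffic volumes into (history -> next value) pairs.

    traffic : array of per-interval traffic volumes R_t observed at the gateway
    history : T, the number of past points used to predict the next one
    """
    X, y = [], []
    for t in range(history, len(traffic)):
        X.append(traffic[t - history:t])   # (R_{t-T}, ..., R_{t-1})
        y.append(traffic[t])               # R_t, the value to predict
    # Add a channel axis so the windows can feed a Conv1D/TCN input layer.
    return np.asarray(X)[..., np.newaxis], np.asarray(y)

# Example: one week of 5-minute flow counts (illustrative random data).
series = np.random.rand(2016)
X, y = make_windows(series, history=12)   # X has shape (samples, 12, 1)
```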
3.1 Methodology
This section describes a proposed methodology for
predicting congestion in the IoT network as shown in Fig. 1.
The big issue in our work was the lack of a publicly available
labeled dataset of congestion scenarios for IoT networks. To
the best of our search, we could not find any such dataset to
solve the problem statement.
Therefore, as the first step, we set up our own IoT home
testbed, as shown in Fig. 2. There are two main reasons to design a home IoT testbed: (i) to capture network traces from real-world IoT devices, and (ii) to develop a realistic smart environment for evaluating the proposed approach. Figure 2 describes the proposed IoT testbed environment. In our IoT network setup, we use a computer (running Ubuntu) with two network interface cards (NIC 1, NIC 2). NIC 1 serves as a gateway and connects to the Internet, while NIC 2 uses an L2 switch (S.No 13) to connect to the rest of the network. We use a 4G TP-Link MR6400 router (S.No 14) running CoAP, DHCP, NAT, and DNS forwarding. The public Internet is accessible through this TP-Link router. A UniFi Access Point (S.No 12) is connected to the L2 switch. The devices connected to the Ubiquiti Access Point are as follows.
• General-purpose hubs (3, 7)
• Several consumer electronics (4, 5, 6),
• Two smart plugs (9) can accommodate extra offline
devices
• Environmental devices (1, 8, 2).
The device-specific hubs (10 and 11) are linked to the L2 switch. These hubs control the devices that are plugged into them. The L2 switch routes all incoming and outgoing traffic and mirrors all testbed traffic on the switch port attached to the Ubuntu PC. On the Ubuntu PC, we set up the following packages: Wireshark (for traffic capture), the block-mount package kmod-usb-core (for mounting external USB-based memory), and the package kmod-usb-storage (for storing the network traffic traces on USB storage). A bash script automates data collection and storage; its execution records pcap files on an external USB storage device connected to the PC.
We collected the network traffic traces after the testbed had been set up. Because the goal is to predict network congestion, we introduced bursts into the network to generate congestion. We collect the network traces in the form of pcap files using Wireshark. This network data contains traffic traces and session logs of different devices at the network layers as per the requirements of the applications. For example, the congestion detection problem in IoT networks often needs datasets of packet-level traces labeled with the corresponding classes. Wireshark's
filters are used to segregate congestion states from the normal
traffic states. We filtered congestion traffic with the help of Bad
TCP, ICMP errors, Retransmission, and Duplicate ACK filters
and stored this traffic in the pcap file named “congestion.” The
rest of the traces of normal traffic are stored in another pcap file
named “normal.”
Figure 1. Methodology.
Figure 2. IoT home testbed for data collection.
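As a rough illustration of this step, an equivalent split could also be scripted with tshark, as sketched below. The display filter is only an approximation of the Bad TCP, retransmission, duplicate ACK, and ICMP error criteria named above, not the exact filters used in our experiments, and the file names are placeholders.

```python
# Illustrative split of a capture into "congestion" and "normal" pcaps using tshark.
import subprocess

CONGESTION_FILTER = (
    "tcp.analysis.retransmission or tcp.analysis.duplicate_ack "
    "or tcp.analysis.flags or icmp.type == 3"
)

def split_capture(pcap_in, congestion_out, normal_out):
    # Packets matching the filter go to the "congestion" capture ...
    subprocess.run(["tshark", "-r", pcap_in, "-Y", CONGESTION_FILTER,
                    "-w", congestion_out], check=True)
    # ... and the remaining packets form the "normal" capture.
    subprocess.run(["tshark", "-r", pcap_in, "-Y", f"not ({CONGESTION_FILTER})",
                    "-w", normal_out], check=True)

split_capture("capture.pcap", "congestion.pcap", "normal.pcap")
```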
We use CICFlowMeter to convert these two pcap files to comma-separated values (CSV) files. CICFlowMeter is a network traffic flow generator and analyzer that produces 83 network traffic features from the pcap files. Hence, there are 83 columns in the CSV file. CICFlowMeter provides bidirectional flows from source to destination and destination to source, with duration, number of packets, number of bytes, length of
packets, etc. These features are calculable separately in both
directions. These columns contain low-level features (Flow
ID, Source IP, Destination IP, Source port, Destination port,
and protocol) and other high-level features. We added class
labels (Congestion, No Congestion) in the last column of both
CSVs.
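A minimal pandas sketch of this labeling and merging step is shown below; the CSV file names and the label column name are illustrative assumptions, not the exact ones produced in our setup.

```python
import pandas as pd

# CICFlowMeter output for the two filtered captures (file names are illustrative).
congestion = pd.read_csv("congestion.pcap_Flow.csv")
normal = pd.read_csv("normal.pcap_Flow.csv")

# Append the class label as the last column of each CSV, then merge.
congestion["Label"] = "Congestion"
normal["Label"] = "No Congestion"

dataset = pd.concat([congestion, normal], ignore_index=True)
dataset.to_csv("iot_congestion_dataset.csv", index=False)
```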
We perform Exploratory Data Analysis (EDA)29 for data pre-processing. EDA helps to find various information about the data, such as missing values, normalization, mean, categorical data, median, distribution of data, and correlations. We carry out the EDA using the pandas profiling tool. After removing all the noise, inconsistency, and missing values, we obtain a complete and accurate dataset. Now, we start looking for relations among
various features of the data. The feature selection process can
be done manually or using some mathematical or algorithmic
techniques. In the case of manual feature selection, one must
have a thorough understanding of the features, the relationship
between them, and how one feature group would perform better
than the other feature groups. On the contrary, algorithmic
feature selection lowers the role of manual efforts. In addition,
compared to manual feature selection, it avoids errors in feature
selection and takes less time to identify the most trustworthy
features. Therefore, algorithmic feature selection is preferred
over manual selection.
In this study, we employ a correlation matrix and Principal Component Analysis (PCA), a popular feature selection algorithm, to analyse the relationships between the various features in the dataset.30 The correlation matrix is a symmetric matrix that defines the relationship between the various features; in mathematical terms, it is a matrix of correlation coefficients between a set of features. Each coefficient lies between -1 and 1, where 1 shows a 100 % positive linear correlation, -1 depicts a 100 % negative linear correlation, and 0 means the two features are not correlated. The diagonal of the correlation matrix is filled with 1s because the correlation between identical features is 1. The Pearson correlation coefficient31 is used to evaluate the correlation between the features. It measures the linear correlation between two features using Eqn. (2). We use pandas profiling to find the highly correlated features.
$$\rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} \qquad (2)$$
In our work, the correlation is considered high if the Pearson correlation coefficient is larger than 0.9 or less than -0.9. We started with the dataset consisting of 83 features. We dropped 30 highly correlated features, along with the features with constant values such as 0. After this, there were still 53 features and one field for the label. Not all 53 of these features were useful for our prediction; thus, we eliminated all irrelevant information and retained only those features necessary for the system to learn. The following columns, as indicated in Table 2, are removed during the learning process based on the results of pandas profiling.
Flow ID, Destination IP, Source IP, and Timestamp are
categorical and sparse features that increase the space and time
complexity of the models. The historical time series data, which
contains a timestamp, is used to predict the future. As a result,
we retain only the timestamp and drop the Flow ID, Destination
IP, and Source IP. We use one-hot encoding32
to transform this
categorical feature into a numerical attribute. Now, we have
50 features in our dataset. For feature selection, we used PCA. On applying PCA to our dataset, we obtained useful features such as Flow Duration, Fwd Pkt Len Mean, Flow Pkts/s, Bwd IAT minimum, Fwd Pkts/s, and Bwd Pkts/s. Initially, we decided to go with these features along with a timestamp for prediction and classification. However, since deep learning approaches are capable of recognizing features automatically from data, we are not overly concerned with selecting more features.
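The following sketch illustrates the correlation filter and PCA step on the merged CSV, assuming the dataset prepared above. The 0.9 threshold matches the text; the column handling and the 95 % variance cut-off for PCA are illustrative choices of ours.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("iot_congestion_dataset.csv")
features = df.drop(columns=["Label"]).select_dtypes(include=[np.number])

# Drop constant columns and one feature of every pair with |Pearson r| > 0.9.
features = features.loc[:, features.nunique() > 1]
corr = features.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [c for c in upper.columns if (upper[c] > 0.9).any()]
reduced = features.drop(columns=to_drop)

# PCA on the standardized remainder to gauge which features carry the variance.
pca = PCA(n_components=0.95)  # keep components explaining 95% of the variance
components = pca.fit_transform(StandardScaler().fit_transform(reduced))
print(reduced.shape, components.shape)
```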
We also study IoT traffic flow characteristics, including
flow size, flow duration, flow rate, and inter-arrival time (∆T)
in the forward and backward directions. These characteristics also help in the advance prediction of congestion at the nth timestamp. We also use inter-arrival time (∆T) and flow rate
as the target variable for advance prediction. We provide
definitions for some terms related to network traffic flow.
• Flow size is the number of bytes sent with the payload,
including retransmissions.
• We can measure flow duration as the difference between
the timestamp of the first packet and the last packet in a
flow.
• Flow rate is the ratio of flow size to flow duration and is
represented with flow packets/second
Flow size is a valuable statistic that is used to optimize routing algorithms, load balancing, and scheduling in IoT networks. Predicting flow size is especially difficult because flow patterns change constantly, and calculations must be made in real time (milliseconds) to prevent delays. This is because the size of flows varies greatly (from small mice flows of a few kilobytes to large flows of several gigabytes), as well as the length of time they inhabit a route (milliseconds to hours). Since large flows hold the route for a longer time (flow duration) than small flows, there is a chance that the number of active flows will become unbalanced due to a concentration of large flows in the routing path.

Table 2. Features to be dropped

| Active Max | Subflow Bwd Packets | Bwd IAT Max |
| Active Min | Idle Min | Fwd Bulk Rate Avg |
| Bwd PSH Flags | Packet Length Min | Bwd URG Flags |
| Idle Max | Fwd IAT Total | Bwd Bytes/Bulk Avg |
| Bwd Segment Size Avg | Fwd Packet/Bulk Avg | Fwd URG Flag |
| URG Flag Count | Active Mean | ECE Flag Count |
| Fwd Bytes/Bulk Avg | ACK Flag Count | Fwd Packets/s |
| Bwd Packet Length Std | Subflow Bwd Bytes | Fwd IAT Std |
| Bwd IAT Std | Fwd IAT Mean | Active Std |
| Packet Length Mean | Fwd IAT Max | Fwd Segment Size Avg |
The quantity of available data determines the selection of
a deep learning model. The data patterns may or may not be
consistent due to the irregular traffic flow. The chances of the
overfitting of the model may be high when we collect the data
from these networks. The selection of the model in these types
of cases is crucial. For example, accurate congestion prediction
can suggest how much the sending rate of a node in the IoT
network should be controlled. The deep learning algorithms
may reduce the chances of overfitting.
Motivated by the excellent performance of deep learning
techniques in images and natural language processing, we select
TCN, LSTM, Gated recurrent unit (GRU), Stacked autoencoder
(SAE), and CNN-LSTM models for the proposed IoT network traffic classification and prediction approach.33
These techniques have the capability of automatically
learning the features. The model validation verifies the overall
accuracy of the model in terms of overfitting and underfitting.
This verification of accuracy helps to optimize the model by
increasing the volume of data and reducing the complexity of
the model when data overfits. Analysis of wrongly classified
samples helps to find the reasons for errors and check the
suitability of the model and features. The processed dataset is
divided into three parts, i.e., training (60 %), validation (20
%), and testing (20 %). We train different models with training
data using various combinations and different values of hyper-
parameters. Further, re-training is done on the different training
and validation dataset combinations using the same set of
hyper-parameters, and the accuracy is calculated for testing the
dataset. We observed overfitting initially, as the training score was high but the validation score was low, which shows that the model had adapted too closely to the specific training dataset. However, the trial-and-error process of training a model is time-consuming. Therefore, we use the Taguchi method to optimize the hyper-parameters and train the model. This method reduces the number of trials needed to optimize the hyper-parameters.
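A minimal sketch of the 60/20/20 split is given below. Splitting chronologically (rather than randomly) is our own assumption for time-series data; the file name is illustrative.

```python
import pandas as pd

df = pd.read_csv("iot_congestion_dataset.csv")

# 60% training, 20% validation, 20% testing, kept in time order
# (a chronological split is our assumption for time-series data).
n = len(df)
train = df.iloc[: int(0.6 * n)]
val = df.iloc[int(0.6 * n): int(0.8 * n)]
test = df.iloc[int(0.8 * n):]
print(len(train), len(val), len(test))
```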
4. MODEL CONSTRUCTION AND PARAMETER
TUNING
In this section, we present the basic architecture for convolutional sequence forecasting and the hyper-parameter tuning, which is an integral part of the neural network. The series of IoT network traffic flow of length T with the input sequence $\{p_1, p_2, \ldots, p_T\}$ is utilized for predicting the output sequence of traffic flow $\{q_1, q_2, \ldots, q_T\}$. Here, the observed data $\{p_1, p_2, \ldots, p_t\}$ is the input to the model for forecasting the output $\hat{q}_t$ at time t. Equation (3) defines the function $f: P_t \rightarrow Q_t$ for prediction of the traffic flow.

$$\{\hat{q}_1, \hat{q}_2, \ldots, \hat{q}_T\} = f(\{p_1, p_2, \ldots, p_T\}) \qquad (3)$$

Here, the function f must satisfy the condition that $\hat{q}_t$ depends only on the historical data $\{p_1, p_2, \ldots, p_{t-1}\}$ rather than on $\{p_{t+1}, p_{t+2}, \ldots, p_T\}$. Equation (4) defines the function to forecast the network traffic flow $\hat{q}_t$ at time t as follows.

$$\hat{q}_t = f(\{p_1, p_2, \ldots, p_t\}) \qquad (4)$$
4.1 Temporal Convolutional Network
TCN34 is a variation of CNN for sequence modeling tasks. TCN is designed by combining features of the RNN and CNN. TCN uses causal convolutions and dilated convolutions. The causal convolution is used for temporal data and ensures that the model does not violate the order in which the data is modeled: the prediction relies only on historical data, not on future observations. The TCN model inherits the property of a 1-D fully convolutional network of generating output of the same length as the input, and subsequent layers maintain the same length by adding zero padding to avoid information leakage. Accurate prediction of the traffic flow requires a large history and a very deep network, which makes the network structure complicated and computationally intensive. The integration of dilated convolutions and residual layers into TCN solves this issue. Dilated convolutions, in particular, allow for an exponentially large receptive field that covers a large history. Equation (5)34 shows the dilated convolution operation F on element s of the sequence for a given 1-D sequence input $p \in \mathbb{R}^n$ and a filter $f: \{0, \ldots, k-1\} \rightarrow \mathbb{R}$.

$$F(s) = (p \ast_d f)(s) = \sum_{i=0}^{k-1} f(i)\, p_{s - d \cdot i} \qquad (5)$$

Here, d represents the dilation factor, k denotes the size of the filter f, and the index $s - d \cdot i$ reaches back into the history. With a dilation factor d = 1, the dilated convolution in Eqn. (5) reduces to the standard convolution. Increasing d exponentially enlarges the receptive field with the depth of the network, so that the output at the top level represents a large range of inputs. The filter size k can also be used to increase the receptive field of TCN; however, this approach increases the parameter count and execution time, resulting in a slow model that is prone to overfitting.
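The following Keras sketch illustrates a stack of dilated causal convolutions in the spirit of Eqn. (5); the filter count, kernel size, and dilation list are illustrative, not the tuned values obtained in Section 4.2.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Stack of dilated causal 1-D convolutions: with kernel size k and dilations
# 1, 2, 4, 8 the receptive field grows to 1 + (k - 1) * (1 + 2 + 4 + 8).
def dilated_causal_stack(history_len, n_filters=24, kernel_size=3):
    inputs = layers.Input(shape=(history_len, 1))
    x = inputs
    for d in (1, 2, 4, 8):
        x = layers.Conv1D(filters=n_filters, kernel_size=kernel_size,
                          dilation_rate=d, padding="causal",
                          activation="relu")(x)
    return tf.keras.Model(inputs, x)

model = dilated_causal_stack(history_len=48)
model.summary()
```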
The residual connection is the next important part of TCN: it adds a branch with a series of transformations F, whose output is added to the input p of the block. Figure 3 shows the diagram of the residual block. This block contains two weight layers along with the Rectified Linear Unit (ReLU) activation function. Moreover, regularization is achieved by connecting a spatial dropout layer after the last weight layer. Equation (6) defines the residual block mathematically.

$$q = F(p, W_i) + p \qquad (6)$$

Here, q is the output of the layer. The function $F(p, W_i)$ is the residual mapping to be learned, where $W_i$ denotes the weights of the i-th layer. The function for two layers of TCN is represented as $F = W_2\,\sigma(W_1 p) + e$, in which $\sigma$ and e represent the ReLU activation and
bias, respectively. To predict and detect network congestion in a home IoT network, we use the TCN architecture. The TCN architecture is made up of a collection of blocks; each block contains a set of N convolutional layers. Layers are made up of dilated convolutions with a dilation factor d and an associated nonlinear activation function f(·). In each dilated convolution, the convolution outcome is integrated with the layer's input using a residual connection.
Figure 4 shows the TCN model architecture, which uses 19 convolutional layers and a fully connected layer. Convolutional layers are shown in a rectangular area with a grey background. In Fig. 4, the grey rectangles in the shaded rectangle show that the result of convolutional layer 1 is applied as the input of convolutional layer 3 directly through the residual structure. A kernel size of 10 and a filter count of 24 are used for all the convolutional layers. The TCN model is trained with the processed training data. The final TCN with the enhanced configuration is transmitted to the central server, depending on the feedback of the historical traffic flow data sequence, and is used for forecasting the packet flows/sec in the network for the next 30 minutes, given the past traffic flow in the dataset.
Figure 3. TCN residual block.
Equation (7) describes the activation function for the i-th layer and j-th block.

$$S^{(i,j)} \in \mathbb{R}^{F_w \times T} \qquad (7)$$

Here, we note that every layer has the same number of filters $F_w$. We define the output of the dilated convolution at time t as $\hat{S}^{j,l}_t$ and the convolution result after the residual connection as $S^{j,l}_t$, as shown in Eqn. (8).

$$\hat{S}^{j,l}_t = f\left(W^1 S^{j,l-1}_{t-d} + W^2 S^{j,l-1}_t + b\right), \qquad S^{j,l}_t = S^{j,l-1}_t + V \hat{S}^{j,l}_t + e \qquad (8)$$

Here, $W^1$ and $W^2$ denote the weight parameters, $V \in \mathbb{R}^{F_w \times F_w}$ is the set of weights, and $e \in \mathbb{R}^{F_w}$ represents the biases of the residual connection. The output of each block is summed by a group of skip connections into $Z^0_t \in \mathbb{R}^{T \times F_w}$ satisfying Eqn. (9).

$$Z^0_t = \mathrm{ReLU}\left(\sum_{j=1}^{B} S^{j,L}_t\right) \qquad (9)$$

The latent result $Z^1_t$ is $\mathrm{ReLU}(V_r Z^0_t + e_r)$ for the weight matrix $V_r$ and the bias $e_r$. We can represent the forecasting through Eqn. (10).

$$\hat{q}_t = \mathrm{softmax}\left(U Z^1_t + c\right) \qquad (10)$$

Here, U is the weight matrix, $U \in \mathbb{R}^{C \times F_w}$, and $c \in \mathbb{R}^{C}$ is the bias.
Equation (11) shows the objective function of the model, which we aim to minimize during training. At time t, $\hat{q}_t$ gives the forecasting result, and the actual traffic flow at time t is represented by F(t).

$$L = \frac{1}{t}\sum_{i=0}^{t}\left(\hat{q}_i - F(i)\right)^2 \qquad (11)$$

The parameters are configured as follows when training the TCN model: batch size of 128, 30 epochs, dropout of 0.5, and an initial learning rate of 0.002. We use the stochastic gradient descent technique and decrease the learning rate during training.
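A condensed Keras sketch of one TCN residual block (Fig. 3) and the training configuration described above (batch size 128, 30 epochs, dropout 0.5, initial learning rate 0.002) is given below. It is a simplified stand-in, not the exact 19-layer model of Fig. 4; the head that reduces the sequence to a single forecast is our own assumption.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, n_filters, kernel_size, dilation):
    # Two dilated causal convolutions with ReLU and spatial dropout,
    # plus a 1x1 convolution on the shortcut when channel counts differ.
    shortcut = x
    for _ in range(2):
        x = layers.Conv1D(n_filters, kernel_size, dilation_rate=dilation,
                          padding="causal", activation="relu")(x)
        x = layers.SpatialDropout1D(0.5)(x)
    if shortcut.shape[-1] != n_filters:
        shortcut = layers.Conv1D(n_filters, 1, padding="same")(shortcut)
    return layers.add([x, shortcut])

def build_tcn(history_len, n_filters=24, kernel_size=10, dilations=(1, 2, 4, 8, 16)):
    inputs = layers.Input(shape=(history_len, 1))
    x = inputs
    for d in dilations:
        x = residual_block(x, n_filters, kernel_size, d)
    x = layers.Lambda(lambda t: t[:, -1, :])(x)   # take the last time step
    outputs = layers.Dense(1)(x)                  # predicted flow for the next interval
    return tf.keras.Model(inputs, outputs)

model = build_tcn(history_len=48)
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.002), loss="mse")
# X, y come from the sliding-window preparation sketched in Section 3.
# model.fit(X, y, batch_size=128, epochs=30, validation_split=0.2)
```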
Figure 4. TCN architecture.
4.2 Hyper-parameter Tuning
Building a deep learning model for network traffic flow
prediction requires the best combination of neural network
parameters such as kernel size, number of filters, dilations
list, count of residual block stacks, etc. Traditionally, a trial-and-error approach is used to find the best combination of these hyper-parameters. However, it is time-consuming and inefficient in yielding high precision because it requires a huge number of trials. This prompted us to look for more coherent alternatives to the traditional trial-and-error approach. Zhao, et al.34
used the Taguchi approach to predict short-term
traffic movement accurately or identify congestion sites for
intelligent transportation systems (ITS) in smart cities. We
believe that the problem of road traffic congestion is similar
to the problem of network congestion, so we also incorporate
the Taguchi method to determine the optimized values of the
hyper-parameters of the deep learning model for IoT network
congestion prediction. We executed a small series of experiments to see if we could improve the structure of the deep learning models trained on the training dataset. The trial run results show the disparity between the expected and real performance. The findings of the trial runs help determine the most suitable hyper-parameter values for efficient performance metrics. The topology of the deep learning-based TCN model is then determined. The Taguchi approach is divided into three parts: (i) identification of design factors, (ii) trial design and performance metrics, and (iii) optimization of the model structure and performance analysis.
The first step of the Taguchi method is identifying the
design factors and their corresponding levels. These design
factors greatly influence the topology determination of the
deep learning models. The number of filters used in deep
learning models is considered the first design factor. The state-
of-the-art shows that the most frequent number of filters used
is 6, 12, and 24. We follow the same practice and use the 6, 12,
and 24 filters at Levels 1, 2, and 3, respectively. The size of the
kernel is the second design factor. We use the 10, 15, and 20
kernel sizes at each layer of the deep learning model for Levels
1, 2, and 3, respectively. The list of dilations is the third design factor. The dilation list gives the size of the deep learning model. We set list A as {2^0, 2^1, 2^2, 2^3, 2^4}, list B as {2^0, 2^1, 2^2, ..., 2^9}, and list C as {1, 2, 3, ..., 9} for Levels 1, 2, and 3, respectively. The number of stacks of residual blocks is the fourth design factor. The state-of-the-art suggests the use of 2 to 9 stacks of residual blocks. We set 2, 6, and 9 stacks of residual blocks for Levels 1, 2, and 3, respectively.
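Schematically, running the Taguchi trials amounts to iterating over the rows of the orthogonal array, building a model with the levels selected by each row, and recording MAE and MRE. The sketch below illustrates this loop; the rows shown are only an excerpt, not the full A16(3^4) array, and the build_and_eval callback is a placeholder for model training and validation.

```python
# Design-factor levels from Section 4.2 (filters, kernel size, dilation list, stacks).
LEVELS = {
    "filters":   [6, 12, 24],
    "kernel":    [10, 15, 20],
    "dilations": [[1, 2, 4, 8, 16],                          # list A
                  [1, 2, 4, 8, 16, 32, 64, 128, 256, 512],   # list B
                  list(range(1, 10))],                        # list C
    "stacks":    [2, 6, 9],
}

# A few illustrative rows of the orthogonal array (1-indexed levels per factor);
# the full 16-row array used in the paper is not reproduced here.
ORTHOGONAL_ROWS = [
    (1, 1, 1, 1),
    (1, 2, 2, 2),
    (3, 2, 1, 1),   # the best-ranked trial reported in Table 3
]

def run_trials(build_and_eval):
    """build_and_eval(filters, kernel, dilations, stacks) -> (mae, mre)."""
    results = []
    for row in ORTHOGONAL_ROWS:
        f, k, d, s = (LEVELS[name][lvl - 1] for name, lvl in zip(LEVELS, row))
        results.append((row, *build_and_eval(f, k, d, s)))
    return sorted(results, key=lambda r: r[1])   # rank trials by MAE
```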
We assess the performance of the deep learning models for congestion prediction using the Mean Absolute Error (MAE) in Eqn. (12) and the Mean Relative Error (MRE) in Eqn. (13). These performance metrics show, respectively, the absolute and relative difference between the actual network flow and the one predicted by the model.

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{f}_i - f_i\right| \qquad (12)$$

$$\mathrm{MRE} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|\hat{f}_i - f_i\right|}{f_i} \qquad (13)$$

Here, $\hat{f}_i$ and $f_i$ are the actual and predicted traffic flow, respectively, and n is the number of forecast points. Lower MAE and MRE indicate a smaller difference between the actual and predicted flow, which shows better prediction accuracy.
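Equations (12) and (13) translate directly into NumPy, as in the sketch below; the argument names mirror the symbols of the equations, and the example values are illustrative.

```python
import numpy as np

def mae(f_hat, f):
    """Mean Absolute Error, Eqn. (12)."""
    f_hat, f = np.asarray(f_hat, dtype=float), np.asarray(f, dtype=float)
    return np.mean(np.abs(f_hat - f))

def mre(f_hat, f):
    """Mean Relative Error, Eqn. (13); the denominator f follows the equation."""
    f_hat, f = np.asarray(f_hat, dtype=float), np.asarray(f, dtype=float)
    return np.mean(np.abs(f_hat - f) / f)

# Illustrative flow values (packets/sec).
print(mae([100, 120, 90], [95, 130, 88]), mre([100, 120, 90], [95, 130, 88]))
```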
As we have three levels and four design factors, a full factorial design requires a total of 81 trials. As the prediction problem requires a large amount of historical data, a full factorial design would be time-consuming. Therefore, we use a fractional factorial design. With four design factors at three levels each, we build an orthogonal array A16(3^4) for the trial design. We optimized the structure of the deep learning model by executing the combinations of design factors and their levels in 16 trials. Every row of the orthogonal array A16(3^4), corresponding to one main trial, develops a deep learning model using the training dataset of the home IoT testbed. Table 3 shows the results of the 16 main trials over seven days (March 1-7, 2021), along with the average result on the testing dataset. The accuracy of the deep learning models is evaluated on the data collected from the Home IoT testbed over seven days with 16 main trials.

Table 3. Orthogonal array A16(3^4) and experimental outcomes

| Main trial | Level i | Level ii | Level iii | Level iv | MAE (1st / 3rd / 5th day) | MRE (1st / 3rd / 5th day) | Avg MAE | Rank | Avg MRE | Rank |
| 1 | 1 | 1 | 1 | 1 | 65.783 / 17.12 / 44.082 | 0.899 / 0.335 / 0.125 | 59.532 | 16 | 0.447 | 16 |
| 2 | 1 | 2 | 2 | 2 | 20.991 / 0.128 / 3.682 | 0.255 / 0.004 / 0.048 | 21.732 | 7 | 0.1 | 5 |
| 3 | 1 | 3 | 3 | 3 | 23.323 / 1.387 / 1.485 | 0.286 / 0.029 / 0.02 | 24.258 | 10 | 0.11 | 8 |
| 4 | 1 | 1 | 2 | 3 | 293501 / 5.987 / 8.229 | 0.348 / 0.144 / 0.108 | 29.729 | 15 | 0.165 | 14 |
| 5 | 1 | 1 | 3 | 1 | 33.211 / 7.532 / 2.621 | 0.396 / 0.168 / 0.033 | 28.502 | 14 | 0.164 | 13 |
| 6 | 2 | 1 | 2 | 3 | 23.334 / 1.876 / 1.997 | 0.275 / 0.044 / 0.025 | 20.936 | 4 | 0.104 | 7 |
| 7 | 2 | 2 | 1 | 3 | 16.305 / 1.249 / 4.948 | 0.201 / 0.027 / 0.066 | 20.326 | 3 | 0.093 | 3 |
| 8 | 2 | 3 | 3 | 1 | 34.147 / 1.771 / 5.597 | 0.397 / 0.038 / 0.071 | 28.396 | 11 | 0.146 | 11 |
| 9 | 2 | 1 | 3 | 3 | 29.527 / 4.212 / 11.494 | 0.342 / 0.095 / 0.149 | 28.5 | 12 | 0.162 | 12 |
| 10 | 2 | 3 | 1 | 3 | 19.635 / 1.916 / 3.414 | 0.229 / 0.044 / 0.043 | 21.556 | 5 | 0.1 | 5 |
| 11 | 3 | 3 | 1 | 2 | 13.916 / 2.838 / 4.823 | 0.165 / 0.064 / 0.062 | 20.029 | 2 | 0.092 | 2 |
| 12 | 3 | 3 | 2 | 1 | 39.97 / 9.729 / 9.857 | 0.466 / 0.218 / 0.128 | 28.755 | 13 | 0.197 | 15 |
| 13 | 3 | 1 | 1 | 1 | 1.488 / 5.184 / 9.103 | 0.018 / 0.116 / 0.117 | 21.785 | 6 | 0.094 | 4 |
| 14 | 3 | 1 | 2 | 1 | 22.765 / 0.436 / 5.314 | 0.255 / 0.015 / 0.067 | 24.963 | 9 | 0.115 | 9 |
| 15 | 3 | 2 | 2 | 2 | 33.132 / 4.307 / 2.675 | 0.387 / 0.094 / 0.033 | 24.834 | 8 | 0.143 | 10 |
| 16 | 3 | 2 | 1 | 1 | 20.191 / 0.306 / 1.817 | 0.234 / 0.008 / 0.025 | 16.156 | 1 | 0.081 | 1 |

The results in Table 3 show
that the smallest MAE and MRE were obtained in the 16th main trial, with 24 filters, 2 stacks, a kernel size of 15, and dilation list A. The top three trials in Table 3 are the 7th, 11th, and 16th main trials, all with the list A setting of dilation. This indicates that dilation with list A gives better accuracy than the other two lists. The main trials 1, 4, and 5, with 6 filters and a small kernel size, perform poorly concerning MAE and MRE. This shows that the performance of the deep learning models varies with the change in the combinations of the parameters.
Every trial under the Taguchi method has orthogonal combinations. Hence, we can separately analyze the effect of individual design factors. The effect of every design factor at a specific level is determined by averaging the respective values in Table 3. In particular, design factor i at Level 2 appears in the 6th, 7th, 8th, 9th, and 10th main trials, and its average effects are 23.9428 (MAE) and 0.121 (MRE).

Table 4 shows the effects of all four design factors at all their corresponding three levels. The sensitivity of the design factors is also determined using range analysis, which is defined as the difference between the highest and lowest performance values for each design factor. Table 4 displays the sensitivity data. We can find the order of sensitivity by sorting the sensitivity values of the four design factors from high to low, which gives i > ii > iv > iii in terms of MAE. The same concept applies to the order of sensitivity in terms of MRE, which is also i > ii > iv > iii. These orders show that design factor i has the greatest influence on MAE and MRE; hence, it is considered the primary factor of the TCN forecasting model for better performance. Tables 3 and 4 help determine the optimized parameters of the TCN model for IoT network traffic flow forecasting in the Home IoT testbed. These optimal parameters are the values of design factors i, ii, iii, and iv at Level 2. The TCN model
with 12 filters, a kernel size of 15, dilation list B, and 6 residual blocks gives the highest forecasting accuracy among all the Taguchi experimental trials.
Table 4. Sensitivity values for design factors

| | MAE (i) | MAE (ii) | MAE (iii) | MAE (iv) | MRE (i) | MRE (ii) | MRE (iii) | MRE (iv) |
| Level 1 | 32.6 | 30.5638 | 26.564 | 29.727 | 0.1972 | 0.1787 | 0.1511 | 0.1777 |
| Level 2 | 22.3428 | 20.762 | 25.1581 | 22.1983 | 0.121 | 0.1042 | 0.1373 | 0.1116 |
| Level 3 | 22.7536 | 24.5988 | 27.4135 | 24.2175 | 0.120 | 0.129 | 0.1455 | 0.1223 |
| Sensitivity | 10.2572 | 9.8018 | 2.2554 | 7.5287 | 0.0772 | 0.0745 | 0.0138 | 0.0661 |
5. RESULTS
On the basis of the prepared traffic dataset, we performed several experiments. This work compares TCN with four other deep learning methods commonly used as traffic flow predictors, namely LSTM, GRU, SAE, and CNN-LSTM, concerning two performance metrics, MAE and MRE, to show the TCN model's prediction accuracy. All experiments are implemented in TensorFlow, and the performance metrics are calculated using the scikit-learn package. We use a desktop computer with an Intel i7-9700 processor and 16 GB of RAM in all experiments. Here, IoT traffic includes two types of traffic: first, traffic that flows through the devices autonomously, independent of user activity; second, traffic that flows due to user communication with the devices, e.g., an Airveda Smart Air Quality Monitor responding to an Android user, Amazon Alexa Echo, a Neato vacuum cleaner, and so on.
Flow rate and inter-arrival time are the attributes used for the prediction of congestion in our approach. We further analyze the estimation of congestion in live network traffic to test the success of the deep learning methods. The TCN-based approach effectively anticipates congestion conditions in contrasting situations against ground-truth information. As the nature of our IoT network traffic is unbalanced at some points, the error increases slightly during peak traffic hours. However, adding the dropout layer to the model for regularization decreases the gap between actual and predicted values during peak traffic hours. Here, we take a closer look at the prediction when network traffic flow is high. Figure 5 shows that at 1:30 pm and 3:45 pm there is low and high network traffic, respectively, and the error increases during these fluctuations; however, even in these situations, the TCN model follows the traffic trend.
The inter-arrival packet time is an important feature: packets are received at regular intervals in the case of normal traffic, whereas in the case of congestion, packets arrive with an almost zero inter-arrival time. The first derivative, $\frac{d\,\Delta T}{dt}$, and the second derivative, $\frac{d^2 \Delta T}{dt^2}$, of the inter-arrival time $\Delta T$ capture the difference between normal traffic and congestion. We have also predicted the inter-arrival time of the packets at the gateway. Figure 6 shows the prediction of the average inter-arrival time of the packets on 15th March 2021.

Figure 6. Observation of inter-arrival time using TCN.
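As an illustration, the derivatives of the inter-arrival time series can be approximated numerically from packet timestamps, as sketched below; the timestamps in the example are made up.

```python
import numpy as np

def interarrival_derivatives(timestamps):
    """Approximate d(dT)/dt and d2(dT)/dt2 from packet arrival timestamps (seconds)."""
    timestamps = np.asarray(timestamps, dtype=float)
    delta_t = np.diff(timestamps)              # dT between consecutive packets
    mid_times = timestamps[1:]                 # time axis for the dT samples
    first = np.gradient(delta_t, mid_times)    # first derivative of dT
    second = np.gradient(first, mid_times)     # second derivative of dT
    return delta_t, first, second

# During congestion dT collapses towards zero, so both derivatives swing sharply.
dt, d1, d2 = interarrival_derivatives([0.00, 0.10, 0.21, 0.22, 0.225, 0.23])
```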
The MAE, MRE, and forecasting accuracy in (%) of the Packet
Flow/Sec with the different models (TCN, GRU, LSTM, SAE,
and CNN-LSTM) are shown in Table 5. The results show that TCN has a better forecasting accuracy than other deep learning-based models for predicting IoT network traffic. It is worth mentioning that the TCN model, when combined with Taguchi hyper-parameter optimization, achieves a forecasting precision of approximately 95 %, which is 15 % higher than the LSTM deep learning model and 10 % to 12 % higher than CNN-LSTM, SAE, and GRU.

Figure 5. Observation of traffic flow using TCN.
Furthermore, we claim that the MAE and MRE values
for the TCN model are lower than other deep learning models,
implying that the TCN model’s forecasting accuracy is superior
to that of other models. The MAE value of the TCN model is
7.4237, which is much less in comparison to LSTM’s MAE
value of 28.6085 and CNN-LSTM’s MAE value of 26.5688. In
the case of MRE, the TCN model outperforms and has a lower
value of 0.0457, which is one-fourth of the LSTM and GRU
models, as shown in Table 5.
Table 5. Performance evaluation of TCN with other deep learning models

| Algorithm | MAE | MRE | Forecasting accuracy |
| TCN | 7.4237 | 0.0457 | 95.52% |
| LSTM | 28.6085 | 0.1901 | 80.46% |
| GRU | 35.0772 | 0.1821 | 82.98% |
| SAE | 34.4004 | 0.1680 | 83.29% |
| CNN-LSTM | 26.5688 | 0.1401 | 84.98% |
We show the output of the various deep learning models, including LSTM, SAE, GRU, and CNN-LSTM, along with the TCN model, in Fig. 7. These models show the traffic forecasting for the nth day. The blue and red lines show the actual and predicted traffic flow, respectively.
Figure 7 shows that forecasting the average flow in packets/sec with the TCN model yields better results than the other deep learning models. We also measured the time taken to train the model and the prediction time required by the model. Integrating the Taguchi method with TCN took approximately an hour to train the model while optimizing the configuration of the hyper-parameters. The TCN model takes less than one second to forecast the IoT network traffic flow, given all past input sequential data. To conclude, the TCN model achieves the highest accuracy and a far smaller forecasting difference between ground truth and predicted values than the LSTM and GRU models for IoT network traffic flow prediction. Therefore, the TCN-based approach is effective and promising for predicting real IoT network traffic.

Figure 7. Forecasting of IoT network traffic with different deep learning models.

6. CONCLUSION
This article contributes to improving the existing alternatives and capabilities of congestion detection and prediction techniques in present network monitoring systems, mainly focused on IoT networks, where detection and prediction of congestion are highly desirable. In this paper, in contrast to the state-of-the-art, we use the deep learning-based TCN model. To the best of our knowledge, we are pioneers in applying TCN to congestion detection and prediction. Furthermore, we incorporate the Taguchi method to enhance the structure of TCN, which improves traffic flow forecasting. The other possibilities offered by deep learning models to detect and predict congestion in IoT networks are also thoroughly examined in this paper. With real network traffic, we evaluated TCN against LSTM, GRU, SAE, and CNN-LSTM and observed that TCN achieves better forecasting results. This demonstrates that TCN can efficiently predict and detect congestion in IoT networks. In the future, efforts will be made to improve the accuracy of the TCN model by focusing on inevitable conditions such as random bursts.
References
1. Thompson, W.L. & Talley, M.F. Deep learning for IoT communications. 53rd
Annual Conference on Information
Sciences and Systems (CISS), 2019, pp. 1-4.
doi: 10.1109/CISS.2019.8693025
2. Al-Kashoash, Hayder & Ahmed, Abdulmohsin. Congestion control for 6LoWPAN wireless sensor networks: Toward the Internet of Things. University of Leeds, 2017. PhD Thesis.
doi: 10.1007/978-3-030-17732-4
3. Wang, M.; Cui, Y.; Wang, X.; Xiao, S. & Jiang J. Machine
learning for networking: Workflow, advances and
opportunities. IEEE Network, 2017, 32(2), 92-99.
doi: 10.1109/MNET.2017.1700200
4. Seo, J.; Lee, S.; Khan, M.T.R. & Kim, D. A new CoAP congestion control scheme considering strong and weak RTT for IoUT. In Proceedings of the 35th
Annual ACM
symposium on applied computing, 2020, pp. 2158–2162.
doi: 10.1145/3341105.3373981
5. Ibrahim; Amin S.; Khaled, Y. Youssef; Ahmed, H.
Eldeeb; Mohamed, Abouelatta & Hesham, Kamel.
Adaptive aggregation based IoT traffic patterns for
optimizing smart city network performance. Alexandria
Eng. J., 2022, 61(12), pp. 9553-9568.
doi: 10.1016/j.aej.2022.03.037
6. Ancillotti, E. & Bruno, R. Comparison of CoAP and CoCoA+ congestion control mechanisms for different IoT application scenarios. In 2017 IEEE Symposium on Computers and
Communications (ISCC), 2017, pp. 1186–1192.
doi: 10.1109/ISCC.2017.8024686
7. Xiao, K.; Mao, S. & Tugnait, J.K. TCP-Drinc: Smart
congestion control based on deep reinforcement learning.
IEEE Access, 2019, 7, pp. 11892-11904.
doi:10.1109/ACCESS.2019.2892046
8. Jiang, H.; Li, Q.; Jiang, Y.; Shen, G.; Sinnott, R.; Tian,
C. & Xu, M. When machine learning meets congestion
control: A survey and comparison. Comput. Networks,
2021, 192, 108033.
doi: 10.1016/j.comnet.2021.108033
9. Zhou, K.; Wang, W.; Hu, T. & Deng, K. Time series
forecasting and classification models based on recurrent
with attention mechanism and generative adversarial
networks. Sens., 2020, 20(24), 7211.
doi: 10.3390/s20247211
10. Sander, Constantin; Jan, Rüth; Oliver, Hohlfeld & Klaus,
Wehrle. DeePCCI: Deep learning-based passive congestion
control identification. In Proceedings of the 2019 Workshop on Network Meets AI & ML, 2019, pp. 37-43.
doi: 10.1145/3341216.3342211
11. Doshi, R.; Apthorpe, N. & Feamster, N. Machine learning DDoS detection for consumer Internet of Things devices. In 2018 IEEE Security and Privacy Workshops (SPW), 2018, pp. 29-35.
doi: 10.1109/SPW.2018.00013
12. Sivanathan, A.; Gharakheili, H.; Loi, F.; Radford, A.; Wijenayake, C.; Vishwanath, A. & Sivaraman, V. Classifying IoT devices in smart environments using network traffic characteristics. IEEE Transactions on Mobile Computing, 2018, 18(8), 1745-1759.
doi: 10.1109/TMC.2018.2866249
13. Lopez-Martin, M.; Carro, B.; Sanchez-Esguevillas, A. & Lloret, J. Network traffic classifier with convolutional and recurrent neural networks for Internet of Things. IEEE Access, 2017, 5, pp. 18042-18050.
doi: 10.1109/ACCESS.2017.2747560
14. Falahatraftar, Farnoush; Pierre, Samuel & Chamberland, Steven. An intelligent congestion avoidance mechanism based on generalized regression neural network for heterogeneous vehicular networks. IEEE Transactions on Intelligent Vehicles, 2022, 1-13.
15. Demir, A.K. & Abut, F. mlCoCoA: A machine learning-based congestion control for CoAP. Turkish J. of Electrical Eng. & Comput. Sci., 2020, 28(5).
doi: 10.3906/elk-2003-17
16. Betzler, A.; Gomez, C.; Demirkol, I. & Paradells, J. CoCoA+: An advanced congestion control mechanism for CoAP. Ad Hoc Networks, 2015, 33, 126-139.
doi: 10.1016/j.adhoc.2015.04.007
17. Najm, I.A.; Hamoud, A.K.; Lloret, J. & Bosch, I. Machine learning prediction approach to enhance congestion control in 5G IoT environment. Electron., 2019, 8(6), 607.
doi: 10.3390/electronics8060607
18. Wang, L.; Guo, S.; Huang, W. & Qiao, Y. Places205-VGGNet models for scene recognition. arXiv preprint arXiv:1508.01667, 2015.
doi: 10.48550/arXiv.1508.01667
19. Gong, S.; Naseer, U. & Benson, T.A. Inspector Gadget: A framework for inferring TCP congestion control algorithms and protocol configurations. In Network Traffic Measurement and Analysis Conference, IFIP, 2020. ISBN: 978-3-903176-27-0.
20. François-Lavet, V.; Henderson, P.; Islam, R.; Bellemare, M.G. & Pineau, J. An introduction to deep reinforcement learning. arXiv preprint arXiv:1811.12560, 2018.
doi: 10.1561/2200000071
21. Liu, J.; Han, Z. & Li, W. Performance analysis of TCP New Reno over satellite DVB-RCS2 random access links. IEEE Transactions on Wireless Communications, 2019, 19(1), 435-446.
doi: 10.1109/TWC.2019.2945952
22. Edwan, Talal A.; Iain, W. Phillips; Lin, Guan;
Jon, Crowcroft; Ashraf, Tahat & Bashar, EA Badr.
Revisiting legacy high-speed TCP congestion control
variants: An optimisation-theoretic analysis of multi-
mode TCP. Simulation Modelling Practice and Theory,
118, 2022.
doi: 10.1016/j.simpat.2022.102542
23. Zong, Liang; Han Wang & Gaofeng Luo. Transmission control over satellite network for marine environmental monitoring system. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(10), 19668-19675.
doi: 10.1109/TITS.2022.3145881
24. Chowdhury, T. & Alam, M.J. Performance evaluation of TCP Vegas over TCP Reno and TCP NewReno over TCP Reno. JOIV: Int. J. on Inf. Visualization, 2019, 3(3), 275-282.
doi: 10.30630/joiv.3.3.270
25. Chen, P. & Yu, Q. Application of improved TCP-Illinois algorithm in wireless network. Comp. Eng., 2019, 04.
doi: 10.1145/1190095.1190166
26. Xu, Z.; Zhang, Y.; Lin, X. & Wu, Y. Design of smart home router based on OpenWrt and ZigBee. Comp. Eng., 2017, 43(3), 94-98.
doi: 10.1109/CCSSE.2018.8724745
27. Yu, X.; Sun, L.; Yan, Y. & Liu, G. A short-term traffic flow prediction method based on spatial-temporal correlation using edge computing. Comp. & Electr. Eng., 2021, 93, 107219.
doi: 10.1016/j.compeleceng.2021.107219
28. Lopez-Martin, M.; Carro, B. & Sanchez-Esguevillas, A. IoT type-of-traffic forecasting method based on gradient boosting neural networks. Future Generation Comput. Sys., 2020, 105, 331-345.
doi: 10.1016/j.future.2019.12.013
29. Milo, T. & Somech, A. Automating exploratory
data analysis via machine learning: An overview. In
Proceedings of the 2020 ACM SIGMOD International
Conference on Management of Data, 2020, pp. 2617-
2622.
doi:10.1145/3318464.3383126
30. Choi, Jungjun & Xiye, Yang. Asymptotic properties of correlation-based principal component analysis. J. of Econometrics, 2022, 229(1), 1-18.
doi: 10.1016/j.jeconom.2021.08.003
31. Benesty, J.; Chen, J.; Huang, Y. & Cohen, I. Pearson correlation coefficient. In Noise reduction in speech processing, Springer, Berlin, Heidelberg, 2009, pp. 37-40.
doi: 10.1007/978-3-642-00296-0_5
32. Pargent, F.; Pfisterer, F.; Thomas, J. & Bischl, B. Regularized target encoding outperforms traditional methods in supervised machine learning with high cardinality features. arXiv preprint arXiv:2104.00629, 2021.
doi: 10.1007/s00180-022-01207-6
33. Lea, C.; Flynn, M.D.; Vidal, R.; Reiter, A. & Hager, G.D. Temporal convolutional networks for action segmentation and detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, 156-165.
doi: 10.48550/arXiv.1611.05267
34. Zhao, W.; Gao, Y.; Ji, T.; Wan, X.; Ye, F. & Bai, G. Deep
temporal convolutional networks for short-term traffic
flow forecasting. IEEE Access, 2019, 7, 114496-114507.
doi: 10.1109/ACCESS.2019.2935504
CONTRIBUTORS
Dr Vinesh Kumar Jain completed his PhD in the IoT domain in the Department of Computer Science and Engineering at Malaviya National Institute of Technology, Jaipur, India. He has published and communicated many papers in SCI journals. He has more than 18 years of experience in teaching and research. He teaches Automata Theory, Compiler Design, and Networks. He has experience managing smart devices as well as designing and developing IoT networks. He has also worked in the fields of malware analysis and security. He is a member of the consultancy committee for the Automated Driver Testing Systems, RDTC & SIGAWAL, Ajmer, Rajasthan.
He designed and developed the proposed approach, along with its implementation, for this study.
Dr Arka Prokash Mazumdar received his PhD degree from the Indian Institute of Technology, Patna. He is an Assistant Professor in the Department of Computer Science and Engineering at Malaviya National Institute of Technology. He is a coauthor of more than 40 technical contributions, including papers published in journals and conferences. His research interests focus mainly on the Internet of Things.
He helped in creating the Home IoT network for this study.
Dr Mahesh Chandra Govil was a full-time professor in the Department of Computer Science and Engineering, Malaviya National Institute of Technology Jaipur, India. He is currently the Director of NIT Sikkim. His areas of interest include real-time systems, parallel and distributed systems, fault-tolerant systems, object detection, and cloud computing.
He provided the technical review and improved the writing of the manuscript.
  • 1. 810 Congestion Prediction in Internet of Things Network using Temporal Convolutional Network: A Centralized Approach Vinesh Kumar Jain,# Arka Prokash Mazumdar$ and Mahesh Chandra Govil% # Engineering College Ajmer, Ajmer, Rajasthan - 305 025, India $ Malaviya National Institute of Technology, Jaipur, Rajasthan - 302 017, India % National Institute of Technology, Sikkim - 737 139, India E-mail: [email protected] Abstract The unprecedented ballooning of network traffic flow, specifically the Internet of Things (IoT) network traffic, has big stress of congestion on today’s Internet. Non-recurring network traffic flow may be caused by temporary disruptions, such as packet drop, poor quality of services, delay, etc. Hence, network traffic flow estimation is important in IoT networks to predict congestion. As the data in IoT networks is collected from a large number of diversified devices with unlike formats of data and manifest complex correlations, the generated data is heterogeneous and nonlinear. Conventional machine learning approaches are unable to deal with nonlinear datasets and suffer from the misclassification of real network traffic due to overfitting. Therefore, it also becomes hard for conventional machine learning tools like shallow neural networks to predict congestion accurately. The accuracy of congestion prediction algorithms plays an important role in controlling congestion by regulating the send rate of the source. Various deep learning methods, such as LSTM, CNN, and GRU, are considered in designing network traffic flow predictors, which have shown promising results. This work proposes a novel congestion predictor for IoT that uses TCN. Furthermore, we use the Taguchi method to optimize the TCN model which reduces the number of experiment runs. We compare TCN with the other four deep learning-based models concerning Mean Absolute Error (MAE) and Mean Relative Error (MRE). The experimental results show that the TCN-based deep learning framework achieves improved performance with 95.52% accuracy in predicting network congestion. Further, we design the Home IoT network testbed to capture the real network traffic flows as no standard dataset is available. Keywords: IoT; TCN; Congestion; Taguchi; Prediction; Classification Received : 31 August 2021, Revised : 26 October 2022 Accepted : 14 November 2022, Online published : 6 December 2022 1. Introduction The use of Internet-connected smart devices, otherwise called the Internet of Things (IoT), is increasing rapidly in our homes, offices, smart cities, and many other places. These IoT devices are resource constrained, and a continuous rise in the volume of network traffic may lead to network congestion.1 A high traffic volume in IoT networks leads to extra communication delay, low network throughput, and waste of computing resources and bandwidth.2 These reasons ultimately become the cause of congestion. Diagnosing network congestion and building a framework that can classify and predict it is one of the most important issues3 as it affects the decision of the routing path process and optimal bandwidth utilization. Therefore, for effective communication in the network, it is essential to classify and predict congestion and take appropriate preventive measures accordingly. Hence, research for the classification and prediction of congestion in the network, especially for IoT networks, is extremely important. 
Typically, state-of-the-art congestion prediction in IoT networks relies on conventional techniques4–6 as used in Wireless Sensor Networks (WSN). These research works are focused only on the computing revolution, including the usage of high-end computers and advanced algorithms for managing traffic and performing flow prediction. Few efforts have been made to forecast short-term traffic flow, including simulation techniques,7 mathematical models, statistical approaches, and regression techniques. The data created by IoT networks are heterogeneous since it is gathered from numerous diverse devices with different data formats and complex correlations. As a result, it also becomes very challenging for traditional machine learning technologies, such as shallow neural networks, to effectively forecast congestion. Machine learning methods attain poor performance when dealing with high dimensional state spaces in congestion control problems, and their performance does not improve with increasing data. In particular, it is expensive to measure the traffic volume when there is high traffic on the IoT network. The traditional feature selection techniques of machine learning mainly consider the port and payload as features. Reusability and unfixed assignment of port numbers make it less effective to choose a port as a dominant feature in traffic classification. In contrast, payload as a feature fails due to encrypted traffic packet inspection. The problem in IoT Defence Science Journal, Vol. 72, No. 6, November 2022, pp. 810-823, DOI : 10.14429/dsj.72.17447 © 2022, DESIDOC
  • 2. Jain, et al.: Congestion Prediction in Internet of Things Network using Temporal Convolutional Network 811 networks, like congestion prediction, is an adversarial problem due to random burst traffic and heterogeneity in devices and protocols. In such cases, machine learning techniques that learn from historical data can make the wrong prediction. Recently, statistical feature-based machine learning techniques have received attention, particularly in the field of wired and wireless networking.3,8 However, adopting machine learning as an invincible approach and using it for real-time IoT network traffic is not good. In contrast, deep learning is a centralized approach that uses basic information as direct input and automatically performs feature extraction. Deep learning eliminates the need for experts in a specific area by utilizing hierarchical feature extraction, which makes information distillation efficient, provides more abstract correlations in the data, and reduces the pre-processing effort. Deep learning algorithms can proficiently comprehend the features of a large amount of data. Parallel computation using graphics processing units (GPUs) allows deep learning to draw inferences in milliseconds. This allows for high-accuracy and timely network analysis and control, eliminating the run- time constraints of conventional mathematical approaches (e.g., convex optimization, game theory). The prediction of congestion in IoT networks has a temporal dependency. The temporal dependency means predicting congestion or a high traffic volume at a specific time instance is done based on past observations. Real-time prediction of network congestion requires continuous feeding and learning. The number of time slots may increase rapidly with time, which results in high computational complexity and also affects prediction accuracy. Recurrent Neural networks (RNN) and Long Short Term Memory (LSTM) are the most common deep learning methods to analyze time-series data. However, RNN models cannot directly calculate the long sequence of time points. In contrast, LSTM can process millions of time points, but it works sequentially, making it inefficient in generating timely responses. On the contrary, Temporal Convolutional Network (TCN), a deep learning-based model, has a parallel processing capability and training and forecasting time lower than LSTM.9 Therefore, in this work, we use the TCN model to predict congestion accurately. Our proposal also leverages the dropout technique of deep learning to cope with the overlearning problem. The proposed approach works for the real IoT network traffic for congestion prediction. In addition to state-of-the-art congestion control techniques in IoT networks, our proposal is a centralized approach toward congestion control that reduces the burden of end devices on congestion prediction. In summary, the following are the main contributions of our proposed approach. • We created a Home IoT network testbed and gathered the network traffic flows of various IoT applications because, as far as we are aware, there is no such labeled dataset of IoT network traffic available. • The traditional approaches use the brute force method to tune the hyperparameters and structure of deep learning models. Brute force approaches take a long time to train the model and are inefficient for achieving high prediction accuracy. 
The proposed approach uses a design of the experiment (DoE) based Taguchi method as a novel way to achieve the optimal structure of TCN to resolve this issue for IoT network traffic prediction. Compared with brute force approaches to tune and train the models, the Taguchi method decreases the number of trials required to tune hyperparameters and models, which results in the accuracy of forecasting results. • IoT networks have heterogeneous devices and irregular traffic flow. In such cases, when we train the model, it may be overfitted, and we do not have an accurate prediction as desired. We use the concept of regularization by applying the dropout layer to avoid overfitting. • We use a deep learning-based imputation technique to deal with missing values generated during data collection. This improves overall accuracy by converting incomplete data to complete data. • To the best of our knowledge, TCN has not previously been applied to congestion prediction and detection problems. Consequently, the work discussed in this paper is original by its very nature. The contour of the paper is as follows. Section 2 presents state-of-the-art work in the field of IoT network traffic classification and prediction. Section 3 formulates IoT network traffic problems and emphasizes the necessity of deep networks for IoT network traffic classification and prediction along with the proposed methodology. Section 4 represents the structure of TCN and hyper-parameter tuning by the Taguchi Method. In Section 5, empirical results of deep learning models about the accuracy of prediction and the role of activation functions in enhancing real-time prediction accuracy are presented. The conclusion and future scope of this work are presented in Section 6. 2. RELATED WORK Most of state-of-the-art in the area of IoT networks is related to IoT congestion control,7,10 classification of network attacks based on the traffic,11 IoT device classification,12 and traffic classification13 incorporating machine learning and deep learning. In this section, we introduce and analyze state-of-the- art. Heterogeneous vehicular networks may experience congestion due to increased resource demands for data transmission as a result of the rise in intelligent vehicle adoption and the popularity of various safety and comfort applications. Safety applications, including emergency braking systems, traffic danger alerts, and collision avoidance signals, require low latency and high bandwidth. Unfortunately, congestion often reduces network performance and customer pleasure (QoS) by reducing the quality of service. Reduced QoS is also a result of inadequate mobility models, routing protocols, and communication mediums. Falahatraftar, et al.14 proposed a congestion prediction and avoidance approach named Intelligent Congestion Avoidance Mechanism (ICAM) based on the Generalized Regression Neural Network (GRNN). The authors compared the proposed GRNN congestion prediction
  • 3. Def. SCI. J., Vol. 72, No. 6, november 2022 812 model with other widely used models, including Multiple Linear Regression (MLR), Support Vector Machine (SVM) for Regression, Decision Tree Regression (DTR), and Multi-layer Perceptron for Regression (MLPR). The proposed GRNN congestion prediction model performs than other models in terms of accuracy, dependability, and stability, according to numerical data. In addition, simulation findings demonstrate a significant network performance gain in terms of packet delivery ratio, average latency, and packet loss ratio when compared to conventional congestion control strategies. Demir, et.al15 proposed mlCoCoA, a machine learning- based enhancement in CoCoA [16], the advanced variant of the Constrained Application Protocol (CoAP). CoCoA is used to control congestion by estimating the Re-transmission Timeout (RTO) values, as shown in equation 1. RTT is the round trip time, and RTTVAR gives the difference between the successive RTT and current RTT measurements. Here, x is a variable that denotesthestrongandweakRTT.StrongRTTiscalculatedwhen an acknowledgment is received without any retransmission. Weak RTT is calculated when an acknowledgment is received after the number of retransmission. Here α, β, λ, and K are the constants, and their values are defined statically. x x x x new| RTTVAR (1 )*RTTVAR *| RTT RTT − = −β +β − x x x new| RTT (1 )*RTT *RTT − = − α + α x x x x RTO SRTT k *RTTVAR = + overall x overall RTO *RTO (1 )*RTO = λ + − λ (1) TheapproachmlCoCoAclaimsthatthedynamicprediction of the values of these constants using machine learning can increase the network throughput of the IoT network. The authors performed the set of experiments by creating the CoCoA network for collecting the ground truthα, β, λ, and K. The SVM predicts the values of these constants. The precise estimation of RTO provided by the accurate prediction of these values reduces unnecessary retransmission. Sander,10 et al. proposed DeePCCI to identify congestion based on packet arrival time (passive entity) in traffic flow. Being based on the usage of passive knowledge, it works on encrypted transport headers also. DeePCCI architecture is the combination of CNN and LSTM, which are parts of a Deep Neural Network (DNN).The only feature taken is packet arrival time, whose histogram bins of size 1 ms are generated and then fed into the DNN and extracted through 2D VGGNet-13.18 The authors used the single host, multiple hosts, and mininet- based network testbeds by using the standard Linux 4.18 Kernel to generate the network traffic. The authors evaluated the performance on various vantage points after and before the bottleneck giving the final accuracy of the chosen network in BBR, CUBIC, RENO, CUBIC-p, and RENO-p. The authors19 analyzed the performance of DeepCCI and showed that it has limited performance because the training is done using testbed- generated data that lacks the Internet’s inherent noise. Further, they observe that DeepCCI needs to be re-trained for different kernels to capture behavioral differences appropriately. Xiao,7 et al.proposed TCP-Drinc for smartly controlling congestion for the TCP variants. The authors use Deep Reinforcement Learning20 which learns from experience. Congestion is controlled by adjusting the window size. The authorsusedthedeepconvolutionsneuralnetworkandextracted stable features from abundant but noisy data measurements. 
TCP-Drinc is compared with the various versions of TCP, i.e., TCP-New Reno,21 TCPCubic,22 TCP-Hybla,23 TCP-Vegas,24 and TCP-Illinois.25 TCP-Drinc yields maximum throughput and the second lowest Round Trip Time for the entire period of propagation delay. TCP-Drinc has a higher round trip time than TCP-Vegas. Najm,17 et al.proposed a novel machine learning-based model built upon a decision tree algorithm for predicting congestion in 5G IoT networks. The authors determine the optimal congestion window using specific network conditions, including high throughput, high congestion window, large queue size, and low loss. The authors applied C4.5, RepTree, and Random Tree-based decision tree approaches along with clustering and stacking approaches. The results show that the C4.5decisiontreealgorithmhasthebestperformancecompared to other machine learning algorithms; The authors present a tree-based graph that suggests the optimal path through which congestion can be controlled. The authors compared their model with the original Stream Control Transmission Protocol (SCTP) and claimed a 14.5 % improvement in performance. Doshi,11 et al.worked on detecting distributed denial of service (DDoS) attacks using well-known machine learning algorithms in the IoT network. A large number of Botnets like Mirai leveraged vulnerable IoT devices for performing DDoS attacks in IoT networks. The network includes IoT devices such as home gateway routers and other middleboxes for keeping track of traffic flow between IoT devices on the local network and the rest of the Internet. Binary classification is applied to separate the data for both DDoS attacks as 0 and 1. Data is collected from the devices, such as home cameras and Bluetooth devices connected to blood pressure monitors, all connected to the router via Raspberry Pi. Non-IoT traffic is filtered out from the pcap files. Sivanathan,12 et al. proposed a robust approach for classifying IoT devices using traffic characteristics that are acquired at the network level. A setup with 28 IoT devices and a few non-IoT devices was created to validate the proposed approach. OpenWrt was used to monitor the traffic flow for 26 weeks.26 Each of the devices showed some pattern for different network characteristics. Among all the features, nine features were selected to perform the classification: a bag of a port number, a bag of domain names, a bag of cipher suites, flow volume, DNS interval, flow duration, sleep time, flow rate, and NTP interval. All the data with these features were fed into a two-stage machine learning model, which predicted the confidence for a particular type of IoT device. The first stage, called Stage 0, implements the Naive Bayes Multinomial Classifier independently for features, viz., a bag of the port number, bag of domain names, and bag of cipher suites. Each of these features returned moderate accuracy for different devices. Then the output of all these three classifiers is fed into the second stage, i.e., Stage 1. In Stage 1, the outputs from the Naive Bayes Multinomial Classifier and the remaining six features were fed into the Random Forest Classifier,
  • 4. Jain, et al.: Congestion Prediction in Internet of Things Network using Temporal Convolutional Network 813 which provided the required confidence for each input. This confidence is then mapped to the IoT device’s classes, and the authors obtain the desired classification of IoT devices. Lopez, et al.13 proposed a network traffic classifier (NTC) to predict the network services among different communication protocols. The proposed network traffic classifier considers the flow-based features, e.g., source and destination ports and bytes transmitted per packet. The authors used the different deep learning models and their combinations (RNN, CNN, and CNN+RNN) to infer the network services. NTC uses the RedIRIS dataset consisting of 266,160 network flows from 108 services to validate the approach. The results show that the combination of CNN+RNN has the best accuracy in terms of F1 score, precision, and recall. Table 1 summarises each of the papers discussed in this section considering category, classification methods used, selected features, traffic (simulated or real), and year. Paper Category DL Method Features Traffic Year Falahatraftar[14] Congestion Control Regression Neural Network No. of vehicles, data rate, transmission power, and bandwidth OMNeT++ 2022 Demir [15] Congestion Control SVM RTT, RTO Cooja Network 2020 Sander[10] Congestion Control RNN+LSTM Packet arrival Mininet 2019 Xiao[7] Congestion Control Deep Reinforcement Learning Size of congestion window, RTT, inter-arrival time of ACK NS-3 2019 Najm[17] Congestion Control Decision tree CWND, Throughput, Queue Size, Packet Loss NS-3 2019 Doshi[11] DDoS attacks KNN,LSVM,DT,RF Packet header fields, flow information over very short time windows Consumer IoT device network 2018 Sivanathan[12] IoT device classification Multi-stage machine learning Activity cycles, port numbers, signaling patterns, and cipher suites Real network Traffic 2018 Lopez [13] Protocol-wise traffic Classification CNN+RNN Source port, destination port, payload size, TCP window size, inter-arrival time, and flow direction Real data from RedIRIS 2017 Table 1. State-of-the-art comparison 3. PROBLEM STATEMENT AND METHODOLOGY Real-time traffic volume prediction plays a vital role in proactive network traffic management. Many forecasting models have been proposed to address this issue.26-27 However, most of them suffer from the inability to fully utilize the rich informationintrafficdatatogenerateefficientandaccuratetraffic predictions from longer-term data (e.g., seven-day predictions at a five-minute interval). The accuracy of prediction of traffic flow depends on the availability of historical data along with a reliable technique that can filter the intrinsic properties from the dataset to precisely forecast future flows. Let, the IoT network has the N number of IoT devices. R is a volume of real network traffic at a time t at the gateway node represented by (R, t). R at (t + n) is the volume of real network traffic for the next time point t + n and is represented by (R, t + n). The traffic volume prediction problem at time t is defined to find the predictor of (R, t) via a sequence of historical or training traffic datasets (Rt−1 , Rt−2 , Rt−3..., Rt−T ). The key problem here is to develop or apply a model that defines the inherent relationships between the historical or training dataset and prediction of network traffic so that the network system can be able to accurately predict the traffic volume to control congestion. 
3.1 METHODOLOGY This section describes a proposed methodology for predicting congestion in the IoT network as shown in Fig. 1. The big issue in our work was the lack of a publicly available labeled dataset of congestion scenarios for IoT networks. To the best of our search, we could not find any such dataset to solve the problem statement. Therefore, as the first step, we set up our own IoT home testbed, as shown in Fig. 2. There are two main reasons to design a home IoT testbed (i) to capture network traces from real-world IoT devices and (b) to develop a realistic smart environment for evaluating the proposed approach. Figure 2 describes the proposed IoT testbed environment. In our IoT network setup, we use the computer (Ubuntu installed) having two network interface cards (NIC1, NIC2). NIC 1 serves as a gateway and connects to the Internet, while NIC 2 uses an L2 (S.No 13) switch to connect to the rest of the network. We use a 4G TP-Link MR6400 (S.No 14) router running CoAP, DHCP, NAT, and DNS forwarding. The public Internet is accessible through this TP-Link. An Unifi Access Point (S.No 12) is connected to the L2 switch. The devices that are connected with Ubiquiti Access Points are as follows. • General-purpose hubs (3, 7) • Several consumer electronics (4, 5, 6), • Two smart plugs (9) can accommodate extra offline devices • Environmental devices (1, 8, 2). With the L2 switch, the device-specific hubs (10 and 11) are linked. These hubs control the devices that are plugged into them. L2 switch routes all incoming and outgoing traffic,
  • 5. Def. SCI. J., Vol. 72, No. 6, november 2022 814 which mirrors all testbed traffic on the switch port attached to the Ubuntu PC. On the Ubuntu PC, we set up the following packages: Wireshark (for traffic capture), block-mount package kmod-USB-core (for mounting external USB base memory), and package kmod-USB-storage (for storing the network traffic traces on USB storage). A bash script is run to automate the data collecting and storage. This execution of script records pcap files on an external USB storage device that is connected to the PC. We collected the network traffic traces after the testbed has been set up. We introduced a burst into the network to generate congestion because the goal is to predict network congestion. We collect the network traces in the form of pcap files using Wireshark. This network data contains traffic traces and session logs of different devices at network layers as per the requirement of applications. For example, congestion detection problem in IoT network often needs datasets of packet-level traces labeled along with corresponding classes. Wireshark’s filters are used to segregate congestion states from the normal traffic states. We filtered congestion traffic with the help of Bad TCP, ICMP errors, Retransmission, and Duplicate ACK filters and stored this traffic in the pcap file named “congestion.” The rest of the traces of normal traffic are stored in another pcap file named “normal.” Figure 1. Methodology. Figure 2. IoT home testbed for data collection. We use CICFlowmeter to convert these two pcap files to comma-separated values (CSV) files. CICFlowMeter is a network traffic flow generator and analyzer that produce 83 network traffic features from the pcap files. Hence, there are 83 columns in the CSV file. CICFlowMeter provides bidirectional flows from source to destination and destination to a source with duration, number of packets, number of bytes, length of packets, etc. These features are calculable separately in both directions. These columns contain low-level features (Flow ID, Source IP, Destination IP, Source port, Destination port, and protocol) and other high-level features. We added class labels (Congestion, No Congestion) in the last column of both CSVs. WeperformtheExploratoryDataAnalysis(EDA)29 fordata pre-processing. EDAhelps to find various information about the data, such as missing values, normalization, mean categorical data, median, distribution of data, and correlations. We carry out the EDA using the Profiling tool. After removing all the noise, inconsistency, and missing values, we obtain a complete and accurate dataset. Now, we start looking for relations among various features of the data. The feature selection process can be done manually or using some mathematical or algorithmic techniques. In the case of manual feature selection, one must
  • 6. Jain, et al.: Congestion Prediction in Internet of Things Network using Temporal Convolutional Network 815 have a thorough understanding of the features, the relationship between them, and how one feature group would perform better than the other feature groups. On the contrary, algorithmic feature selection lowers the role of manual efforts. In addition, compared to manual feature selection, it avoids errors in feature selection and takes less time to identify the most trustworthy features. Therefore, algorithmic feature selection is preferred over manual selection. In this study, we employ a correlation matrix and Principal Component Analysis (PCA), a popular feature selection algorithm, to analyse the relationship between various features inthedataset.30 Thecorrelationmatrixisasymmetricmatrixthat defines the relationship between various fields. In mathematical terms, a correlation matrix is a matrix of correlation coefficients between a set of variables. The diagonal of the correlation matrix is filled with 1s because the correlation between the same variables is 1. The coefficient is between -1 and 1, where 1 shows a 100 % positive linear correlation, while -1 depicts a 100 % negative linear correlation. The correlation matrix is a symmetric matrix that defines the relationship between various features. In mathematical terms, a correlation matrix is a matrix of correlation coefficients between a set of features. The coefficient is between -1 and 1, where 1 shows a 100 % positive linear correlation, while -1 depicts a 100 % negative linear correlation. The diagonal of the correlation matrix is filled with 1s because the correlation between the identical features is 1. Furthermore, 0 means the two features are not correlated. The Pearson correlation coefficient31 metric is used to evaluate the correlation between the features. It measures the linear correlation between the two features using Eqn. 2. We are using pandas profiling to find the highly correlated features. ( ) ( ) ( ) ( ) Cov X,Y X,Y = Var X Var Y ρ (2) In our work, the correlation is considered high if the Pearson correlation coefficient is larger than 0.9 or less than -0.9. We started with the dataset consisting of 83 features. We dropped 30 highly correlated features, along with the features with constant values such as 0 simultaneously. After this, there were still 53 features and one field for the label. All 53 of these features were useless for our prediction, thus, we eliminated all irrelevant information and retained only those necessary for the system to learn. The following columns, as indicated in Table 2, are removed during the learning process based on the results of pandas profiling. Flows ID, Destination IP, Source IP, and Timestamp are categorical and sparse features that increase the space and time complexity of the models. The historical time series data, which contains a timestamp, is used to predict the future. As a result, we retain only the timestamp and drop the Flow ID, Destination IP, and Source IP. We use one-hot encoding32 to transform this categorical feature into a numerical attribute. Now, we have 50 features in our dataset. For feature selection, we used PCA. On applying PCA to our dataset, we got useful features such as Flow Duration, Fwd Pkt Len Mean, Flow Pkts/s, Bwd IAT minimum, Fwd Pkts/s, and Bwd Pkts/s. Initially, we decided to go with these seven features along with a timestamp for prediction and classification. 
However, since deep learning approaches are capable of recognizing features automatically from data, we are not overly concerned with selecting more features. We also study IoT traffic flow characteristics, including flow size, flow duration, flow rate, and inter-arrival time (∆T) in forward and backward directions. These characteristics will also help advance prediction of congestion on the nth timestamp. We also use inter-arrival time (∆T) and flow rate as the target variable for advance prediction. We provide definitions for some terms related to network traffic flow. • Flow size is the number of bytes sent with the payload, including retransmissions. • We can measure flow duration as the difference between the timestamp of the first packet and the last packet in a flow. • Flow rate is the ratio of flow size to flow duration and is represented with flow packets/second Flow size is a valuable statistic that is used to optimize routing algorithms, load balancing, and scheduling in IoT networks. Predicting flow size is especially difficult because flow patterns change constantly, and calculations must be made in real-time (milliseconds) to prevent delays. This is because the size of flows varies greatly (from small mice flows of a few kilobytes to large flows of several gigabytes), as Active Max Subflow Bwd Packets Bwd IAT Max Active Min Idle Min Fwd Bulk Rate Avg Bwd PSH Flags Packet Length Min Bwd URG Flags Idle Max Fwd IAT Total Bwd Bytes/Bulk Avg Bwd Segment Size Avg Fwd Packet/Bulk Avg Fwd URG Flag URG Flag Count Active Mean ECE Flag Count Fwd Bytes/Bulk Avg ACK Flag Count Fwd Packets/s Bwd Packet Length Std Subflow Bwd Bytes Fwd IAT Std Bwd IAT Std Fwd IAT Mean Active Std Packet Length Mean Fwd IAT Max Fwd Segment Size Avg Table 2. Features to be dropped
Since large flows hold the route for a longer time (flow duration) than small flows, there is a chance that the number of active flows will become unbalanced due to a concentration of large flows on the routing path.
The quantity of available data determines the selection of a deep learning model. The data patterns may or may not be consistent due to the irregular traffic flow, so the chance of overfitting may be high when we collect data from these networks. Model selection in such cases is therefore crucial; for example, accurate congestion prediction can suggest how much the sending rate of a node in the IoT network should be reduced. Deep learning algorithms may lower the chances of overfitting. Motivated by the excellent performance of deep learning techniques in image and natural language processing, we select the TCN, LSTM, Gated Recurrent Unit (GRU), Stacked Autoencoder (SAE), and CNN-LSTM models for the proposed IoT network traffic classification and prediction approach.33 These techniques are capable of automatically learning the features. Model validation verifies the overall accuracy of the model in terms of overfitting and underfitting; this verification helps to optimize the model by increasing the volume of data and reducing the complexity of the model when the data overfits. Analysis of wrongly classified samples helps to find the reasons for errors and to check the suitability of the model and features. The processed dataset is divided into three parts: training (60 %), validation (20 %), and testing (20 %). We train different models on the training data using various combinations of hyper-parameter values. Further, re-training is done on different training and validation dataset combinations using the same set of hyper-parameters, and the accuracy is calculated on the testing dataset. We observe overfitting initially, as the training score is high but the validation score is low, which shows that the model has adapted too closely to the specific training dataset. However, the trial-and-error process of training a model this way is time-consuming. Therefore, we use the Taguchi method to optimize the hyper-parameters and train the model; this method reduces the number of trials needed to optimize the hyper-parameters.

4. MODEL CONSTRUCTION AND PARAMETER TUNING
In this section, we present the basic architecture for convolutional sequence forecasting and the hyper-parameter tuning that is an integral part of the neural network. The IoT network traffic flow series of length T with input sequence {p_1, p_2, ..., p_T} is utilized for predicting the output sequence of traffic flow {q_1, q_2, ..., q_T}. Here, the observed data {p_1, p_2, ..., p_t} is the input to the model for the forecast of the output q_t at time t. Equation (3) defines the prediction function f : P_t → Q_t for the traffic flow.

{q_1, q_2, ..., q_T} = f({p_1, p_2, ..., p_T})    (3)

Here, the function f must hold the condition that q_t depends only on the historical data {p_1, p_2, ..., p_{t-1}} rather than on {p_{t+1}, p_{t+2}, ..., p_T}. Equation (4) defines the function to forecast the network traffic flow q_t at time t as follows.

q_t = f({p_1, p_2, ..., p_t})    (4)
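Before detailing the TCN itself, the causal setup of Eqns (3)-(4) and the 60/20/20 chronological split can be made concrete with a short windowing sketch; the window length of 48 steps is an illustrative assumption rather than a value from the study.

```python
# Sketch of the causal forecasting setup of Eqns (3)-(4): each target q_t is paired only
# with past observations, and samples are split 60/20/20 in time order as described above.
import numpy as np

def make_causal_windows(series: np.ndarray, history: int = 48):
    """X[i] holds {p_(t-history), ..., p_(t-1)} and y[i] holds q_t, so no future values leak in."""
    X = np.stack([series[t - history:t] for t in range(history, len(series))])
    y = series[history:]
    return X[..., np.newaxis], y            # shapes (N, history, 1) and (N,)

def chronological_split(X, y, train=0.6, val=0.2):
    i, j = int(len(X) * train), int(len(X) * (train + val))
    return (X[:i], y[:i]), (X[i:j], y[i:j]), (X[j:], y[j:])
```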
4.1 Temporal Convolutional Network
TCN34 is a variation of CNN designed for sequence modeling tasks; it combines features of the RNN and the CNN. TCN uses causal convolutions and dilated convolutions. The causal convolution is used for temporal data and ensures that the model does not violate the order in which the data is modeled: the prediction relies only on historical data, not on future observations. The TCN model inherits from 1-D fully convolutional networks the property of generating an output of the same length as the input, and subsequent layers maintain the same length by adding zero padding, which avoids information leakage. Accurate prediction of the traffic flow requires a long history and a very deep network, which makes the structure of the model complicated and computationally intensive. The integration of dilated convolutions and residual layers into the TCN solves this issue. Dilated convolutions, in particular, allow for an exponentially large receptive field that covers a long history. Equation (5)34 shows the dilated convolution operation F on an element s of the sequence for a given 1-D sequence input p ∈ R^n and a filter f : {0, ..., k−1} → R.

F(s) = (p ∗_d f)(s) = Σ_{i=0}^{k−1} f(i) · p_{s−d·i}    (5)

Here, d represents the dilation factor, k denotes the size of the filter f, and the index s − d·i reaches back into the past, thereby maintaining the history. With a dilation factor d = 1, the dilated convolution in Eqn. (5) works as a standard convolution. Increasing d exponentially with the depth of the network enlarges the receptive field so that a long history is covered, letting the output at the top level represent a large range of inputs. The filter size k can also be used to increase the receptive field of the TCN; however, this increases the parameter count and the execution time, making the model slow and prone to overfitting. The residual connection is the next important part of the TCN: it includes a branch with a series of transformations F whose output is added to the input p of the block. Figure 3 shows the diagram of the residual block. This block contains two weight layers along with the Rectified Linear Unit (ReLU) activation function. Moreover, regularization is achieved by connecting a spatial dropout layer after the last weight layer. Equation (6) defines the residual block mathematically.

q = F(p, W_i) + p    (6)

Here, q is the output of the block. The function F(p, W_i) is the residual mapping to be learned, where W_i denotes the weights of the ith layer. For two layers of a TCN, the function is F = W_2 σ(W_1 p) + e, in which σ and e represent the ReLU activation and the bias, respectively.

Figure 3. TCN residual block.
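One residual block of this kind could be written in Keras roughly as follows. This is a generic sketch of Eqns (5)-(6), not the authors' exact layer configuration, although the defaults mirror the kernel size (10), filter count (24), and dropout (0.5) reported later in this section.

```python
# Sketch of one TCN residual block in Keras: two causal, dilated 1-D convolutions with
# ReLU and spatial dropout, plus a skip connection. A generic illustration only.
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters=24, kernel_size=10, dilation=1, dropout=0.5):
    shortcut = x
    for _ in range(2):                                   # two weight layers per block
        x = layers.Conv1D(filters, kernel_size,
                          padding="causal",              # the output never sees future inputs
                          dilation_rate=dilation)(x)
        x = layers.Activation("relu")(x)
        x = layers.SpatialDropout1D(dropout)(x)
    if shortcut.shape[-1] != filters:                    # match channel counts before adding
        shortcut = layers.Conv1D(filters, 1, padding="same")(shortcut)
    return layers.Activation("relu")(layers.Add()([x, shortcut]))
```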
To predict and detect network congestion in a home IoT network, we use the TCN architecture. The TCN architecture is made up of a collection of blocks, and each block contains a set of N convolutional layers. The layers consist of dilated convolutions with a dilation factor d and an associated nonlinear activation function f(·). The convolution outcome is integrated with the layer's input through a residual relation in each dilated convolution. Equation (7) gives the shape of the activation of the ith layer and jth block.

S^(i,j) ∈ R^(F_w × T)    (7)

Here, we note that every layer has the same number of filters F_w. We define the output of the dilated convolution at time t as Ŝ_t^(j,l) and the convolution result after the residual connection as S_t^(j,l), as shown in Eqn. (8).

Ŝ_t^(j,l) = f(W_1 S_t^(j,l−1) + W_2 S_(t−d)^(j,l−1) + b)
S_t^(j,l) = S_t^(j,l−1) + V Ŝ_t^(j,l) + e    (8)

Here, W_1 and W_2 denote the weight parameters, V ∈ R^(F_w × F_w) is the set of weights, and e ∈ R^(F_w) represents the biases for the residual connection. The outputs of the blocks are summed by a group of skip connections with Z^0 ∈ R^(F_w × T) satisfying Eqn. (9).

Z_t^0 = ReLU( Σ_{j=1}^{B} S_t^(j,L) )    (9)

The latent result Z_t^1 is ReLU(V_r Z_t^0 + e_r) for the weight matrix V_r and the bias e_r. We represent the forecast through Eqn. (10).

q̂_t = softmax(U Z_t^1 + c)    (10)

Here, U ∈ R^(C × F_w) is the weight matrix and c ∈ R^C is the bias. Equation (11) shows the objective function of the model; we aim to minimize the value of L while training on the data. At time t, q̂_t gives the forecasting result, and the initial traffic flow at time t is represented by F(t).

L = (1/t) Σ_{i=0}^{t} ( q̂_i − F(i) )²    (11)

Figure 4 shows the TCN model architecture, which uses 19 convolutional layers and a fully connected layer. The convolutional layers are shown in a rectangular area with a grey background. In Fig. 4, the grey rectangles in the shaded rectangle show that the result of convolutional layer 1 is applied as the input of convolutional layer 3 directly through the residual structure. A kernel size of 10 and 24 filters are used for all the convolutional layers. The TCN model is trained with the processed training data. The final TCN with the enhanced configuration is transmitted to the central server, depending on the feedback of the historical traffic flow data sequence, and is used for forecasting packet flows/sec in the network for the next 30 minutes, given the past traffic flow in the dataset.

Figure 4. TCN architecture.

The parameters are configured as follows when training the TCN model: batch size 128, 30 epochs, dropout 0.5, and an initial learning rate of 0.002. We use the stochastic gradient descent technique and decrease the learning rate during training.
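These training settings could be wired up as in the sketch below; the exact decay schedule and loss wiring are assumptions, since the text states only that the learning rate is decreased during training.

```python
# Sketch: compiling and training with the settings reported above (batch size 128,
# 30 epochs, initial learning rate 0.002, stochastic gradient descent with a decreasing
# learning rate). The decay schedule and loss choice are assumptions.
import tensorflow as tf

def train(model, X_train, y_train, X_val, y_val):
    schedule = tf.keras.optimizers.schedules.ExponentialDecay(
        initial_learning_rate=0.002, decay_steps=1000, decay_rate=0.9)
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=schedule),
                  loss="mse",            # squared-error objective in the spirit of Eqn (11)
                  metrics=["mae"])
    return model.fit(X_train, y_train, validation_data=(X_val, y_val),
                     batch_size=128, epochs=30)
```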
4.2 Hyper-parameter Tuning
Building a deep learning model for network traffic flow prediction requires the best combination of neural network parameters, such as the kernel size, the number of filters, the dilation list, and the number of residual block stacks. Traditionally, a trial-and-error approach is used to find the best combination of these hyper-parameters. However, it is inefficient in yielding high precision and is time-consuming because it requires a huge number of trials. This prompted us to look for more coherent alternatives to the traditional trial-and-error approach. Zhao, et al.34 used the Taguchi approach to accurately predict short-term traffic movement and identify congestion sites for intelligent transportation systems (ITS) in smart cities. We believe that the problem of road traffic congestion is similar
to the problem of network congestion, so we also incorporate the Taguchi method to determine the optimized values of the hyper-parameters of the deep learning model for IoT network congestion prediction. We have executed a small series of experiments to see whether we could improve the structure of the deep learning models trained on the training dataset. The trial-run results show the disparity between the expected and the real performance, and the findings of the trial runs help determine the most suitable hyper-parameter values for efficient performance. The topology of the deep learning-based TCN model is then determined. The Taguchi approach is divided into three parts: (i) identification of the design factors, (ii) trial design and performance metrics, and (iii) optimization of the model structure and performance analysis.

The first step of the Taguchi method is identifying the design factors and their corresponding levels; these design factors greatly influence the topology of the deep learning models. The number of filters used in the deep learning models is the first design factor. The state of the art shows that the most frequently used numbers of filters are 6, 12, and 24; we follow the same practice and use 6, 12, and 24 filters at Levels 1, 2, and 3, respectively. The kernel size is the second design factor. We use kernel sizes of 10, 15, and 20 at each layer of the deep learning model for Levels 1, 2, and 3, respectively. The dilation list is the third design factor; it determines the size of the deep learning model. We set list A as {2^0, 2^1, 2^2, 2^3, 2^4}, list B as {2^0, 2^1, 2^2, 2^3, ..., 2^9}, and list C as {1, 2, 3, ..., 9} for Levels 1, 2, and 3, respectively. The number of stacks of residual blocks is the fourth design factor. The state of the art suggests using 2 to 9 stacks of residual blocks; we set 2, 6, and 9 stacks for Levels 1, 2, and 3, respectively.

We assess the performance of the deep learning models for congestion prediction using the Mean Absolute Error (MAE) in Eqn. (12) and the Mean Relative Error (MRE) in Eqn. (13). These performance metrics show the absolute and the relative difference, respectively, between the actual network flow and the one predicted by the model.

MAE = (1/n) Σ_{i=0}^{n} | f_i − f̂_i |    (12)

MRE = (1/n) Σ_{i=0}^{n} | f_i − f̂_i | / f_i    (13)

Here, f_i and f̂_i are the actual and predicted traffic flow, respectively, and n is the number of forecast points. Lower MAE and MRE indicate a smaller difference between the actual and predicted flow, i.e., better prediction accuracy.

As we have three levels and four design factors, a full factorial design requires a total of 81 trials. Since the prediction problem requires a large amount of historical data, a full factorial design would be time-consuming; therefore, we use a fractional factorial design. Considering the four design factors with three levels each, we build an orthogonal array A16(3^4) for the trial design. We optimize the structure of the deep learning model by executing the combinations of design factors and their levels in 16 trials. Every row of the orthogonal array A16(3^4), corresponding to a main trial, develops a deep learning model using the training dataset of the home IoT testbed. Table 3 shows the results of the 16 main trials over the seven days (March 1-7, 2021), along with the average result on the testing dataset.
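The two metrics of Eqns (12)-(13) and the design-factor levels can be expressed compactly as below; the dictionary layout is only an illustrative way of organizing the factors, not code from the study.

```python
# Sketch: MAE and MRE as in Eqns (12)-(13), plus the four Taguchi design factors and
# their three levels. The dictionary layout is an illustrative assumption.
import numpy as np

def mae(actual, predicted):
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.mean(np.abs(actual - predicted))

def mre(actual, predicted):
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return np.mean(np.abs(actual - predicted) / actual)

design_factors = {
    "i: number of filters":      [6, 12, 24],
    "ii: kernel size":           [10, 15, 20],
    "iii: dilation list":        [[2**j for j in range(5)],    # list A
                                  [2**j for j in range(10)],   # list B
                                  list(range(1, 10))],         # list C
    "iv: residual block stacks": [2, 6, 9],
}
```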
The accuracy of the deep learning models is evaluated using the data collected from the Home IoT testbed over seven days with 16 main trials. The results in Table 3 show that the smallest MAE and MRE were obtained in the 16th trial, with 24 filters, 2 stacks, a kernel size of 15, and dilation list A. The top three trials in Table 3 are the 7th, 11th, and 16th main trials, all with dilation list A, which indicates that dilation list A gives better accuracy than the other two lists. Main trials 1, 4, and 5, with 6 filters and a small kernel size, perform poorly in terms of MAE and MRE. This shows that the performance of the deep learning models varies with the combination of parameters.

Every trial under the Taguchi method has orthogonal combinations; hence, we can separately analyze the effect of each design factor. The effect of a design factor at a specific level is determined by averaging the respective values in Table 3. In particular, design factor i is at Level 2 in the 6th, 7th, 8th, 9th, and 10th main trials, and its average effects are 23.9428 (MAE) and 0.121 (MRE). Table 4 shows the effects of all four design factors at their three levels. The sensitivity of the design factors is also determined using range analysis, which is defined as the difference between the highest and lowest performance values for each design factor; Table 4 displays the sensitivity data. Sorting the sensitivity values of the four design factors from high to low gives the order i > ii > iv > iii in terms of MAE, and the same order, i > ii > iv > iii, in terms of MRE. These orders show that design factor i has the largest effect on MAE and MRE; hence, it is considered the primary factor of the TCN forecasting model for better performance. Tables 3 and 4 thus help to determine the optimized parameters of the TCN model for IoT network traffic flow forecasting in the Home IoT testbed: the values of design factors i, ii, iii, and iv at Level 2. The TCN model with 12 filters, a kernel size of 15, dilation list B, and 6 residual block stacks gives the highest forecasting accuracy among all the Taguchi experimental trials.

Table 3. Orthogonal array A16(3^4) and experimental outcomes
Trial | Levels (i, ii, iii, iv) | Day 1st MAE / MRE | Day 3rd MAE / MRE | Day 5th MAE / MRE | Avg. of 5 days MAE (rank) | Avg. of 5 days MRE (rank)
1  | 1, 1, 1, 1 | 65.783 / 0.899 | 17.12 / 0.335 | 44.082 / 0.125 | 59.532 (16) | 0.447 (16)
2  | 1, 2, 2, 2 | 20.991 / 0.255 | 0.128 / 0.004 | 3.682 / 0.048  | 21.732 (7)  | 0.1 (5)
3  | 1, 3, 3, 3 | 23.323 / 0.286 | 1.387 / 0.029 | 1.485 / 0.02   | 24.258 (10) | 0.11 (8)
4  | 1, 1, 2, 3 | 293501 / 0.348 | 5.987 / 0.144 | 8.229 / 0.108  | 29.729 (15) | 0.165 (14)
5  | 1, 1, 3, 1 | 33.211 / 0.396 | 7.532 / 0.168 | 2.621 / 0.033  | 28.502 (14) | 0.164 (13)
6  | 2, 1, 2, 3 | 23.334 / 0.275 | 1.876 / 0.044 | 1.997 / 0.025  | 20.936 (4)  | 0.104 (7)
7  | 2, 2, 1, 3 | 16.305 / 0.201 | 1.249 / 0.027 | 4.948 / 0.066  | 20.326 (3)  | 0.093 (3)
8  | 2, 3, 3, 1 | 34.147 / 0.397 | 1.771 / 0.038 | 5.597 / 0.071  | 28.396 (11) | 0.146 (11)
9  | 2, 1, 3, 3 | 29.527 / 0.342 | 4.212 / 0.095 | 11.494 / 0.149 | 28.5 (12)   | 0.162 (12)
10 | 2, 3, 1, 3 | 19.635 / 0.229 | 1.916 / 0.044 | 3.414 / 0.043  | 21.556 (5)  | 0.1 (5)
11 | 3, 3, 1, 2 | 13.916 / 0.165 | 2.838 / 0.064 | 4.823 / 0.062  | 20.029 (2)  | 0.092 (2)
12 | 3, 3, 2, 1 | 39.97 / 0.466  | 9.729 / 0.218 | 9.857 / 0.128  | 28.755 (13) | 0.197 (15)
13 | 3, 1, 1, 1 | 1.488 / 0.018  | 5.184 / 0.116 | 9.103 / 0.117  | 21.785 (6)  | 0.094 (4)
14 | 3, 1, 2, 1 | 22.765 / 0.255 | 0.436 / 0.015 | 5.314 / 0.067  | 24.963 (9)  | 0.115 (9)
15 | 3, 2, 2, 2 | 33.132 / 0.387 | 4.307 / 0.094 | 2.675 / 0.033  | 24.834 (8)  | 0.143 (10)
16 | 3, 2, 1, 1 | 20.191 / 0.234 | 0.306 / 0.008 | 1.817 / 0.025  | 16.156 (1)  | 0.081 (1)
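The per-level averaging and the range analysis described above (for instance, the 23.9428 average MAE for factor i at Level 2) can be reproduced from the Table 3 averages; the sketch below shows the idea on a truncated list of trials.

```python
# Sketch of the Taguchi range analysis: average each design factor's score per level,
# then sensitivity = max(level average) - min(level average). Only the first three trials
# of Table 3 are shown as sample input; the remaining rows follow the same pattern.
from collections import defaultdict

# (levels of factors i..iv, average MAE) taken from Table 3
trials = [((1, 1, 1, 1), 59.532), ((1, 2, 2, 2), 21.732), ((1, 3, 3, 3), 24.258)]

def range_analysis(trials, factor_index):
    per_level = defaultdict(list)
    for levels, score in trials:
        per_level[levels[factor_index]].append(score)
    level_avg = {lvl: sum(v) / len(v) for lvl, v in per_level.items()}
    sensitivity = max(level_avg.values()) - min(level_avg.values())
    return level_avg, sensitivity
```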
Table 4. Sensitivity values for design factors
            | MAE: i   | ii      | iii     | iv      | MRE: i | ii     | iii    | iv
Level 1     | 32.6     | 30.5638 | 26.564  | 29.727  | 0.1972 | 0.1787 | 0.1511 | 0.1777
Level 2     | 22.3428  | 20.762  | 25.1581 | 22.1983 | 0.121  | 0.1042 | 0.1373 | 0.1116
Level 3     | 22.7536  | 24.5988 | 27.4135 | 24.2175 | 0.120  | 0.129  | 0.1455 | 0.1223
Sensitivity | 10.2572  | 9.8018  | 2.2554  | 7.5287  | 0.0772 | 0.0745 | 0.0138 | 0.0661

5. RESULTS
On the basis of the prepared traffic dataset, we performed several experiments. This work compares TCN with four other deep learning methods commonly used as traffic flow predictors, namely LSTM, GRU, SAE, and CNN-LSTM, with respect to two performance metrics, MAE and MRE, to show the TCN model's prediction accuracy. All experiments are implemented in TensorFlow, and the performance metrics are calculated using the scikit-learn package. We use an i7-9700 desktop computer with 16 GB of RAM for all experiments. Here, the IoT traffic includes two types of traffic: first, traffic that flows through the devices autonomously, independent of user activity; second, traffic generated by user communication with the devices, e.g., the Airveda Smart Air Quality Monitor responding to an Android user, the Amazon Alexa Echo, the Neato vacuum cleaner, and so on. Flow rate and inter-arrival time are the attributes used for the prediction of congestion in our approach.

We further analyze the estimation of congestion in live network traffic to test the success of the deep learning methods. The TCN-based approach effectively anticipates congestion conditions in contrasting situations when compared against the ground-truth information. As the nature of our IoT network traffic is unbalanced at some points, the error increases slightly during peak traffic hours. However, adding a dropout layer to the model for regularization decreases the gap between the actual and predicted values during peak traffic hours. We therefore take a closer look at prediction when the network traffic flow is high. Figure 5 shows that at 1:30 pm and 3:45 pm there is low and high network traffic, respectively; the error increases in these fluctuations, but the TCN model still follows the traffic trend.

Figure 5. Observation of traffic flow using TCN.

The inter-arrival packet time is an important feature: packets are received at regular intervals in the case of normal traffic, whereas under congestion the reception of packets has an almost zero inter-arrival time. The first derivative d(∆T)/dt and the second derivative d²(∆T)/dt² of ∆T capture the difference between normal traffic and congestion. We have also predicted the inter-arrival time of the packets at the gateway; Fig. 6 shows the prediction of the average inter-arrival time of the packets on 15 March 2021. The MAE, MRE, and forecasting accuracy (in %) of the packet flow/sec with the different models (TCN, GRU, LSTM, SAE, and CNN-LSTM) are shown in Table 5. The results show that TCN has better forecasting accuracy than the other deep learning-based models for predicting IoT network traffic. It is worth mentioning that the TCN model, when combined with Taguchi hyper-parameter optimization, achieves a forecasting precision of approximately 95 %, which is 15 % higher than the LSTM deep learning model and 10 % to 12 % higher than CNN-LSTM, SAE, and GRU.
Furthermore, the MAE and MRE values for the TCN model are lower than those of the other deep learning models, implying that the TCN model's forecasting accuracy is superior. The MAE of the TCN model is 7.4237, which is much lower than LSTM's 28.6085 and CNN-LSTM's 26.5688. In the case of MRE, the TCN model again outperforms, with a value of 0.0457, roughly one-fourth of that of the LSTM and GRU models, as shown in Table 5.

Table 5. Performance evaluation of TCN with other deep learning models
Algorithm | MAE     | MRE    | Forecasting accuracy
TCN       | 7.4237  | 0.0457 | 95.52 %
LSTM      | 28.6085 | 0.1901 | 80.46 %
GRU       | 35.0772 | 0.1821 | 82.98 %
SAE       | 34.4004 | 0.1680 | 83.29 %
CNN-LSTM  | 26.5688 | 0.1401 | 84.98 %
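As a final note on the inter-arrival-time indicator discussed above, the derivatives d(∆T)/dt and d²(∆T)/dt² can be approximated numerically; the sampling step and the near-zero threshold in the sketch below are illustrative assumptions, not values from the paper.

```python
# Sketch: numerically approximating d(delta T)/dt and d^2(delta T)/dt^2 for the
# inter-arrival-time signal. Thresholds are illustrative assumptions only.
import numpy as np

def congestion_hint(delta_t: np.ndarray, dt: float = 1.0, eps: float = 1e-3) -> np.ndarray:
    d1 = np.gradient(delta_t, dt)       # first derivative of delta T
    d2 = np.gradient(d1, dt)            # second derivative of delta T
    # near-zero inter-arrival times with a falling trend hint at congestion
    return (delta_t < eps) & (d1 < 0) & (d2 < 0)
```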
We show the output of the various deep learning models, including LSTM, SAE, GRU, and CNN-LSTM, along with the TCN model, in Fig. 7. These plots show the traffic forecasting for the nth day; the blue and red lines show the actual and the predicted traffic flow, respectively.

Figure 6. Observation of inter-arrival time using TCN.

Figure 7 shows that forecasting the average flow in packets/sec with the TCN model yields better results than the other deep learning models. We also measure the time taken to train the model and the prediction time required by the model. Integrating the Taguchi method with the TCN took approximately an hour to train the model while optimizing the configuration of the hyper-parameters. The TCN model takes less than one second to forecast the IoT network traffic flow, given all the past input sequential data. To conclude, the TCN model achieves the highest accuracy and a much smaller forecasting difference between the ground truth and the predicted values than the LSTM and GRU models for IoT network traffic flow prediction. Therefore, the TCN-based approach is effective and promising for predicting real IoT network traffic.

Figure 7. Forecasting of IoT network traffic with different deep learning models.

6. CONCLUSION
This article contributes to improving the existing alternatives and potentialities of congestion detection and prediction techniques in present network monitoring systems, mainly focused on IoT networks, where detection and prediction of congestion are highly desirable. In this paper, in contrast to the state of the art, we use the deep learning-based TCN model; to the best of our knowledge, we are pioneers in applying the TCN to congestion detection and prediction. Furthermore, we incorporate the Taguchi method to enhance the structure of the TCN, which improves traffic flow forecasting. The other possibilities offered by deep learning models to detect and predict congestion in IoT networks are also examined in this article. With real network traffic, we evaluated TCN against LSTM, GRU, SAE, and CNN-LSTM and observed that TCN achieves better forecasting results, demonstrating that TCN can efficiently predict and detect congestion in IoT networks. In the future, efforts will be made to improve the accuracy of the TCN model by focusing on inevitable conditions such as random bursts.

References
1. Thompson, W.L. & Talley, M.F. Deep learning for IoT communications. In 53rd Annual Conference on Information Sciences and Systems (CISS), 2019, pp. 1-4. doi: 10.1109/CISS.2019.8693025
2. Al-Kashoash, Hayder & Ahmed, Abdulmohsin. Congestion control for 6LoWPAN wireless sensor networks: Toward the Internet of Things. University of Leeds, 2017. PhD Thesis. doi: 10.1007/978-3-030-17732-4
3. Wang, M.; Cui, Y.; Wang, X.; Xiao, S. & Jiang, J. Machine learning for networking: Workflow, advances and opportunities. IEEE Network, 2017, 32(2), 92-99. doi: 10.1109/MNET.2017.1700200
4. Seo, J.; Lee, S.; Khan, M.T.R. & Kim, D. A new CoAP congestion control scheme considering strong and weak RTT for IoUT. In Proceedings of the 35th Annual ACM Symposium on Applied Computing, 2020, pp. 2158-2162. doi: 10.1145/3341105.3373981
5. Ibrahim, Amin S.; Khaled, Y. Youssef; Ahmed, H. Eldeeb; Mohamed, Abouelatta & Hesham, Kamel. Adaptive aggregation based IoT traffic patterns for optimizing smart city network performance. Alexandria Eng. J., 2022, 61(12), pp. 9553-9568. doi: 10.1016/j.aej.2022.03.037
6. Ancillotti, E. & Bruno, R. Comparison of CoAP and CoCoA+ congestion control mechanisms for different IoT application scenarios. In 2017 IEEE Symposium on Computers and Communications (ISCC), 2017, pp. 1186-1192. doi: 10.1109/ISCC.2017.8024686
7. Xiao, K.; Mao, S. & Tugnait, J.K. TCP-Drinc: Smart congestion control based on deep reinforcement learning. IEEE Access, 2019, 7, pp. 11892-11904. doi: 10.1109/ACCESS.2019.2892046
8. Jiang, H.; Li, Q.; Jiang, Y.; Shen, G.; Sinnott, R.; Tian, C. & Xu, M. When machine learning meets congestion control: A survey and comparison. Comput. Networks, 2021, 192, 108033. doi: 10.1016/j.comnet.2021.108033
9. Zhou, K.; Wang, W.; Hu, T. & Deng, K. Time series forecasting and classification models based on recurrent with attention mechanism and generative adversarial networks. Sens., 2020, 20(24), 7211. doi: 10.3390/s20247211
10. Sander, Constantin; Jan, Rüth; Oliver, Hohlfeld & Klaus, Wehrle. DeepCCI: Deep learning-based passive congestion
control identification. In Proceedings of the 2019 Workshop on Network Meets AI & ML, 2019, pp. 37-43. doi: 10.1145/3341216.3342211
11. Doshi, R.; Apthorpe, N. & Feamster, N. Machine learning DDoS detection for consumer Internet of Things devices. In 2018 IEEE Security and Privacy Workshops (SPW), 2018, pp. 29-35. doi: 10.1109/SPW.2018.00013
12. Sivanathan, A.; Gharakheili, H.; Loi, F.; Radford, A.; Wijenayake, C.; Vishwanath, A. & Sivaraman, V. Classifying IoT devices in smart environments using network traffic characteristics. IEEE Transactions on Mobile Computing, 2018, 18(8), 1745-1759. doi: 10.1109/TMC.2018.2866249
13. Lopez-Martin, M.; Carro, B.; Sanchez-Esguevillas, A. & Lloret, J. Network traffic classifier with convolutional and recurrent neural networks for Internet of Things. IEEE Access, 2017, 5, pp. 18042-18050. doi: 10.1109/ACCESS.2017.2747560
14. Falahatraftar, Farnoush; Pierre, Samuel & Chamberland, Steven. An intelligent congestion avoidance mechanism based on generalized regression neural network for heterogeneous vehicular networks. IEEE Transactions on Intelligent Vehicles, 2022, 1-13.
15. Demir, A.K. & Abut, F. MLCoCoA: A machine learning-based congestion control for CoAP. Turkish J. of Electrical Eng. & Comput. Sci., 2020, 28(5). doi: 10.3906/elk-2003-17
16. Betzler, A.; Gomez, C.; Demirkol, I. & Paradells, J. CoCoA+: An advanced congestion control mechanism for CoAP. Ad Hoc Networks, 2015, 33, 126-139. doi: 10.1016/j.adhoc.2015.04.007
17. Najm, I.A.; Hamoud, A.K.; Lloret, J. & Bosch, I. Machine learning prediction approach to enhance congestion control in 5G IoT environment. Electron., 2019, 8(6), 607. doi: 10.3390/electronics8060607
18. Wang, L.; Guo, S.; Huang, W. & Qiao, Y. Places205-VGGNet models for scene recognition. arXiv preprint arXiv:1508.01667, 2015. doi: 10.48550/arXiv.1508.01667
19. Gong, S.; Naseer, U. & Benson, T.A. Inspector Gadget: A framework for inferring TCP congestion control algorithms and protocol configurations. In Network Traffic Measurement and Analysis Conference, 2020. ISBN 978-3-903176-27-0, IFIP.
20. Francois-Lavet, V.; Henderson, P.; Islam, R.; Bellemare, M.G. & Pineau, J. An introduction to deep reinforcement learning. arXiv preprint arXiv:1811.12560, 2018. doi: 10.1561/2200000071
21. Liu, J.; Han, Z. & Li, W. Performance analysis of TCP New Reno over satellite DVB-RCS2 random access links. IEEE Transactions on Wireless Communications, 2019, 19(1), 435-446. doi: 10.1109/TWC.2019.2945952
22. Edwan, Talal A.; Iain W. Phillips; Lin, Guan; Jon, Crowcroft; Ashraf, Tahat & Bashar, E.A. Badr. Revisiting legacy high-speed TCP congestion control variants: An optimisation-theoretic analysis of multi-mode TCP. Simulation Modelling Practice and Theory, 2022, 118. doi: 10.1016/j.simpat.2022.102542
23. Zong, Liang; Han, Wang & Gaofeng, Luo. Transmission control over satellite network for marine environmental monitoring system. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(10), 19668-19675. doi: 10.1109/TITS.2022.3145881
24. Chowdhury, T. & Alam, M.J. Performance evaluation of TCP Vegas over TCP Reno and TCP NewReno over TCP Reno. JOIV: Int. J. on Inf. Visualization, 2019, 3(3), 275-282. doi: 10.30630/joiv.3.3.270
25. Chen, P. & Yu, Q. Application of improved TCP-Illinois algorithm in wireless network. Comp. Eng., 2019, 04. doi: 10.1145/1190095.1190166
26. Xu, Z.; Zhang, Y.; Lin, X. & Wu, Y.
Design of smart home router based on OpenWrt and ZigBee. Comp. Eng., 2017, 43(3), 94-98. doi: 10.1109/CCSSE.2018.8724745
27. Yu, X.; Sun, L.; Yan, Y. & Liu, G. A short-term traffic flow prediction method based on spatial-temporal correlation using edge computing. Comp. & Electr. Eng., 2021, 93, 107219. doi: 10.1016/j.compeleceng.2021.107219
28. Lopez-Martin, M.; Carro, B. & Sanchez-Esguevillas, A. IoT type-of-traffic forecasting method based on gradient boosting neural networks. Future Generation Comput. Sys., 2020, 105, 331-345. doi: 10.1016/j.future.2019.12.013
29. Milo, T. & Somech, A. Automating exploratory data analysis via machine learning: An overview. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, 2020, pp. 2617-2622. doi: 10.1145/3318464.3383126
30. Choi, Jungjun & Xiye, Yang. Asymptotic properties of correlation-based principal component analysis. J. of Econometrics, 2022, 229(1), 1-18. doi: 10.1016/j.jeconom.2021.08.003
31. Benesty, J.; Chen, J.; Huang, Y. & Cohen, I. Pearson correlation coefficient. In Noise Reduction in Speech Processing, Springer, Berlin, Heidelberg, 2009, pp. 37-40. doi: 10.1007/978-3-642-00296-0_5
32. Pargent, F.; Pfisterer, F.; Thomas, J. & Bischl, B. Regularized target encoding outperforms traditional methods in supervised machine learning with high cardinality features. arXiv preprint arXiv:2104.00629, 2021. doi: 10.1007/s00180-022-01207-6
33. Lea, C.; Flynn, M.D.; Vidal, R.; Reiter, A. & Hager, G.D. Temporal convolutional networks for action segmentation and detection. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, 2017, 156-165. doi: 10.48550/arXiv.1611.05267
34. Zhao, W.; Gao, Y.; Ji, T.; Wan, X.; Ye, F. & Bai, G. Deep temporal convolutional networks for short-term traffic flow forecasting. IEEE Access, 2019, 7, 114496-114507. doi: 10.1109/ACCESS.2019.2935504

CONTRIBUTORS

Dr Vinesh Kumar Jain completed his PhD in the IoT domain at the Malaviya National Institute of Technology, Jaipur, India, in the Department of Computer Science and Engineering. He has published and communicated many papers in SCI journals. He has more than 18 years of experience in teaching and research, and teaches Automata Theory, Compiler Design, and Networks. He has experience managing smart devices as well as designing and developing IoT networks, and has also worked in the fields of malware analysis and security. He is a member of the consultancy committee for the Automated Driver Testing Systems, RDTC & SIGAWAL, Ajmer, Rajasthan. He designed and developed the proposed approach, along with its implementation, for this study.

Dr Arka Prokash Mazumdar received his PhD degree from the Indian Institute of Technology, Patna. He is an Assistant Professor in the Department of Computer Science and Engineering at the Malaviya National Institute of Technology. He is a co-author of 40+ technical contributions, including papers published in journals and conferences. His research interests focus mainly on the Internet of Things. He helped in creating the Home IoT network for this study.

Dr Mahesh Chandra Govil was a full-time professor with the Department of Computer Science and Engineering, Malaviya National Institute of Technology Jaipur, India. He is currently the Director of NIT Sikkim. His areas of interest include real-time systems, parallel and distributed systems, fault-tolerant systems, object detection, and cloud computing. He provided the technical review and improved the writing of the manuscript.