
Received 17 May 2024, accepted 27 May 2024, date of publication 19 June 2024, date of current version 9 July 2024.

Digital Object Identifier 10.1109/ACCESS.2024.3416863

Congestion Control Prediction Model for 5G Environment Based on Supervised and Unsupervised Machine Learning Approach
MOHAMMED B. M. KAMEL 1,2, IHAB AHMED NAJM 3, AND ALAA KHALAF HAMOUD 4
1 Department of Computer Science, University of Kufa, Najaf 54003, Iraq
2 Department of Computer Algebra, Eötvös Loránd University (ELTE), 1053 Budapest, Hungary
3 Department of Mathematics, University of Tikrit, Tikrit 43000, Iraq
4 Department of Cybersecurity, University of Basrah, Basrah 61004, Iraq

Corresponding author: Mohammed B. M. Kamel ([email protected]; [email protected])

ABSTRACT With the emergence of 5G technology, congestion control has become a vital challenge to be
addressed in order to have efficient communication. There are several congestion control models that have
been proposed to control and predict the possible congestion in 5G technology. However, finding the optimal
congestion control model is an important yet challenging task. In this paper, we examine the supervised and
unsupervised machine learning approaches to the task of predicting the possible node that causes congestion
in the 5G environment. Due to the huge variance in the domains of the data set columns, measuring
the prediction’s consistency was not an easy task. During our study, we tested twenty-six supervised and
seven clustering algorithms. Finally, and based on the performance criteria, we have identified the best five
algorithms out of the studied algorithms.

INDEX TERMS Machine learning, congestion control, 5G, supervised ML, unsupervised ML.

The associate editor coordinating the review of this manuscript and approving it for publication was Bilal Khawaja.

2024 The Authors. This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/

VOLUME 12, 2024

B. M. Mohammed Kamel et al.: Congestion Control Prediction Model

I. INTRODUCTION

Compared to previous network generations, 5G networks have higher speeds, lower latency, and improved coverage. These features, and the superiority of 5G over previous generations, resulted in its widespread adoption [1]. Due to this widespread adoption and the joining of a high number of nodes in the network, many new challenges have been raised, especially in the area of congestion control [2]. The goal of a routing algorithm is to choose the best possible path and avoid any potential congestion; yet, it may result in additional costs during the routing process [3]. As it can result in severe delays and lower throughput, congestion during 5G routing decisions becomes critical.

Several studies have been made for implementing various congestion control approaches in the 5G environment [4], [5]. Among the features of 5G networks that reduce congestion is the ability to dynamically distribute available resources, including frequencies and bandwidth. Network slicing [6] and edge computing [7], which allow traffic-based optimization of 5G networks, may be utilized to achieve this. Implementing Quality of Service (QoS) techniques ensures that critical services remain unaffected by current traffic [8]. Important traffic, like emergency services, is assigned a higher priority, and resources in the 5G network are allocated appropriately. Another congestion control mechanism is traffic offloading, which transfers the data traffic to Wi-Fi [9] or other networks. The offloading is done to decrease the load on 5G networks, thus minimizing congestion.

In addition to the discussed approaches, applying machine learning (ML) algorithms has shown positive results in controlling network congestion [10]. While unsupervised ML algorithms are trained with unlabeled data, supervised ML algorithms are trained with labeled data [11]. In order to control congestion, both supervised and unsupervised algorithms are trained to identify possible congestion nodes as well as the optimal congestion control window. Classification is an essential part of supervised ML, where data items are grouped
into classes based on the class label information. On the other hand, clustering is an essential part of unsupervised ML, in which similar data items are categorized into clusters without the information of class labels. The adoption of supervised and unsupervised algorithms is dependent on several factors including data type and size, complexity, and accuracy.

Our paper highlights the importance of adopting and utilizing machine learning algorithms in the process of congestion control in the 5G environment and identifies the top algorithms in the process of congestion control prediction. The task of finding the optimal algorithm to be adopted for congestion control is challenging. We aimed to find the optimal algorithm that predicts the optimal node causing congestion during the congestion control process in 5G networks. In our study, we tested twenty-six supervised and seven unsupervised algorithms. Unsupervised machine learning algorithms have been used for classification. The approach of classification via clustering is utilized to improve the accuracy of congestion control prediction by clustering data to identify distinct groups of data, which can be used to enhance the classification process. Cronbach’s alpha has been used to measure the consistency, unidimensionality, or homogeneity of datasets. During the evaluation, the studied algorithms were evaluated based on performance criteria, including True Positive (TP) and False Positive (FP) rates, precision, recall, Receiver Operating Characteristic (ROC), and Area Under Curve (AUC). Fig. 1 shows the main steps of the congestion control prediction model.

The rest of this paper is structured as follows. In Section II, we study the related works in the field of congestion control. Section III discusses machine learning and congestion control in detail. The model setting is stated in Section IV. Our findings are explained and analyzed in Section V. We point out our observations in Section VI. Finally, Section VII includes our conclusions.

II. LITERATURE REVIEW

Many studies have handled the congestion control approach. Sangeetha et al. [12] proposed a model based on data loss and energy reduction since congestion appears in all WSNs. The sensor nodes’ topology is adjusted regularly based on node degree and time interval to enhance the node’s power consumption and interference and to provide a better and more effective energy congestion-aware technique for routing in WSN, which is called survival path routing (SPR). This protocol is used by IoT applications in high-traffic networks where all nodes try to send their packets simultaneously to destination nodes [13]. A new algorithm for congestion control for WSNs is developed by Singh et al. [14], where a simplified Poisson process is used and the optimal rate is obtained by retransmitting with congestion control, while the old algorithm had a high complexity and high power usage. Subsequently, many studies evaluated the performance of congestion control mechanisms over the 5G network [15] in terms of resource allocation [16], network selection, network scalability [17], and distributed telemetry [18]. To reduce network congestion, enhance the lifetime of the network and individual nodes, and reduce network divisions, Shelke et al. [19] proposed a routing algorithm that selects the best route by combining appropriate sleep scheduling mechanisms based on the opportunistic theory. Godoy et al. [20] analyzed and investigated the communication channel congestion in the environment based on configuration parameters of nodes such as the generation rate of the data packet, intervals of transmission time, and power level of transmitter output. Najm et al. [21] proposed a multi-criteria decision-making mechanism to improve congestion control in 4G networks. Braham et al. [22] proposed an efficient and fair distributed algorithm for congestion control in tree-based communication WSNs to assign transmission rates for each node. The study lacked a performance comparison with previous traditional algorithms to see if it was optimal or not.

Although the next scenario was poor and simple, applying machine learning algorithms, especially supervised ML, to improve congestion control in wireless or wired networks is considered a vital approach. Machine learning algorithms can be adopted in many fields [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33] to predict the required knowledge. Geurts et al. [34] proposed a model based on an automatic loss classifier based on a simulated database of random topologies of networks. Jagannathan and Almeroth [35] proposed a model called TopoSense for multi-cast congestion control. Many enhancements were required, such as the poor calculation of link capacity and the need for calculating interval size. Moreover, there was a need to minimize control traffic and burst traffic.

Following the trend, machine learning capabilities have been utilized with congestion control algorithms in 5G environments. Several attempts have been presented, for instance, in an open radio access network, a fast increase in data based on artificial intelligence, and an adaptive routing control approach to obtain effective congestion avoidance [36], [37], [38]. A controller is proposed by Sunny et al. [39] to ensure the efficient and fair work of WLAN that has multi-cochannel access and improvement of long-lived multi-TCP AP transfer. Next, many researchers adopted DT in their studies of network applications.

Katuwal et al. [40] proposed a model to solve the problem of multi-class classification based on the multi-classifier system. An efficient NN with oblique random forest DT is used to build the model. The model proved its efficiency based on the evaluation of 65 multi-class datasets compared with the evaluation of large or medium datasets. Gomez et al. [41] compared many ensemble algorithms of DT and proposed a new classifier based on its performance. The computed capacity for devices of a small network is not a limitation. A new model is proposed by Leng et al. [42] to solve the problem of congestion control flow table in a software-defined network (SDN) based on C4.5 DT. The flow entries are compared based on C4.5 DT to reduce the time and matching cost. Using the DT approach with an SDN flow
table was the first online machine learning model. Next, the clustering machine learning approach is used for localization and AP reselection. Liang et al. [43] proposed a model for WLAN that adopts a clustering algorithm and AP reflection. A review of machine-to-machine communication technology was conducted by Hasan et al. [44] to list all challenges and solutions for diverse standards of developing organizations. Liu and Wu [45] utilized the random forest algorithm for congestion control prediction, where some variables were utilized to build the model, such as type of day, road quality, time period, and weather conditions.

FIGURE 1. Main steps in ML based congestion prediction.

Park et al. [46] proposed an approach utilizing Bayesian neural network and DT to predict the occurrence of incidents. Since the aim of the model is to reduce the potential incidents or any events that may cause these incidents, the model could not be implemented in real systems since it needs realistic parameters for training and building a dataset. An improved route based on the support vector machine (SVM) with DT is used to estimate the link quality over WSN, where Shu et al. [47] used two estimation parameters: link quality and the strength of the received signal. SVM is used in the model due to its ability to handle binary classifications.

For network infrastructure and data centers, DT was adopted as an energy-saving solution [48]. Soltani and Mutka [49] proposed an approach utilizing DT for best path selection in the cognitive radio networks. In this model, the nodes can find better nodes to send data to after analyzing the tree and removing the choices that reduce node gain. DT is utilized to interpret the routing path of cognitive video over a dynamic radio network. The optimal path from the leaf node to the root is determined based on background induction to construct and receive the transmitted video. The DT algorithm is also used by Stimpfling et al. [50] to build a model to enhance data structure size and memory access. DT is considered a strength since it reduces the searching time. Moreover, DT is used by Singh et al. [51] to build a model for vehicular traffic noise prediction. Four machine learning algorithms are used for model implementation: DT, ANN, generalized linear model, and random forest. The random forest approach was found to be a better algorithm for prediction compared with other algorithms. Xia et al. [52] used DT with a proposed delegation schema (CP-ABE) to enhance the efficiency of decryption for VANETs. The purpose of DT is to improve the factors used for estimating vehicle decryption overhead.

Based on the results, DT was a better option than K-nearest neighbors and SVM for prediction because of its higher precision and accuracy. Many researchers have highlighted network protection by utilizing the DT notion. For example, researchers in [53], [54] presented an unknown detection threat approach in the network via recognition threat features. Following the trends, Mohamed et al. [55] developed a flexible scheme for reducing the quantity of data transmitted across the smart grid, but the intended scheme missed mentioning the outcome of paradigm updates. Pham and Yeo [56] presented an adaptive and protected scheme for cars to control both confidentiality and trust in the utilized recognition scheme. Next, Fadlullah et al. [57] highlighted and explored the survey requirements of propagation techniques related to deep learning utilizations concerning numerous traffic network control characteristics. The leading edge of peak network communications, which are compromised by algorithms and architectures in deep learning, also encourages the motivation to facilitate deep learning to compromise the network’s challenges. Nevertheless, their viewpoints did not include the 5G environment.

Furthermore, Kong, Zang, and Ma [58] developed dual machine-learning approaches to address TCP congestion control issues in under-buffered connections over the wired environment. A supportive and adaptive loss prediction was assigned to obtain a superior tradeoff delay. Taherkhani and Pierre [59] used the K-means algorithm to control congestion in VANET networks. It contains three sections for directing, detecting data congestion, and clustering communications. Next, the issue of prediction traffic status was settled by Chen et al. [60] by permitting DT and SVM to depend on enabling online data; however, the set value of both services was overridden.

Tariq et al. [61] presented a detection of botnet attacks by using the machine learning technique, regardless of the explanation of the carried packet. However, comprehensive calculations missed the stats plan. Wu et al. [62] implemented a developable machine learning method to predict or expose the limps of online video via feature extraction of monitored data in the network. The method defines characteristic features depending on diverse scale windows. The criterion
of used information gain, however, demands audio pieces to be overridden, besides ignoring the cache amid the sending nodes and the video operator, constant section length, and the limited interaction of the user. Quality of experience prediction is based on consultative factors, namely the Video Quality Model (VQM) and Structural Similarity Index (SSIM) parameters. Nevertheless, a justification for why k = 9 assigned in a random forest was superior is shown by Abar et al. [63].

III. BACKGROUND

In this section, machine learning and congestion control are discussed in detail.

A. MACHINE LEARNING

Machine learning is a very fast-growing field that can be classified into supervised, unsupervised, semi-supervised, and active ML. Supervised ML refers to classification, where the labeled examples of the training data set determine the learning. Unsupervised ML is a synonym for clustering, where the input classes in the training data set are not labeled. Unsupervised ML is essentially used to discover hidden patterns within data sets. Semi-supervised ML refers to using both labeled and unlabeled examples during model learning: labeled examples for learning the model and unlabeled examples for refining class boundaries. Active learning permits users to play an active role in the machine learning process [64]. The basic element in the machine learning model is the dataset. The size of the dataset, the features’ domain, the number of features, and the data type of all features are essential elements in the machine learning model. The size of the data set affects the overall accuracy: with a larger training data set, the learning model has more data to learn from. Many steps can improve the accuracy of the classification algorithm, such as data cleaning, adding missing values, increasing the data set, feature selection and transformation, bagging, and boosting.

1) SUPERVISED MACHINE LEARNING

There are many algorithms classified as supervised ML. Classification and Regression Trees (CART) were introduced in 1984 by Breiman based on splitting the explicative variables’ space into multidimensional rectangle form and a local predictor with each one of them. The tree structure development approach came from the recursive partitioning of the data set into two homogeneous data sets, which led to building a branching structure [65]. The regression model captures how one or more variables vary across more attribute domains, which can be used to predict the target variable [66].

The DT approach is a frequent approach that comes from the tree-based approach to data classification. The reliability, low cost, and ease of implementation are the reasons behind adopting such an approach [67]. The basic structure of the decision tree starts with one node and branches to many other nodes, forming the tree graph. DT algorithms (ID3, CART, J48, RepTree, Decision Stump, Hoeffding, and Random Forest) are used in many fields, such as market analysis, educational data mining, and estimating risks [68], [69], [70]. When the DT is used for attribute selection, the data within the data set is examined, and only the relevant data are chosen. Only the selected attributes appear in the graph after excluding the irrelevant data [64].

A Naïve Bayesian classifier (belief network) is a graphical model that represents the random variables set as knowledge, where each node represents the corresponding random variable and the edges represent the conditional dependency between variables. These conditional dependencies represent computational methods and statistical probabilistic theories. The Naïve Bayes algorithm is a simple algorithm built based on Bayesian theory, where it works basically on the conditional probability [67].

An Artificial Neural Network (ANN) is a model implemented like a network of human brain neurons. The ANN and biological brain are similar in two key respects: the connections between the neurons that determine the network function and the building blocks of the computational devices [71]. The multilayer perceptron algorithm is one of the ANN algorithms that work on training datasets by gathering information by minimizing the error and applying that information to the new dataset [67]. ANN is used for prediction, pattern recognition, optimization, and control, and many researchers have used ANN to solve many problems in many disciplines. ANN is used for many reasons, such as low energy consumption, adaptivity, learning ability, distributed computation and representation, massive parallelism, and fault tolerance [72]. Support Vector Machine (SVM) is an ML algorithm that learns from training data sets and assigns labels to dataset objects. SVM can be used in many disciplines, such as fraud detection, anomaly detection, image recognition, gene classification, and educational data mining [73].

2) UNSUPERVISED MACHINE LEARNING

The synonymous term for unsupervised machine learning is clustering, where the information of the class label is not presented. The clustering approach is defined as grouping similar data items into clusters. The clustering algorithms are provided with data items with no labels, and the task of these algorithms is to represent the data distribution suitably. The learning approach in unsupervised machine learning is based on observation, while the supervised machine learning approach is based on learning by examples. For large databases, efforts have been focused on exploring the most effective method for efficient cluster analysis. Clustering is used in exploring the different data types with different types of sizes, such as complex shapes, graph clustering, image clustering, and object clustering with a huge number of features [64], [74].

The process of cluster analysis is based on partitioning the similar objects in the data sets into subsets, where each subset represents a cluster. Based on the partitioning method, different clustering algorithms may result in different clusters
for the same data set. This partitioning or data segmentation may lead to the discovery of unknown groups of clusters that can be noticed by humans. Clustering can be used for outlier detection. The outliers are the values far away from any cluster. Many approaches of clustering can be used to build clusters based on the method of selecting a group of objects.

Hierarchical methods based on a non-parametric clustering approach produce a dendrogram (a tree of clusters). These algorithms measure the dissimilarities among cluster sets for each iteration. The hierarchical methods can be classified based on the form of the hierarchical decomposition into divisive and agglomerative. The agglomerative or bottom-up approach forms the topmost group by grouping close objects. The other approach (divisive) or top-down approach starts with all objects that belong to the same cluster. The methods of hierarchical clustering can be continuity-based or distance-based. However, when the split or merge step is performed in hierarchical methods, it cannot be undone [75].

Partitioning methods divide each dataset into several groups, where each group contains at least one object. In these methods, the object must belong to exactly one group. Fuzzy partitioning is an example of these methods. Prohibitive computations and exhaustive enumeration are required to achieve optimal clustering in partitioning methods. For that reason, greedy methods, k-means, and k-medoids may be adopted to overcome this obstacle and lead to building optimal clusters. The partitioning methods work better with small to medium-sized data sets, where cluster building is based on finding clusters with a spherical shape [76].

Density-based methods differ from the other clustering methods by building clusters based on density, where the cluster is built as long as the objects are in the same neighborhood (or the same density). The density-based methods divide the objects into a hierarchy or multiple exclusive clusters. These methods are the optimal choice to find the outlier and noise. They are also optimal choices for discovering the arbitrary shape of clusters [77].

Grid-based methods form the grid structure of the cluster by quantizing the space of the objects into a limited number of cells. All the operations that can be performed on the clusters are performed on this grid structure. Fast processing is the main advantage of this approach. The short processing time results from the number of cells in the processed dimension. The grid-based approach is optimal for spatial datasets and can be used with other clustering approaches such as hierarchical and density-based methods [78].

B. CONGESTION CONTROL MECHANISM

Congestion control mechanisms are crucial to the transport layer protocols. The transport protocol can perform various functions, including message transmission, error detection, and message retrieval, by engaging throughout this layer [15], [79], [80]. Functionality is matched to network utilization. The number of terminals needed for these functions is determined by analyzing the terminal-area-used units. Data transit requires a transmitter-receiver link, where the transmitter connects to a certain endpoint. Congestion control mechanisms are divided into a slow-start algorithm and a congestion avoidance algorithm. The congestion control mechanism sends the initial message and awaits acknowledgment to monitor the congestion window and slow start threshold. The recipient sends an acknowledgment to the transmitter, identifying the congestion window and slow start threshold. As a result, congestion is controlled. The misplaced phase is retrieved if the recipient does not conduct acknowledgment. If the congestion window indicator is less than or equal to the slow start threshold, the slow start phase begins. Further related parametric settings and information can be found in [10].

IV. MODEL SETTING

The proposed congestion prediction model is illustrated in Figure 2. The mmWave ns-3 module [81] and protocols were utilized to test network protocols, including TCP and SCTP, in the 5G environment [82]. This module for mmWave 5G cellular network simulation has many characteristics, including providing the ability to study the cwnd [83]. It supports multiple channel models, including 3GPP TR 38.901 for 0.5–100 GHz. It also provides adaptable PHY and MAC classes that support 3GPP NR frame structure and adaptable schedulers for dynamic TDD formats. Among its main features is the possibility of improving the RLC layer with packet re-segmentation. The model supports quick secondary cell handover, channel tracking, and dual LTE base station connectivity.

The utilized dataset in the proposed model, as shown in Table 1, is a 100-record dataset with five columns: sequence, congestion window size, throughput, queue size, and packet loss. This dataset is divided into two sets: an 80-record set for training and a 20-record set for testing the model. More enriched configurations are available in [10].

TABLE 1. Dataset structure.

After the data visualization, the results show that there is no dirty, missing, noisy, or inconsistent data that needs to be handled or cleaned. Based on that, the only step in the data preprocessing is deriving a new column from the existing columns. The derived column will be utilized as the goal column for the supervised ML. The associated parameters used to derive the goal column are utilized to determine the optimal node for prediction. A simple mechanism is followed in determining the optimal node, where the optimal node is the node with high throughput, congestion window size, and queue size with the lowest packet loss.
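The derivation of the goal column and the 80/20 division can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: "high" and "lowest" are read as above or below the column mean (the comparison the paper describes for this mechanism), the column keys, the function names, and the synthetic values are hypothetical, and the split simply takes the first 80 records for training because the paper does not state how records were assigned.

```python
import random
from statistics import mean

# Hypothetical column keys; the paper names the columns only in prose.
FEATURES = ("cwnd", "throughput", "queue_size", "packet_loss")


def label_records(records):
    """Return copies of the records with a derived 'goal' column.

    A record is labeled 'O' (optimal) when its congestion window,
    throughput, and queue size are above the column means and its
    packet loss is below the column mean; otherwise it is 'N'.
    """
    means = {name: mean(r[name] for r in records) for name in FEATURES}
    labeled = []
    for r in records:
        optimal = (
            r["cwnd"] > means["cwnd"]
            and r["throughput"] > means["throughput"]
            and r["queue_size"] > means["queue_size"]
            and r["packet_loss"] < means["packet_loss"]
        )
        labeled.append({**r, "goal": "O" if optimal else "N"})
    return labeled


def split_train_test(records, n_train=80):
    """Split into the first n_train records and the rest (an assumption)."""
    return records[:n_train], records[n_train:]


def make_synthetic_records(n=100, seed=0):
    """Generate placeholder records; these values are NOT the paper's data."""
    rng = random.Random(seed)
    return [
        {
            "sequence": i,
            "cwnd": rng.uniform(1e3, 1e5),
            "throughput": rng.uniform(1e5, 1e6),
            "queue_size": rng.uniform(0, 1e3),
            "packet_loss": rng.uniform(0, 50),
        }
        for i in range(1, n + 1)
    ]


if __name__ == "__main__":
    data = label_records(make_synthetic_records())
    train, test = split_train_test(data)
    optimal_count = sum(r["goal"] == "O" for r in data)
    print(len(train), len(test), optimal_count)
```

Because all four conditions must hold at once, a rule of this kind tends to label only a minority of records as optimal, which is in line with the paper's remark that optimal nodes are fewer than non-optimal ones.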


FIGURE 2. Congestion control prediction model.

The mechanism followed to determine the optimal and non-optimal nodes compares each record against the column means: a node is optimal if its congestion window size is greater than the mean congestion window size, its throughput is greater than the mean throughput, its queue size is greater than the mean queue size, and its packet loss is less than the mean packet loss. Based on this mechanism, the optimal (O) nodes in the dataset are fewer than the non-optimal (N) nodes. The labeled nodes (O and N) have been utilized in implementing supervised ML in order to find the optimal algorithm for prediction and classification.

V. CONSISTENCY MEASUREMENT

Cronbach’s alpha measures the internal consistency, unidimensionality, or homogeneity of the dataset. The measure is based on the variances of the items in the dataset, and its value falls between 0 and 1. Cronbach’s alpha measures the inter-relatedness of the items in the same attribute domain of the dataset [84], [85]. Cronbach’s alpha can be calculated as follows:

\[ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} S_i^2}{S_T^2}\right) \]

where k represents the number of items, S_i^2 represents the variance of the ith item, and S_T^2 is the variance of the total scores over all items. The results of implementing Cronbach’s alpha formula on the dataset are listed in Table 2.

As shown in Table 2, the value of Cronbach’s alpha is poor, which reflects the lack of consistency in the dataset. The reasons behind this are the variances in each column item, which increase both the sum of all variances (3.67503E+11) and the variance of total scores (3.69033E+11). These two high and nearly equal values make the ratio in the formula close to 1, so the Cronbach’s alpha score is very weak. This is due to huge variances in the same domain of each attribute, specifically in queue size, congestion window, packet loss, and throughput.

Table 3 lists the performance criteria for implementing the supervised machine learning algorithms. The performance criteria are True Positive (TP) rate, False Positive (FP) rate, Precision, and Recall. The TP rate represents the positive instances that are classified correctly, and the FP rate represents the negative instances that are incorrectly classified as positive for a given class. Precision is the ratio of correctly predicted positive instances to all instances predicted as positive, while recall represents sensitivity.

\[ \text{TP Rate} = \frac{TP}{TP + FN} \times 100 \]

where TP represents True Positive values and FN represents False Negative values.

\[ \text{FP Rate} = \frac{FP}{FP + TN} \times 100 \]

where FP represents False Positive values and TN represents True Negative values.

\[ \text{Precision} = \frac{TP}{TP + FP} \times 100 \]

where FP represents False Positive values.

\[ \text{Recall} = \frac{TP}{TP + FN} \times 100 \]


TABLE 2. Cronbach’s alpha result.

TABLE 3. Performance criteria of supervised algorithms.

Receiver Operating Characteristic (ROC) and Area Under KStar, and Locally Weighted Learning (LWL) have been
Curve (AUC) are important criteria used for evaluating and examined. Based on the performance results, the top five
measuring the performance of machine learning algorithms. supervised machine learning algorithms have been selected
ROC and AUC are suitable for visualizing the performance of to visualize the best one among them. The comparison is
different classifiers in supervised and unsupervised learning implemented based on TP, FP, Precision, and Recall as a first
fields. step. The second step is to visualize the ROC and AUC of the
ROC takes a value between 0 and 1 that reflects the selected top five algorithms.
classifier’s accuracy. The closer the ROC value is to Fig. 3 lists the performance criteria of the top five
1, the more accurate the model becomes [56], [86]. The supervised algorithms (LMT, BayesNet, MultilayerPerceptron,
AUC takes a value from 0 to 1, where 1 indicates the SimpleLogistic, and IBK) as a chart. The LMT algorithm
classifier performance as perfectly accurate and 0 indicates came in first with (96.7%) followed by the remaining four
the classifier performance as perfectly inaccurate. A value algorithms with (95.7%) in predicting the TP values. LMT
falling between 0.7 and 0.8 is considered acceptable; an also came in as the top algorithm in low predicting rate
excellent value is a value between 0.8 and 0.9; and a value of FP values with (13.4%) followed by the remaining four
above 0.9 is considered an outstanding value. On the other algorithms with (13.6%). The LMT algorithm came in first
hand, the value that lies under 0.5 indicates that the classifier place in precision with (96.7%) followed by the remaining
performance is weak [71]. algorithms with (95.7%) and also in first place in recall with
Table 3 lists a comparison among different categories of (96.7%) followed by the remaining algorithms with (95.7%).
supervised learning, such as DT, where different algorithms The value of ROC is considered outstanding if it exceeds
such as Decision Stump, Hoeffding Tree, J48, LMT, Ran- the value of 0.9 [87], [88]. Based on that, all five algorithms
domForest, RandomTree, and RepTree are examined. The are outstanding at predicting instances. LMT came in first in
field of Bayes Net (BN) is also examined, and different ROC value with (0.988), followed by the BaysNet algorithm
algorithms are used, such as BayesNet, NaiveBayes, and with (0.987), MultilayerPreceprton and SimpleLogistic algo-
NaiveBayesUpdateable. In the regression category, Logistic, rithms with (0.981), and the IBK algorithm with (0.936).
Stochastic Gradient Descent (SGD), and Sequential Minimal The ROC of BaysNet with a value of (0.9873) came in first
Optimization (SMO) have been examined. Additionally, place, followed by the LMT algorithm with (0.9833), the
other algorithms and classifier categories, such as Neural MultilayerPreceptron and SimpleLogistic algorithms with
Network based algorithm (MultilayerPrecetron), K-Mean, (0.9815), and the IBK algorithm with (0.9357). The ROC


FIGURE 3. Performance of supervised machine learning algorithms.

FIGURE 4. ROC of supervised machine learning algorithms.
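As a rough illustration of how an ROC curve and its AUC are derived from classifier scores, the sketch below sweeps a decision threshold over the scores and integrates the resulting curve with the trapezoidal rule. The labels, scores, and function names are hypothetical, and this is not the tooling used to produce the figures in this paper.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) lists obtained by lowering the threshold step by step.

    labels are 1 (positive) or 0 (negative); a higher score means more positive.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    previous = None
    for i in order:
        # Emit a curve point only when the score changes, so ties share a point.
        if previous is not None and scores[i] != previous:
            fpr.append(fp / negatives)
            tpr.append(tp / positives)
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        previous = scores[i]
    fpr.append(fp / negatives)
    tpr.append(tp / positives)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[i + 1] - fpr[i]) * (tpr[i + 1] + tpr[i]) / 2.0
               for i in range(len(fpr) - 1))


# Toy example: six instances with one negative ranked above a positive.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_fpr, toy_tpr = roc_points(toy_labels, toy_scores)
toy_auc = auc(toy_fpr, toy_tpr)
print(round(toy_auc, 4))
```

In this toy case eight of the nine positive-negative pairs are ranked correctly, so the area is 8/9, which would fall in the "excellent" band described in the text.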

and AUC are shown in detail in Fig. 4 and Fig. 5, respectively.

Table 4 lists the performance criteria of the unsupervised machine learning algorithms, specifically in the clustering approach. The EM algorithm is the most accurate in predicting the TP values with 98.5%, followed by the Farthest First, Filtered Clusterer, and Simple K-Mean algorithms with 92.4%, the MakeDensityBased algorithm with 91.3%, the Hierarchical Clusterer algorithm with 85.7%, and the Canopy algorithm with 83.7%. Based on FP rate, EM came first with the lowest rate of 0.2%, followed by the Simple K-Mean and Filtered Clusterer algorithms with 1.3%, the MakeDensityBased algorithm with 1.4%, the Farthest First algorithm with 7.7%, the Canopy algorithm with 54.1%, and the Hierarchical Clusterer algorithm with 87%. Based on the Precision criterion, EM came first with 98.6%, followed by the Simple K-Mean and Filtered Clusterer algorithms with 95.1%, the MakeDensityBased algorithm with 94.6%, the Farthest First algorithm with 94.1%, the Canopy algorithm with 83.2%, and the Hierarchical Clusterer algorithm with 75.2%. The ROC and AUC are shown in detail in Fig. 7 and Fig. 8, respectively.

Fig. 6 presents the performance criteria of the top three unsupervised machine learning algorithms, namely EM, Filtered Clusterer, and Simple K-Mean. Fig. 6 shows that the EM algorithm is the best algorithm for clustering with 98.5% in predicting the TP values, followed by both the Simple K-Mean and Filtered Clusterer algorithms with 92.4%. EM also came first for the lowest FP rate, as well as in precision and recall.

As discussed earlier, the values of ROC and AUC reflect the accuracy of predicting the TP rate and FP rate. As shown in Fig. 7 and Fig. 8, the ROC and AUC of both the Simple K-Mean and Filtered Clusterer algorithms are considered outstanding since they exceed 0.9, with 0.956 for ROC and 0.9557 for AUC, followed by the EM algorithm


FIGURE 5. AUC of supervised machine learning algorithms.

TABLE 4. Performance criteria of unsupervised algorithms.

FIGURE 6. Performance of unsupervised machine learning algorithms.

with 0.852 for ROC and 0.611 for AUC. Since EM is considered to be the optimal algorithm for clustering based on the performance criteria discussed earlier, the AUC and ROC also support this conclusion.

VI. MODEL PERFORMANCE OBSERVATION
The basic concept of attribute selection in DT relies on the measure adopted for the selection method. DT adopts information gain (IG) as the attribute selection measure, where the attribute with the highest IG value is chosen as the splitting node. IG is the average amount of information used to classify the instances into a class label and is calculated using different information entropy equations. The attribute type, data set characteristics, size, and dimensionality affect the accuracy of the DT algorithms [89], [90], [91]. The data type of the predicted class also makes the prediction accuracy of the DT classifier high or low.

The LMT algorithm combines two classification approaches: tree induction and logistic regression. It uses logistic regression models at the leaves of the tree produced. As a result, the accuracy on a small dataset with a low number of attributes and no missing values will be very high. Moreover, the concept of building a network for classification or feature selection in the Bayesian approach


FIGURE 7. ROC of unsupervised machine learning algorithms.

FIGURE 8. AUC of unsupervised machine learning algorithms.

relies on adopting probability to find the correlation among features. For continuous features, the numeric attribute values are treated as following a distribution, which is then represented by its mean and standard deviation. The probability can then easily be calculated to find the correlation among attributes [92], [93]. Other algorithms, such as KNN and KStar, adopt probability and entropy approaches to measure the distance among attributes based on particular algorithms, such as IB1, IB2, and IB3, or even DT algorithms. Hence, the overall performance of the supervised algorithms is restricted by the type of the final class (nominal or binary), the type of the attributes, and whether there are missing values in the attributes, in addition to the previously mentioned characteristics of the dataset.

Regarding unsupervised learning, in addition to the metrics or effects that increase the accuracy of the outputs, the selection of the algorithm is the most important challenge in obtaining the best accuracy. A good clustering algorithm should have many properties, such as performing well with massive data, analyzing single and mixed attribute types, and dealing with deviations (outliers) to enhance the quality of the clusters. In addition, the results of the algorithm must be usable, interpretable, and easy to understand. Another feature of a good algorithm is its ability to operate with the fewest possible input parameters to avoid bias in the result, especially with higher dimensionalities and considerable data. Another feature to be considered is sensitivity to the order of the inputs: the same data set can yield different results when presented to the algorithm in different orders. Finally, the selection of the optimal algorithm also depends on the kind of data set and the objective of the analysis [78], [94].

Because a clustering algorithm aims to be general, an important issue when selecting one is whether the cluster shapes it favors correspond to the clusters actually present in the data. Clustering algorithms are biased toward particular shapes and structures of clusters, while it is not easy to determine the corresponding biased shape. The structure of the clusters may not be determined, especially with datasets that hold categorical data types. The number of dimensions (attributes) present in most datasets is huge. The majority of the existing clustering algorithms are unable to manage more than a small number of dimensions, about eight to ten. Hence, the clustering of high-dimensional datasets is a challenge. An example of such a high-dimensional dataset is the US census dataset. The presence of a huge number of attributes gives rise to the curse of dimensionality. This is associated with the following: (a) an increase in the number of attributes results in an increase in the resources needed to represent them; (b) for many distance and distribution functions, the distances from a given point to its nearest and furthest neighbors are almost the same. Both factors increase the time needed to process the data and therefore significantly affect the efficiency of a clustering algorithm.
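Point (b) can be illustrated with a small numerical experiment. The sketch below draws uniform random points and measures the relative contrast between the furthest and nearest neighbor of a query point in a low and a high number of dimensions. The contrast measure, the uniform data, the sample sizes, and all names are assumptions made for this illustration and are not part of the study.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(furthest - nearest) / nearest distance from a random query point."""
    rng = random.Random(seed)            # fixed seed keeps the run repeatable
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))   # Euclidean distance
    nearest, furthest = min(distances), max(distances)
    return (furthest - nearest) / nearest


contrast_low = relative_contrast(dim=2)
contrast_high = relative_contrast(dim=500)
print(f"2 dimensions:   {contrast_low:.2f}")
print(f"500 dimensions: {contrast_high:.2f}")
```

The contrast shrinks sharply as the number of dimensions grows, so a distance-based clustering algorithm has less and less information with which to separate near points from far ones.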


Consequently, the resulting clusters will have very poor quality [94].

VII. CONCLUSION
Many models and mechanisms have been proposed to overcome congestion control problems and enhance overall network performance. This study addressed the problem of congestion control in the 5G environment by examining supervised and unsupervised ML algorithms to find the optimal algorithm for predicting the optimal node.

In the field of supervised ML, twenty-six algorithms were tested: seven DT algorithms, three BN and three lazy algorithms, five rules algorithms, and eight other algorithms. In the field of unsupervised ML, seven clustering algorithms were examined. Cronbach’s alpha results showed that it is impossible to measure the consistency due to the huge variance in the domains of the data set columns. This variance makes prediction based on the changing data difficult. Many conditions determine optimal congestion window management, such as low packet loss, high congestion window, high queue size, and high throughput.

Since it is difficult to present the performance of all supervised algorithms in charts, only the top five supervised algorithms were discussed based on their performance criteria, namely LMT, BayesNet, MultilayerPerceptron, SimpleLogistic, and IBK. The LMT algorithm came first in predicting the TP values with 96.7%, followed by the remaining four algorithms with 95.7%. LMT also achieved the lowest FP rate with 13.4%, followed by the remaining four algorithms with 13.6%. The LMT algorithm came first in precision with 96.7%, followed by the remaining algorithms with 95.7%, and also first in recall with 96.7%, followed by the remaining algorithms with 95.7%. Because the TP rates and FP rates are so close, the ROC and AUC were measured for all algorithms to find the optimal one. LMT came first in ROC value with 0.988, followed by the BayesNet algorithm with 0.987, the MultilayerPerceptron and SimpleLogistic algorithms with 0.981, and the IBK algorithm with 0.936. In terms of AUC, BayesNet came first with 0.9873, followed by the LMT algorithm with 0.9833, the MultilayerPerceptron and SimpleLogistic algorithms with 0.9815, and the IBK algorithm with 0.9357.

In unsupervised ML, the performance criteria of the top three algorithms, namely EM, Filtered Clusterer, and Simple K-Mean, were measured. The EM algorithm was the best algorithm for clustering with 98.5% in predicting the TP values, followed by both the Simple K-Mean and Filtered Clusterer algorithms with 92.4%. EM also came first for the lowest FP rate and first in precision and recall. The values of ROC and AUC reflect the accuracy of predicting the TP rate and FP rate. The ROC and AUC of both the Simple K-Mean and Filtered Clusterer algorithms were outstanding, with 0.956 for ROC and 0.9557 for AUC, followed by the EM algorithm with 0.852 for ROC and 0.611 for AUC.

Future research directions can include the implementation of the optimal supervised and unsupervised ML algorithms in a real-world environment with stream data. Stacking can be examined to combine the best algorithms in the proposed model in order to predict the optimal node with higher accuracy and lower prediction time and resources.

CONFLICT OF INTEREST
The authors declare that they have no conflict of interest.

REFERENCES
[1] R. Dangi, P. Lalwani, G. Choudhary, I. You, and G. Pau, "Study and investigation on 5G technology: A systematic review," Sensors, vol. 22, no. 1, p. 26, Dec. 2021.
[2] N. Al-Falahy and O. Y. Alani, "Technologies for 5G networks: Challenges and opportunities," IT Prof., vol. 19, no. 1, pp. 12–20, Jan. 2017.
[3] S. Malathy, P. Jayarajan, M. H. D. N. Hindia, V. Tilwari, K. Dimyati, K. A. Noordin, and I. S. Amiri, "Routing constraints in the device-to-device communication for beyond IoT 5G networks: A review," Wireless Netw., vol. 27, no. 5, pp. 3207–3231, Jul. 2021.
[4] J. Lorincz, Z. Klarin, and J. Ožegović, "A comprehensive overview of TCP congestion control in 5G networks: Research challenges and future perspectives," Sensors, vol. 21, no. 13, p. 4510, Jun. 2021.
[5] B. Hindawi and A. S. Abbas, "Congestion control techniques in 5G mm wave networks: A review," in Proc. 1st Babylon Int. Conf. Inf. Technol. Sci. (BICITS), Apr. 2021, pp. 305–310.
[6] G. Dandachi, A. De Domenico, D. T. Hoang, and D. Niyato, "An artificial intelligence framework for slice deployment and orchestration in 5G networks," IEEE Trans. Cogn. Commun. Netw., vol. 6, no. 2, pp. 858–871, Jun. 2020.
[7] S. Douch, M. R. Abid, K. Zine-Dine, D. Bouzidi, and D. Benhaddou, "Edge computing technology enablers: A systematic lecture study," IEEE Access, vol. 10, pp. 69264–69302, 2022.
[8] Y. B. Zikria, S. W. Kim, M. K. Afzal, H. Wang, and M. H. Rehmani, "5G mobile services and scenarios: Challenges and solutions," Sustainability, vol. 10, no. 10, p. 3626, Oct. 2018.
[9] S. Han, "Congestion-aware WiFi offload algorithm for 5G heterogeneous wireless networks," Comput. Commun., vol. 164, pp. 69–76, Dec. 2020.
[10] I. A. Najm, A. K. Hamoud, J. Lloret, and I. Bosch, "Machine learning prediction approach to enhance congestion control in 5G IoT environment," Electronics, vol. 8, no. 6, p. 607, May 2019.
[11] R. Sathya and A. Abraham, "Comparison of supervised and unsupervised learning algorithms for pattern classification," Int. J. Adv. Res. Artif. Intell., vol. 2, no. 2, pp. 34–38, 2013.
[12] I. Khan, M. Zafar, M. Jan, J. Lloret, M. Basheri, and D. Singh, "Spectral and energy efficient low-overhead uplink and downlink channel estimation for 5G massive MIMO systems," Entropy, vol. 20, no. 2, p. 92, Jan. 2018.
[13] G. Sangeetha, M. Vijayalakshmi, S. Ganapathy, and A. Kannan, "A heuristic path search for congestion control in WSN," in Proc. Int. Conf. Ind. Interact. Innov. Sci., Eng. Technol. (I3SET). Singapore: Springer, 2016, pp. 485–495.
[14] K. Singh, K. Singh, L. H. Son, and A. Aziz, "Congestion control in wireless sensor networks by hybrid multi-objective optimization algorithm," Comput. Netw., vol. 138, pp. 90–107, Jun. 2018.
[15] S. F. Ahmed, M. S. B. Alam, S. Afrin, S. J. Rafa, S. B. Taher, M. Kabir, S. M. Muyeen, and A. H. Gandomi, "Toward a secure 5G-enabled Internet of Things: A survey on requirements, privacy, security, challenges, and opportunities," IEEE Access, vol. 12, pp. 13125–13145, 2024.
[16] S. Urooj, R. Arunachalam, M. A. Alawad, K. N. Tripathi, D. Sukumaran, and P. Ilango, "An effective model for network selection and resource allocation in 5G heterogeneous network using hybrid heuristic-assisted multi-objective function," Expert Syst. Appl., vol. 248, Aug. 2024, Art. no. 123307.
[17] R. MacDavid, X. Chen, and J. Rexford, "Scalable real-time bandwidth fairness in switches," IEEE/ACM Trans. Netw., vol. 32, no. 2, pp. 1423–1434, Apr. 2024.
[18] M.-R. Fida, A. H. Ahmed, T. Dreibholz, A. F. Ocampo, A. Elmokashfi, and F. I. Michelinakis, "Bottleneck identification in cloudified mobile networks based on distributed telemetry," IEEE Trans. Mobile Comput., vol. 23, no. 5, pp. 5660–5676, May 2024.


[19] M. Shelke, A. Malhotra, and P. N. Mahalle, "Congestion-aware opportunistic routing protocol in wireless sensor networks," in Proc. 1st Int. Conf. Smart Comput. Inform. (SCI), vol. 1. Singapore: Springer, 2016, pp. 63–72.
[20] P. D. Godoy, R. L. Cayssials, and C. G. García Garino, "Communication channel occupation and congestion in wireless sensor networks," Comput. Electr. Eng., vol. 72, pp. 846–858, Nov. 2018.
[21] I. A. Najm, M. Ismail, J. Lloret, K. Z. Ghafoor, B. B. Zaidan, and A. A.-R.-T. Rahem, "Improvement of SCTP congestion control in the LTE-A network," J. Netw. Comput. Appl., vol. 58, pp. 119–129, Dec. 2015.
[22] S. Brahma, M. Chatterjee, and K. Kwiat, "Congestion control and fairness in wireless sensor networks," in Proc. 8th IEEE Int. Conf. Pervasive Comput. Commun. Workshops (PERCOM Workshops), Mar. 2010, pp. 413–418.
[23] I. A. Najm, J. M. Dahr, A. K. Hamoud, A. S. Hashim, W. A. Awadh, M. B. M. Kamel, and A. M. Humadi, "OLAP mining with educational data mart to predict students’ performance," Informatica, vol. 46, no. 5, pp. 11–19, Mar. 2022.
[24] J. M. Dahr, A. K. Hamoud, I. A. Najm, and M. I. Ahmed, "Implementing sales decision support system using data mart based on OLAP, KPI, and data mining approaches," J. Eng. Sci. Technol., vol. 17, no. 1, pp. 275–293, 2022.
[25] M. Al-Asfoor and M. H. Abed, "Deep learning approach for COVID-19 diagnosis using X-ray images," in Proc. Int. Conf. Inf. Technol. Appl. (ICITA). Singapore: Springer, 2021, pp. 161–170.
[26] H. K. Naji, H. K. Fatlawi, A. J. M. Karkar, N. Goga, A. Kiss, and A. T. Al-Rawi, "Prediction of COVID-19 patients recovery using ensemble machine learning and vital signs data collected by novel wearable device," Int. J. Adv. Comput. Sci. Appl., vol. 13, no. 7, pp. 1–10, 2022.
[27] H. K. Fatlawi and A. Kiss, "An adaptive classification model for predicting epileptic seizures using cloud computing service architecture," Appl. Sci., vol. 12, no. 7, p. 3408, Mar. 2022.
[28] A. K. Hamoud, A. S. Alasady, W. A. Awadh, J. M. Dahr, M. B. M. Kamel, A. M. Humadi, and I. A. Najm, "A comparative study of supervised/unsupervised machine learning algorithms with feature selection approaches to predict student performance," Int. J. Data Mining, Model. Manage., vol. 15, no. 4, pp. 393–409, 2023.
[29] H. K. Fatlawi and A. Kiss, "Handling delayed labeling of EEG data stream using semi-supervised label propagation," in Proc. 15th Int. Conf. Electron., Comput. Artif. Intell. (ECAI), Jun. 2023, pp. 1–5.
[30] A. K. Hamoud, M. B. M. Kamel, A. S. Gaafar, A. S. Alasady, A. M. Humadi, W. A. Awadh, and J. M. Dahr, "A prediction model based machine learning algorithms with feature selection approaches over imbalanced dataset," Indonesian J. Electr. Eng. Comput. Sci., vol. 28, no. 2, p. 1105, Nov. 2022.
[31] S. Al-yousif, A. Jaenul, W. Al-Dayyeni, A. Alamoodi, I. Najm, N. M. Tahir, A. A. A. Alrawi, Z. Cömert, N. A. Al-shareefi, and A. H. Saleh, "A systematic review of automated pre-processing, feature extraction and classification of cardiotocography," PeerJ Comput. Sci., vol. 7, pp. 1–37, Apr. 2021.
[32] I. A. Najm, M. Ismail, T. Rahem, and A. Al-Razak, "Wireless implementation selection in higher institution learning environment," J. Theor. Appl. Inf. Technol., vol. 67, no. 2, pp. 477–484, 2014.
[33] H. K. Fatlawi and A. Kiss, "An elastic self-adjusting technique for rare-class synthetic oversampling based on cluster distortion minimization in data stream," Sensors, vol. 23, no. 4, p. 2061, Feb. 2023.
[34] P. Geurts, I. El Khayat, and G. Leduc, "A machine learning approach to improve congestion control over wireless computer networks," in Proc. 4th IEEE Int. Conf. Data Mining (ICDM), Nov. 2004, pp. 383–386.
[35] S. Jagannathan and K. C. Almeroth, "Using tree topology for multicast congestion control," in Proc. Int. Conf. Parallel Process., Sep. 2001, pp. 313–320.
[36] X. Zhang, J. Zuo, Z. Huang, Z. Zhou, X. Chen, and C. Joe-Wong, "Learning with side information: Elastic multi-resource control for the open RAN," IEEE J. Sel. Areas Commun., vol. 42, no. 2, pp. 295–309, Feb. 2024.
[37] V. Murgai, V. Kanakaraj, and I. Kommineni, "AI in the wireless 5G core (5GC)," in AI in Wireless for Beyond 5G Networks (5GC). Boca Raton, FL, USA: CRC Press, 2023, pp. 147–154.
[38] Y. Watanabe, Y. Kawamoto, and N. Kato, "A novel routing control method using federated learning in large-scale wireless mesh networks," IEEE Trans. Wireless Commun., vol. 22, no. 12, pp. 9291–9300, Dec. 2023.
[39] A. Sunny, S. Panchal, N. Vidhani, S. Krishnasamy, S. V. R. Anand, M. Hegde, J. Kuri, and A. Kumar, "A generic controller for managing TCP transfers in IEEE 802.11 infrastructure WLANs," J. Netw. Comput. Appl., vol. 93, pp. 13–26, Sep. 2017.
[40] R. Katuwal, P. N. Suganthan, and L. Zhang, "An ensemble of decision trees with random vector functional link networks for multi-class classification," Appl. Soft Comput., vol. 70, pp. 1146–1153, Sep. 2018.
[41] S. E. Gómez, B. C. Martínez, A. J. Sánchez-Esguevillas, and L. Hernández Callejo, "Ensemble network traffic classification: Algorithm comparison and novel ensemble scheme proposal," Comput. Netw., vol. 127, pp. 68–80, Nov. 2017.
[42] B. Leng, L. Huang, C. Qiao, and H. Xu, "A decision-tree-based on-line flow table compressing method in software defined networks," in Proc. IEEE/ACM 24th Int. Symp. Quality Service (IWQoS), Jun. 2016, pp. 1–2.
[43] D. Liang, Z. Zhang, and M. Peng, "Access point reselection and adaptive cluster splitting-based indoor localization in wireless local area networks," IEEE Internet Things J., vol. 2, no. 6, pp. 573–585, Dec. 2015.
[44] M. Hasan, E. Hossain, and D. Niyato, "Random access for machine-to-machine communication in LTE-advanced networks: Issues and approaches," IEEE Commun. Mag., vol. 51, no. 6, pp. 86–93, Jun. 2013.
[45] Y. Liu and H. Wu, "Prediction of road traffic congestion based on random forest," in Proc. 10th Int. Symp. Comput. Intell. Design (ISCID), vol. 2, Dec. 2017, pp. 361–364.
[46] H. Park, A. Haghani, S. Samuel, and M. A. Knodler, "Real-time prediction and avoidance of secondary crashes under unexpected traffic congestion," Accident Anal. Prevention, vol. 112, pp. 39–49, Mar. 2018.
[47] J. Shu, S. Liu, L. Liu, L. Zhan, and G. Hu, "Research on link quality estimation mechanism for wireless sensor networks based on support vector machine," Chin. J. Electron., vol. 26, no. 2, pp. 377–384, Mar. 2017.
[48] A. C. Riekstin, G. C. Januário, B. B. Rodrigues, V. T. Nascimento, T. C. M. B. Carvalho, and C. Meirosu, "Orchestration of energy efficiency capabilities in networks," J. Netw. Comput. Appl., vol. 59, pp. 74–87, Jan. 2016.
[49] S. Soltani and M. W. Mutka, "Decision tree modeling for video routing in cognitive radio mesh networks," in Proc. IEEE 14th Int. Symp. World Wireless, Mobile Multimedia Networks (WoWMoM), Jun. 2013, pp. 1–9.
[50] T. Stimpfling, N. Bélanger, O. Cherkaoui, A. Béliveau, L. Béliveau, and Y. Savaria, "Extensions to decision-tree based packet classification algorithms to address new classification paradigms," Comput. Netw., vol. 122, pp. 83–95, Jul. 2017.
[51] D. Singh, S. P. Nigam, V. P. Agrawal, and M. Kumar, "Vehicular traffic noise prediction using soft computing approach," J. Environ. Manage., vol. 183, pp. 59–66, Dec. 2016.
[52] Y. Xia, W. Chen, X. Liu, L. Zhang, X. Li, and Y. Xiang, "Adaptive multimedia data forwarding for privacy preservation in vehicular ad-hoc networks," IEEE Trans. Intell. Transp. Syst., vol. 18, no. 10, pp. 2629–2641, Oct. 2017.
[53] E. Adi, Z. Baig, and P. Hingston, "Stealthy denial of service (DoS) attack modelling and detection for HTTP/2 services," J. Netw. Comput. Appl., vol. 91, pp. 1–13, Aug. 2017.
[54] B. Tierney, E. Kissel, M. Swany, and E. Pouyoul, "Efficient data transfer protocols for big data," in Proc. IEEE 8th Int. Conf. E-Sci., Oct. 2012, pp. 1–9.
[55] M. F. Mohamed, A. E.-R. Shabayek, M. El-Gayyar, and H. Nassar, "An adaptive framework for real-time data reduction in AMI," J. King Saud Univ. Comput. Inf. Sci., vol. 31, no. 3, pp. 392–402, Jul. 2019.
[56] T. N. D. Pham and C. K. Yeo, "Adaptive trust and privacy management framework for vehicular networks," Veh. Commun., vol. 13, pp. 1–12, Jul. 2018.
[57] Z. Md. Fadlullah, F. Tang, B. Mao, N. Kato, O. Akashi, T. Inoue, and K. Mizutani, "State-of-the-art deep learning: Evolving machine intelligence toward tomorrow’s intelligent network traffic control systems," IEEE Commun. Surveys Tuts., vol. 19, no. 4, pp. 2432–2455, 4th Quart., 2017.
[58] Y. Kong, H. Zang, and X. Ma, "Improving TCP congestion control with machine intelligence," in Proc. Workshop Netw. Meets AI ML NetAI, 2018, pp. 60–66.
[59] N. Taherkhani and S. Pierre, "Centralized and localized data congestion control strategy for vehicular ad hoc networks using a machine learning clustering algorithm," IEEE Trans. Intell. Transp. Syst., vol. 17, no. 11, pp. 3275–3285, Nov. 2016.


[60] Y.-Y. Chen, Y. Lv, Z. Li, and F.-Y. Wang, "Long short-term memory model for traffic congestion prediction with online open data," in Proc. IEEE 19th Int. Conf. Intell. Transp. Syst. (ITSC), Nov. 2016, pp. 132–137.
[61] F. Tariq and S. Baig, "Multiclass machine learning based botnet detection in software defined networks," Int. J. Comput. Sci. Netw. Secur., vol. 19, no. 3, p. 150, 2019.
[62] T. Wu, S. Petrangeli, R. Huysegems, T. Bostoen, and F. De Turck, "Network-based video freeze detection and prediction in HTTP adaptive streaming," Comput. Commun., vol. 99, pp. 37–47, Feb. 2017.
[63] T. Abar, A. Ben Letaifa, and S. El Asmi, "Machine learning based QoE prediction in SDN networks," in Proc. 13th Int. Wireless Commun. Mobile Comput. Conf. (IWCMC), Jun. 2017, pp. 1395–1400.
[64] J. Han, J. Pei, and H. Tong, Data Mining: Concepts and Techniques. San Mateo, CA, USA: Morgan Kaufmann, 2022.
[65] C. Crisci, B. Ghattas, and G. Perera, "A review of supervised machine learning algorithms and their applications to ecological data," Ecol. Model., vol. 240, pp. 113–122, Aug. 2012.
[66] N. Ye, Data Mining: Theories, Algorithms, and Examples. Boca Raton, FL, USA: CRC Press, 2013.
[67] B. Çığşar and D. Ünal, "Comparison of data mining classification algorithms determining the default risk," Sci. Program., vol. 2019, pp. 1–8, Feb. 2019.
[68] A. Hamoud, "Selection of best decision tree algorithm for prediction and classification of students’ action," Amer. Int. J. Res. Sci., Technol., Eng. Math., vol. 16, no. 1, pp. 26–32, 2016.
[69] A. Hamoud, "Applying association rules and decision tree algorithms with tumor diagnosis data," Int. Res. J. Eng. Technol., vol. 3, no. 8, pp. 27–31, 2017.
[70] A. K. Hamoud, A. S. Hashim, and W. A. Awadh, "Predicting student performance in higher education institutions using decision tree analysis," Int. J. Interact. Multimedia Artif. Intell., vol. 5, no. 2, pp. 26–31, 2018.
[71] A. K. Hamoud and A. M. Humadi, "Student’s success prediction model based on artificial neural networks (ANN) and a combination of feature selection methods," J. Southwest Jiaotong Univ., vol. 54, no. 3, pp. 1–19, Jun. 2019.
[72] A. K. Jain, J. Mao, and K. M. Mohiuddin, "Artificial neural networks: A tutorial," Computer, vol. 29, no. 3, pp. 31–44, Mar. 1996.
[73] W. S. Noble, "What is a support vector machine?" Nature Biotechnol., vol. 24, no. 12, pp. 1565–1567, Dec. 2006.
[74] G. Fung, "A comprehensive overview of basic clustering algorithms," Dept. Comput. Sci., Univ. Wisconsin, Madison, WI, USA, Tech. Rep., 2001. [Online]. Available: https://pages.cs.wisc.edu/~gfung/
[75] S. C. Johnson, "Hierarchical clustering schemes," Psychometrika, vol. 32, no. 3, pp. 241–254, Sep. 1967.
[76] Z. Zhang, J. Zhang, and H. Xue, "Improved K-means clustering algorithm," in Proc. Congr. Image Signal Process., May 2008, pp. 169–172.
[77] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, "A density-based algorithm for discovering clusters in large spatial databases with noise," in Proc. KDD, 1996, vol. 96, no. 34, pp. 226–231.
[78] P. Rai and S. Singh, "A survey of clustering techniques," Int. J. Comput. Appl., vol. 7, no. 12, pp. 1–5, Oct. 2010.
[79] V. R. Gannapathy, R. Nordin, N. F. Abdullah, and A. Abu-Samah, "A smart handover strategy for 5G mmWave dual connectivity networks," IEEE Access, vol. 11, pp. 134739–134759, 2023.
[80] V. K. Quy, A. Chehri, N. M. Quy, N. D. Han, and N. T. Ban, "Innovative trends in the 6G era: A comprehensive survey of architecture, applications, technologies, and challenges," IEEE Access, vol. 11, pp. 39824–39844, 2023.
[81] M. Mezzavilla, M. Zhang, M. Polese, R. Ford, S. Dutta, S. Rangan, and M. Zorzi, "End-to-end simulation of 5G mmWave networks," IEEE Commun. Surveys Tuts., vol. 20, no. 3, pp. 2237–2263, 3rd Quart., 2018.
[82] M. Rebato, M. Polese, and M. Zorzi, "Multi-sector and multi-panel performance in 5G mmWave cellular networks," in Proc. IEEE Global Commun. Conf. (GLOBECOM), Dec. 2018, pp. 1–6.
[83] A. A. Oliveira, D. Batista, and R. Hirata, "Exploring the ns-3 mmWave module," Dept. Comput. Sci., Univ. São Paulo, São Paulo, Brazil, Tech. Rep., 2019, p. 23. [Online]. Available: http://vision.ime.usp.br/~arturao/
[84] J. M. Bland and D. G. Altman, "Statistics notes: Cronbach’s alpha," BMJ, vol. 314, no. 7080, p. 572, Feb. 1997.
[85] M. Tavakol and R. Dennick, "Making sense of Cronbach’s alpha," Int. J. Med. Educ., vol. 2, pp. 53–55, Jun. 2011.
[86] A. Khalaf, A. Majeed, W. Akeel, and A. Salah, "Students’ success prediction based on Bayes algorithms," Int. J. Comput. Appl., vol. 178, no. 7, pp. 6–12, Nov. 2017.
[87] J. N. Mandrekar, "Receiver operating characteristic curve in diagnostic test assessment," J. Thoracic Oncol., vol. 5, no. 9, pp. 1315–1316, Sep. 2010.
[88] D. W. Hosmer Jr., S. Lemeshow, and R. X. Sturdivant, Applied Logistic Regression, vol. 398. Hoboken, NJ, USA: Wiley, 2013.
[89] H. Uğuz, "A two-stage feature selection method for text categorization by using information gain, principal component analysis and genetic algorithm," Knowl.-Based Syst., vol. 24, no. 7, pp. 1024–1032, Oct. 2011.
[90] C. Lee and G. G. Lee, "Information gain and divergence-based feature selection for machine learning-based text categorization," Inf. Process. Manage., vol. 42, no. 1, pp. 155–165, Jan. 2006.
[91] B. Suri, Mani, and M. Kumar, "Performance evaluation of data mining techniques," in Proc. Inf. Commun. Technol. for Sustain. Develop. (ICT4SD), vol. 1. Singapore: Springer, 2016, pp. 375–383.
[92] G. H. John and P. Langley, "Estimating continuous distributions in Bayesian classifiers," 2013, arXiv:1302.4964.
[93] Q. Wang, G. M. Garrity, J. M. Tiedje, and J. R. Cole, "Naïve Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy," Appl. Environ. Microbiol., vol. 73, no. 16, pp. 5261–5267, Aug. 2007.
[94] O. J. Oyelade, O. O. Oladipupo, and I. C. Obagbuwa, "Application of k means clustering algorithm for prediction of students academic performance," 2010, arXiv:1002.2425.

MOHAMMED B. M. KAMEL received the Ph.D. degree in computer science from Eötvös Loránd University and Furtwangen University. He is currently a Senior Researcher and a Certified Cybersecurity Consultant. He has taken part in several completed projects as a cybersecurity researcher. His research interests include designing distributed secure protocols. He was a Gold and Bronze Award Winner from EIT, a body of the European Union.

IHAB AHMED NAJM received the Diploma degree from the Technical University of Berlin, Berlin, Germany, in 2012, the B.S. and M.S. degrees in information technology from the Utara University of Kedah Malaysia, in 2012, and the Ph.D. degree from the Department of Electrical, Electronic and Systems Engineering, National University of Malaysia, in 2017. From 2004 to 2007, he was a Laboratory Instructor with the Department of Computer Science, University of Tikrit. He is the author of more than 20 articles. His research interests include broadcasting, postal communication, tele-education, teleworking, multiservice convergence networks, cellular networks and the IoT, data science, machine learning and deep learning, internet protocol, short-range communication, VoIP, teleport and tele-health, digital communications networks, smart mobile wireless communication, and future internet protocols.

ALAA KHALAF HAMOUD received the B.Sc. and M.Sc. degrees (Hons.) from the Department of Computer Science, University of Basrah, Iraq, in 2008 and 2014, respectively. He is currently an Assistant Professor with the Department of Cybersecurity, University of Basrah. He participated in a seven-month IT administration course at the Technical University of Berlin, Germany. His scientific research interests include data mining and data warehousing.
