ML-Based Radio Resource Management in 5G and Beyond Networks A Survey

This document provides a survey of machine learning techniques for radio resource management in 5G and beyond networks. It discusses how machine learning and mobile edge computing can help address the increasing demands on network management by providing improved quality of service. The document categorizes machine learning approaches for various radio resource management sub-problems and applies different machine learning methods to the task of throughput prediction as an example.


Received 24 June 2022, accepted 31 July 2022, date of publication 5 August 2022, date of current version 12 August 2022.

Digital Object Identifier 10.1109/ACCESS.2022.3196657

ML-Based Radio Resource Management in 5G and Beyond Networks: A Survey
IOANNIS A. BARTSIOKAS 1, PANAGIOTIS K. GKONIS 2, DIMITRA I. KAKLAMANI 1, AND IAKOVOS S. VENIERIS 3
1 Microwave and Fiber Optics Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Zografou, 15780 Athens, Greece
2 Department of Digital Industry Technologies, National and Kapodistrian University of Athens, Sterea Ellada, 34400 Dirfies Messapies, Greece
3 Intelligent Communications and Broadband Networks Laboratory, School of Electrical and Computer Engineering, National Technical University of Athens, Zografou, 15780 Athens, Greece


Corresponding author: Ioannis A. Bartsiokas ([email protected])

ABSTRACT In this survey, a comprehensive study is provided regarding the use of machine learning (ML) algorithms for effective resource management in fifth-generation and beyond (5G/B5G) wireless cellular networks. The ever-increasing user requirements, their diverse nature in terms of performance metrics, and the use of various novel technologies, such as millimeter wave transmission, massive multiple-input-multiple-output configurations and non-orthogonal multiple access, make the radio resource management (RRM) problem inherently multi-constrained. In this context, ML and mobile edge computing (MEC) constitute a promising framework to provide improved quality of service (QoS) for end users, since they can relax the RRM-associated computational burden. In our work, a state-of-the-art analysis of ML-based RRM algorithms, categorized in terms of learning type and potential applications as well as MEC implementations, is presented, to define the best-performing solutions for various RRM sub-problems. To demonstrate the capabilities and efficiency of ML-based algorithms in RRM, we apply and compare different ML approaches for throughput prediction, as an indicative RRM task. We investigate the problem either as a classification or as a regression one, using the corresponding metrics in each case. Finally, open issues, challenges and limitations concerning AI/ML approaches in RRM for 5G and B5G networks are discussed in detail.

INDEX TERMS 5G, B5G, deep learning, machine learning, mobile edge computing, radio resource
management.

ACRONYMS
3GPP Third Generation Partnership Project.
4G 4th Generation.
5G 5th Generation.
6G 6th Generation.
ABC Artificial Bee Colony.
AI Artificial Intelligence.
AM Amplitude Modulation.
ANN Artificial Neural Networks.
B5G Beyond 5th Generation.
BBU Baseband Processing Unit.
BER Bit Error Rate.
BP Blocking Probability.
BS Base Station.
CDMA Code Division Multiple Access.
CIR Channel Impulse Response.
CN Core Network.
CNN Convolutional Neural Network.
CRAN Cloud RAN.
CSI Channel State Information.
D2D Device-to-Device.
DL Deep Learning.
DL/UL Down/Up Link.
DNN Deep Neural Networks.
EE Energy Efficiency.
EMF Electromagnetic Field.

The associate editor coordinating the review of this manuscript and approving it for publication was Li Minn Ang.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 10, 2022 83507
I. A. Bartsiokas et al.: ML-Based Radio Resource Management in 5G and Beyond Networks: A Survey

eMBB Enhanced Mobile Broadband.
ETSI European Telecommunications Standards Institute.
FC Femto-Cell.
FL Federated Learning.
FM Frequency Modulation.
FR Frequency Range.
gNodeB Next Generation Node B.
H2H Human-to-Human.
HetNets Heterogeneous Networks.
IoT Internet of Things.
IP Internet Protocol.
ITU International Telecommunication Union.
k-NN k-Nearest Neighbors.
LTE Long Term Evolution.
M2M Machine-to-Machine.
MA Margin Adaptive.
MARL Multi-Agent Reinforcement Learning.
MCTS Monte Carlo Tree Search.
MEC Mobile Edge Computing.
MIMO Multiple-Input-Multiple-Output.
MINLP Mixed Integer Non-linear Programming.
ML Machine Learning.
MOS Mean Opinion Score.
mMTC Massive Machine-Type Communications.
mmWave Millimeter Wave.
m-MIMO Massive MIMO.
MNOs Mobile Network Operators.
MU-MIMO Multi-User MIMO.
MTC Mobile Type Communications.
NLP Natural Language Processing.
NOMA Non-Orthogonal Multiple Access.
non-IID Non-Independent and Identical Distribution.
NP Non-Deterministic Polynomial-Time.
NR New Radio.
OFDM Orthogonal Frequency Division Multiplexing.
OFDMA Orthogonal Frequency Division Multiple Access.
O-RAN Open Radio Access Network.
OTA Over-the-Air.
P2P Point-to-Point.
PF Proportional Fairness.
PRB Physical Resource Block.
QoE Quality of Experience.
QoS Quality of Service.
QPSK Quadrature Phase Shift Keying.
RA Rate Adaptive.
RAN Radio Access Network.
RAT Radio Access Technology.
RB Resource Block.
RF Radio Frequency.
RL Reinforcement Learning.
RMSE Root Mean Square Error.
RN Relay Nodes.
RRM Radio Resource Management.
RSRQ Reference Signal Received Quality.
RSRP Reference Signal Received Power.
RSSI Received Signal Strength Indicator.
SE Spectral Efficiency.
SNR Signal-to-Noise-Ratio.
SON Self-Organizing Network.
SU-MIMO Single-User MIMO.
SVM Support Vector Machines.
TDMA Time Division Multiple Access.
UAVs Unmanned Aerial Vehicles.
UE User Equipment.
UHF Ultra High Frequency Band.
URLLC Ultra-Reliable-Low-Latency Communication.
V2M Vehicle-to-Machine.
V2V Vehicle-to-Vehicle.
WMMSE Weighted Minimum Mean Squared Error.
WWWW World-Wide-Wireless-Web.

I. INTRODUCTION
A. THE EMERGING ROLE OF MACHINE LEARNING IN 5G
The development of fifth-generation (5G) broadband wireless networks has accelerated significantly and is now, worldwide, at the stage of installation and implementation of the backbone network [1], [2]. Moreover, mobile network operators (MNOs) are gradually bringing to market terminal devices (mobile phones, boards, etc.) that support 5G networks. According to the studies in [3], [4], the monthly data demand will reach 100 exabytes with about 31.6 billion active devices by 2023, thus doubling the current requirements. Similar to [3], an updated whitepaper from CISCO is expected within 2022, also predicting increased data demand until 2024. In this context, the necessity for optimal solutions, in terms of network management and allocation of available radio resources, is apparent.

It is already visible that 5G acts as an integrator for diverse applications and services. To this end, 5G networks utilize vehicular communications [5], device-to-device (D2D) communications [6], machine-to-machine (M2M) communications [7], mobile edge computing (MEC) [8], cloud computing [9] and internet of things (IoT) [10], in order to meet the needs for enhanced mobile broadband (eMBB), massive machine-type communications (mMTC) and ultra-reliable-low-latency-communications (URLLC) [11].

More specifically, the authors in [12], [13] summarize the key components and innovations incorporated in 5G networks, as: a) Modern approaches in radio-link management, such as open radio access network (O-RAN) and virtual networks, in order to meet the strict criteria of latency, capacity and data traffic in 5G transmission, b) Extended coverage, which includes the installation of multi-nodes and multi-antennas in the network’s coverage area, in order to use multi-hop techniques for fast handovers through service cells and base stations (BSs), c) Service-based network dimensioning, which utilizes the self-generated channel state
information (CSI), in order to meet the enhanced URLLC criteria. Cell and BS planning should follow even stricter requirements to support new usage scenarios and applications (smart cities, IoT, emergency alerts). Thus, heuristic approaches, based on data analysis and machine learning (ML), are proposed in network dimensioning [14] and d) Use of new frequency bands, which includes the extended operating spectrum band and the new spectrum regimes [15].

In addition, 5G and B5G networks extend the deployment of technologies that were introduced in fourth-generation (4G) networks and also encapsulate new ones (see also Fig. 1). These include massive multiple-input-multiple-output (m-MIMO) configurations [16], millimeter wave (mmWave) transmission [17], network slicing [18], relay nodes (RNs) [19] and non-orthogonal multiple access (NOMA) [20]. However, the coexistence of these technologies can significantly increase network complexity, due to the insertion of multiple computational levels and hardware needs, thus necessitating optimal radio resource management (RRM) strategies [21]. For example, accurate CSI is required for the effective deployment of m-MIMO architectures and NOMA schemes. This, in turn, increases the overall signaling burden, due to the increased number of pilot signals. Moreover, in typical MIMO configurations, each antenna is connected to a separate radio frequency (RF) chain, thus supporting a fully digital (FD) beamforming approach. However, in an m-MIMO case, this would be prohibitive, as it would significantly increase hardware complexity. Hence, suboptimal techniques are proposed in the literature, based on a hybrid analog-digital beamforming approach [22].

It is, therefore, understood that a tradeoff between optimal network goals and computational complexity can only be achieved through efficient RRM. Until now, the allocation decisions were made continuously in each timeslot, based on local network conditions and the data traffic load to be serviced. However, the aforementioned enhanced requirements of 5G networks raise the need for, if not require, a decentralized and intelligent data management system that can support flexible RRM decisions. In this direction, the utilization of data offered by ML and the features extracted by the corresponding algorithms can effectively contribute to fast RRM responses [23], [24].

The current research interest in incorporating ML techniques in 5G networks is mainly focused on the core network (CN) [25]–[27] (indicatively: traffic forecasting [28], [29], network slicing [30], privacy and security [31], etc.). Lately, ML models have been introduced in the RAN, and the development of artificial intelligence (AI) and ML-based RRM algorithms has attracted scientific interest, as well (e.g. [32]–[35]; an exhaustive analysis of ML-based schemes in 5G and B5G RAN, focusing on RRM, is presented in section III).

The scope of this survey paper is to summarize recent works in the field of ML-based RRM, categorize them based on the implemented ML technique, and thus provide guidelines to researchers for selecting the suitable category of ML algorithm in each RRM sub-problem and highlight open issues, limitations and potential solutions. This is achieved through a state-of-the-art analysis of the existing literature, focusing on the performance of each proposed ML-algorithm with respect to various networks’ key performance indicators (KPIs). Moreover, in an effort to determine the capabilities that ML methods bring to RRM-related tasks, the problem of throughput prediction is investigated, as an indicative case of ML utilization in 5G/B5G networks, by comparing various ML algorithms.

B. RELATED SURVEYS - PAPER OVERVIEW
The emerging need for efficient RRM through ML, presented in the previous sub-section, has motivated many researchers over the last years. The surveys studied in this subsection have focused on ML deployment for defining effective resource allocation strategies in 5G/B5G networks. Table 1 summarizes these surveys, presenting the key problems and the corresponding contributions.

In [37], the authors considered ML, data analytics and natural language processing (NLP) in network planning and management of 5G networks, with emphasis on RRM and security issues. Moreover, a prediction of the channel impulse response (CIR) problem was presented as an indicative use case. The authors in [38] presented a state-of-the-art approach in energy-aware 5G systems. In this framework, ML-based solutions were investigated in practical Third Generation Partnership Project (3GPP) new radio (NR) features, in order to maximize energy efficiency (EE). According to the presented analysis, reinforcement learning (RL) techniques are more suitable in environments with multiple constraints. The significance of deploying Green AI techniques, in order to reduce power consumption in wireless networks, is highlighted, as well. In [39], the authors focused on the significance of RRM through ML in the development of sixth-generation (6G) networks. In this context, the extensive usage of mobile devices and the dynamic changes in CSI and data traffic formulate a multi-dimensional quality of service (QoS) problem. Therefore, the authors suggested that power allocation and channel modeling should become data-driven through ML models. In addition, they proposed that the existing ML schemes should consider data reduction methods in the training phase, as networks’ datasets are characterized by large amounts of data and features. Finally, the significance of a trade-off between supported services (e.g., augmented and virtual reality (AR/VR) for 6G networks) and the strict requirements for latency, power, privacy, security and QoS is highlighted, as well.

The authors in [40] present an overview of the existing RRM techniques in 5G/B5G networks. In this context, the utilization of game theory, heuristic mechanisms and ML are presented, along with all the related constraints (e.g., latency, QoS, EE). The main conclusion is that deep learning (DL) and RL approaches can accelerate the performance of 5G and B5G networks, due to their ability to quickly learn and cooperate with all the elements of the network’s environment.

FIGURE 1. 5G/B5G networks enablers.

In the same context, the authors in [41] focus specifically on DL techniques. Thus, deep neural networks (DNNs) and convolutional NNs (CNNs) are investigated in the scope of resource allocation, security and channel estimation [27]. According to the conclusions drawn, there is an upcoming trend towards using DL in wireless networks, since B5G and 6G networks will integrate higher levels of intelligence through ML, in order to support reconfigurable technologies, such as terahertz communications and unmanned aerial vehicles (UAVs). Similarly, the authors in [42] investigated ML utilization in different computing scenarios, such as 5G, IoT, edge, fog, cloud and vehicular fog computing. Even though supervised, semi-supervised and unsupervised learning approaches are presented, the authors focus on DL ones, providing a related taxonomy. They concluded that Deep RL approaches have the most efficient results in the resource allocation problem. However, data quality and hyperparameter tuning considerations are raised in order for better ML models to be implemented. Moreover, DL integration in AI-enabled O-RAN architectures is investigated in [43]. The authors compared a DL solution based on edge support, virtualization control and management, as well as energy consumption. Furthermore, DL use cases and implementations are presented, leading to high-performance learning models. Finally, open issues on privacy and security, network slicing and energy consumption are analyzed, as well.

The authors in [44] presented an examination of distributed AI/ML approaches in next generation communication networks. More specifically, overhead reduction, resource distribution enhancement, privacy and security issues in a distributed ML environment are analyzed along with ML frameworks, based on up-to-date literature. The authors highlighted the need to improve computing hardware, cloud and edge servers in order to secure the efficient performance of algorithms, with respect to strict latency requirements, which occur in distributed computing wireless networks.

The authors in [31] introduce AI/ML as a set of techniques that can upgrade the performance of wireless networks, integrate new usage scenarios and enable emerging technologies. In this framework, an overview concerning ML-based solutions in physical layer aspects, channel modeling and measurements, network management and application layer is provided. The authors conclude that the integration of AI/ML is still at an early stage, and standardization progress should be further accelerated.

The above-presented surveys describe some aspects of the current usage of AI/ML techniques in the resource management procedures of modern era wireless networks. Table 1 summarizes these surveys and their contributions. The first column states the area(s) of interest for each survey, the second column gives the specific RRM-related problems that are analyzed in each survey, and the last column presents contributions and suggestions that each survey provides. Our work is included, as well. However, the above-presented approaches, as is also visible from Table 1, either focus on ML integration in multiple Open Systems Interconnection model layers [31], [37], [39] or on a specific category of ML algorithms for RRM [40]–[42], [44] or on a specific RRM related sub-problem [38], [43]. Our motivation is to extend
these works and focus on all ML categories, analyzing their impact and usability in a plethora of cellular networks’ RRM sub-problems.

TABLE 1. Presentation of surveys in AI/ML for RRM.

The key contribution of this paper is two-fold:
1) To present a state-of-the-art summary concerning ML-based RRM approaches. In this context, our interest is mainly focused on the categorization of the ML-based RRM schemes proposed in the literature, in terms of the type of learning, and, thus, on defining the optimal ML solution in various RRM sub-problems (KPIs prediction, user, subcarrier and power allocation, etc.), with respect to different network metrics (e.g., QoS, quality of experience (QoE), throughput). In order to achieve this, first, the general RRM problem is formulated, while significant non-ML approaches and their limitations are highlighted, as well. Then, the state-of-the-art concerning ML-based approaches in 5G/B5G RRM is presented. As already mentioned, these approaches are categorized by the type of ML models used by each one of them (Supervised, Unsupervised, Reinforcement). Furthermore, the coexistence of MEC and distributed learning techniques is analyzed, as it can tackle various challenges, especially concerning the training time of ML models.
2) Through the above procedure, representative conclusions are drawn as to which ML models are appropriate in each RRM related sub-problem, based on the network orientation. Moreover, limitations in current research efforts, open issues and discussion over the state-of-the-art approaches are highlighted in an effort to both present potential solutions in these considerations and motivate future work on these fields. Thus, guidelines and research frameworks are proposed regarding AI/ML utilization for efficient resource allocation in 5G/B5G networks.

Finally, in order to highlight the significance of AI/ML implementation in RRM, the problem of throughput prediction is investigated, as an indicative RRM task, treated either as a classification or a regression problem. Various ML algorithms are considered, results are presented, and performance is evaluated, based on selected ML KPIs for
each task. Finally, through the above-described analysis, limitations and open issues are identified and potential solutions are described.

The rest of this manuscript is organized as follows (see also Fig. 2): In Section II, the joint user, subcarrier and power allocation RRM problem is formulated, with respect to the corresponding constraints. Moreover, significant non-ML RRM works are presented and their limitations are highlighted. In Section III, the different types of ML are analyzed. In the same section, a state-of-the-art presentation concerning AI/ML algorithms in 5G/B5G systems’ RRM is performed. The ML-based solutions are categorized by the type of learning. Furthermore, the joint employment of ML and distributed technologies (such as MEC) is presented and proposed as an efficient way to tackle the existing limitations. In Section IV, the employed ML algorithms for throughput prediction are presented, as well as the performance comparison among them. In Section V, open issues in the field of AI/ML in RRM are stated and suggestions for future works are drawn. Finally, concluding remarks are provided in Section VI.

FIGURE 2. Paper structure.

II. RRM IN 5G NETWORKS
A. PROBLEM FORMULATION AND CONSTRAINTS
Even though the criticality of the RRM problem originates from the first steps of wireless and mobile communications, the significance of effectively managing the available radio resources grew during the 4G era, when the increase of data rates was accompanied by high interference levels (especially co-channel). In the 4G, 5G and 6G era, RRM considers not only the allocation of physical resource blocks (PRBs) or subcarriers (typical subcarrier spacing is 600 kHz in frequency range 1 (FR1) of 5G and 2400 kHz in FR2) [15], [45], but also power management, scheduling, traffic control and handover management.

In general, RRM considers two main objectives, which in some cases can be treated jointly. The first one is power minimization, which is referred to as margin-adaptive (MA), while the second is network efficiency maximization. In this framework, throughput (or rate) maximization (rate adaptive, RA) is mainly considered. MA minimization considers overall and per user minimization of power consumption. Respectively, RA maximization takes into account overall and per user minimum throughput maximization [46]. Both approaches include a plethora of parameters, in some cases mutually exclusive, that can significantly increase the complexity of RRM. In fact, in [45], the non-deterministic polynomial-time (NP)-hardness of the resource allocation problem is proved. Consequently, sub-optimal solutions are proposed.

In the 4G-LTE era, when OFDMA techniques were introduced, RRM algorithms mostly considered the maximization of users’ throughput, based on QoS requirements, as the key implementation criterion. The main categorization was the stage at which RRM was performed, considering sectors or BSs, with centralized or decentralized approaches. An innovative solution was introduced by game theory, where the RRM problem was treated as a game and each user as a player. Techniques such as Nash bargain (NBS), Hungarian NBS and Raiffa bargain (RBS) were the most common ones [47].

In a typical 5G MIMO cellular orientation, the total bandwidth, denoted as $W$, is divided into a predefined number of $L$ subcarriers, which are allocated to users according to their demands and overall constraints [48]. The system serves as many users as possible, until all subcarriers are allocated ($N$ users). BSs are equipped with $M_t$ transmitting antennas, while users are equipped with $M_r$ receiving ones. The signal-to-noise-plus-interference-ratio (SNIR) for the $n$th user ($1 \le n \le N$), associated with the $l$th subcarrier ($1 \le l \le L$) for a specific channel realization and assuming independently transmitted streams among different users, is defined as follows [49]:

$$\mathrm{SNIR}_{n,l} = \frac{G_{n,n,l}}{\mathbf{r}_{n,l}^{H}\mathbf{r}_{n,l} I_0 + \sum_{m \neq n,\, l \in S_m} G_{n,m,l}} \tag{1}$$

where $G_{n,m,l} = p_{n,l}\, \mathbf{t}_{m,l}^{H} \mathbf{H}_{n,\mathrm{sec}(m),l}^{H} \mathbf{r}_{n,l} \mathbf{r}_{n,l}^{H} \mathbf{H}_{n,\mathrm{sec}(m),l} \mathbf{t}_{m,l}$, $\mathbf{H}_{n,\mathrm{sec}(n),l}$ represents the $M_r \times M_t$ channel matrix for the $l$th subcarrier of the $n$th user relevant to its serving sector, $\mathbf{t}_{n,l}$ is the $M_t \times 1$ transmission vector, assuming diversity combining transmission mode, $\mathbf{r}_{n,l}$ is the Maximal Ratio Combining multiplying vector and $p_{n,l}$ denotes the transmission power allocated to the $l$th subcarrier of the $n$th user. Moreover, the set $S_n$ indicates the subcarriers allocated to the $n$th user and $I_0$ is the thermal noise level. Finally, $\mathbf{A}^{H}$ denotes the conjugate transpose of matrix $\mathbf{A}$. Thus, the achievable data rate on the $l$th subcarrier is $r_{n,l} \leftarrow W \cdot \log_2(\mathrm{SNIR}_{n,l})$ [50], and the corresponding aggregate rate for the $n$th user is $R_n \leftarrow \sum_{s \in S_n} r_{n,s}$. Then, the total throughput is given by:

$$R = \sum_{n=1}^{N} \sum_{s \in S_n} r_{n,s} \tag{2}$$
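As an illustration only, equations (1) and (2) can be sketched in a few lines of NumPy. The sketch assumes that the gains G_{n,m,l} and the noise term (the first term in the denominator of (1)) have already been computed and stored as arrays; the function names `snir` and `rates`, the array layout and the toy numbers are hypothetical and do not come from the paper. The per-subcarrier rate follows the expression exactly as written in the text, so it is only meaningful where the SNIR is at least 1.

```python
import numpy as np

def snir(G, alloc, noise):
    """Eq. (1): SNIR[n, l] = G[n, n, l] / (noise[n, l] + sum over m != n with l in S_m of G[n, m, l]).

    G     : (N, N, L) array with G[n, m, l] as defined in the text (assumed precomputed)
    alloc : (N, L) boolean array, alloc[m, l] is True when subcarrier l belongs to S_m
    noise : (N, L) array holding the noise term of Eq. (1) for each user and subcarrier
    """
    N = G.shape[0]
    # Zero out the contributions of users m that do not use subcarrier l.
    masked = G * alloc[np.newaxis, :, :]
    total = masked.sum(axis=1)                  # sum over every m with l in S_m
    own = G[np.arange(N), np.arange(N), :]      # G[n, n, l]
    interference = total - own * alloc          # remove the m == n term
    return own / (noise + interference)

def rates(snir_nl, alloc, W):
    """Per-subcarrier rate as stated in the text, per-user aggregate R_n and total R of Eq. (2)."""
    # Unallocated entries are set to SNIR = 1 so that their log term is exactly zero.
    r_nl = np.where(alloc, W * np.log2(np.where(alloc, snir_nl, 1.0)), 0.0)
    R_n = r_nl.sum(axis=1)
    return r_nl, R_n, R_n.sum()

# Toy example (hypothetical numbers): 2 users, 2 subcarriers, subcarrier 0 shared by both users.
G = np.zeros((2, 2, 2))
G[0, 0, 0], G[0, 1, 0] = 8.0, 1.0
G[1, 1, 0], G[1, 0, 0] = 16.0, 1.0
G[1, 1, 1], G[1, 0, 1] = 4.0, 3.0
alloc = np.array([[True, False], [True, True]])
noise = np.ones((2, 2))

snir_nl = snir(G, alloc, noise)
r_nl, R_n, R = rates(snir_nl, alloc, W=1.0)
print(R_n, R)  # [2. 5.] 7.0
```

In the toy example, user 1's gain towards user 0 on subcarrier 1 is ignored because subcarrier 1 is not in user 0's set, which mirrors the condition l ∈ S_m under the sum in (1).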

In most of the state-of-the-art RRM works, the target is to maximize EE, SE and Jain’s fairness index (J) and, at the same time, minimize blocking probability. EE and SE are given by:

$$EE = \frac{R}{\sum_{n=1}^{N} \sum_{s \in S_n} p_{n,s}} \tag{3}$$

$$SE = \frac{R}{W} \tag{4}$$

Moreover, J index is defined as:

$$J = \frac{\left(\sum_{n=1}^{N} \sum_{s \in S_n} r_{n,s}\right)^{2}}{N \cdot \sum_{n=1}^{N} \sum_{s \in S_n} r_{n,s}^{2}} \tag{5}$$

Finally, blocking probability (BP) is defined as the ratio of rejected users to the total number of users that tried to access the network.

The aforementioned optimization problem is subject to the following system constraints:
• $\sum_{s \in S_n} p_{n,s} \le p_{max}$, where $p_{max}$ denotes the maximum power limit per user.
• $p_{n,l} \ge 0$, $1 \le n \le N$, $1 \le l \le L$, which expresses the non-negativity constraint of the transmit power on each subchannel.
• $\mathrm{SNIR}_{n,l} \ge \mathrm{SNIR}_{thr}$, which sets the minimum SNIR threshold for acceptable QoS.
• $N_{l,t} \le N_{thr}$, $1 \le l \le L$, $1 \le t \le T$, where $N_{l,t}$ is the number of users grouped in the $l$th subcarrier over time slot $t$, and $N_{thr}$ is its upper threshold, in the case of NOMA transmission [51].

B. REPRESENTATIVE RECENT NON-ML APPROACHES
In this section, we summarize significant up-to-date approaches, which tackle the RRM multi-objective problem and do not make use of ML techniques (defined as "non-ML" throughout the rest of the manuscript). The relevant literature in this sub-section is representative with respect to various network metrics, such as throughput, QoS and interference mitigation.

In [52], a resource allocation scheme is proposed, where target SNIR values are accompanied by the minimization of power consumption. In the same context, in [53], the available spectrum is shared between macro and micro cells to maximize the number of users and achieve the SNIR requirements of each micro or macro cell user. In [54], a different approach is considered, where the distance-based resource allocation scheme is replaced by a model based on priority classes of the mobile devices in mobile type communications (MTC) networks. This approach, apart from SNIR, considers latency, total induced delay and pending number of MTC devices, as well, for priority classes construction.

FIGURE 3. Relationship between QoS and QoE [54].

A key aspect in resource management policies in 5G networks is the harmonization with both QoS and QoE requirements. While QoS defines the user’s satisfaction in a strict technical way, QoE reflects the overall user’s happiness or frustration. The relationship between QoS and QoE is presented in Fig. 3. According to [55], there are two main (and one upcoming) ways to achieve the optimal joint satisfaction of QoS and QoE. The first one refers to the network’s architecture and is the use of self-organized networks (SONs). The other one refers to the efficient tradeoff between packet loss, latency, traffic data (objective parameters) and mean opinion score (MOS), that should always exist. Last but not least, the integration of ML techniques in RRM, specifically NNs, which use data-driven (CSI-driven) techniques in order to solve the optimization problem, can contribute in the direction of joint QoS and QoE requirements’ satisfaction. These techniques will be analyzed in depth in the upcoming sections III and IV.

In the existing literature, the significance of both QoS and QoE requirements’ satisfaction is highlighted. For example, the authors in [56] consider the resource allocation problem in M2M 5G 3GPP cellular systems. An optimal radio resource allocation method in LTE and beyond cellular networks is developed, based on adaptive selection of channel bandwidth, depending on the QoS requirements and priority traffic aggregation. Furthermore, a novel simulator is proposed, focusing on the joint impact of M2M and human-to-human (H2H) traffic in 5G networks. In order to ensure the satisfaction of QoS requirements, the proposed simulator automates RRM algorithms for both the M2M and H2H traffic. The simulations and results indicate that the proposed framework improves the radio resource management policies’ application by 13%, concerning the LTE frame formation process.

Wang et al. [57] use a QoE utility function for spectrum and power allocation in macro and pico-cell HetNets. For the subcarrier allocation method, they construct a weighted bipartite graph and revise the Kuhn-Munkres algorithm to obtain perfect matching. For power allocation, they use the first-order derivative of the network utility function, achieving nearly-optimal levels of power minimization. However, increasing the cell size results in QoE deterioration. In the same framework of using a QoE utility function, the authors in [58] consider the joint subcarrier, assignment and power
allocation problem. The proposed approach is based on the decomposition of the general problem into two sub-problems: the BS selection and subcarrier allocation sub-problem and the power allocation sub-problem. A genetic algorithm for the first problem and an artificial bee colony (ABC) algorithm for the second one are proposed. The simulation results indicate that the proposed power allocation scheme reaches optimal solution levels quickly, while MOS increases for an increasing number of active UEs or available subcarriers.

In 5G HetNets, interference can have a critical impact on the selection of the appropriate RRM strategy. There are three types of interference. The first one is cross-tier interference, which occurs between users in different tiers, such as between macrocells and femtocells (FCs). On the other hand, co-tier interference is experienced by users within the same network tier [59]. Finally, inter-cell interference occurs mainly at the cell edges, where a user can receive signals from multiple BSs/RNs. The authors in [60] consider a 3-tier HetNet and propose a joint interference and resource allocation strategy. The examined use cases enhance D2D communications in a macro and small cell topology. The joint sub-band and resource block (RB) allocation problem is solved with respect to the QoS levels and D2D interference minimization. The proposed scheme significantly alleviates co-tier and cross-tier interference, compared to traditional techniques. On the other hand, the proposed algorithm introduces delays that could cause difficulties in the deployment of the scheme in real-world scenarios. In the same context, the authors in [61] examine the influence of inter-cell interference in the design of effective RRM strategies. More specifically, they formulated an EE maximization RRM problem for a downlink OFDMA HetNet, and solved it via a two-step generic algorithm. The first step concerned subcarrier allocation under SE requirements, while the second concerned power management. Simulation results indicated that a trade-off between EE and total achieved throughput should exist, proposing small cell deployment as a way to simultaneously improve both factors.

Xu et al. propose in [62] a resource allocation scheme to maximize the system throughput, by considering cross-tier and co-tier interference for macrocell users, as well as the transmission power in HetNets. The proposed scheme uses a nonlinear programming formula, solved by distributed Lagrange dual methods. This method results in interference limitation for the users spread in the topology. However, the adopted approach involves many iterations, thus leading to increased overall delays.

In [63], a joint RRM problem is investigated in an mmWave environment and solved sequentially as two sub-problems. The first one is related to beam selection (beamforming), while the second one to power allocation. These problems are formulated as mixed integer nonlinear programming (MINLP) problems. The authors solve the first problem using cooperative game theory. In this way, optimal beam allocation is achieved and serves as input to the second problem, where the power allocation scheme is determined, employing Lagrange duality and an iterative water-filling algorithm. According to the presented results, there are significant throughput improvements, compared to classic RRM schemes. On the other hand, computational complexity increases substantially, reaching almost prohibitive levels.

In [64], a similar joint routing and resource allocation problem is investigated, considering a multi-tier analysis approach for mmWave systems. Resource allocation concerns the physical layer, while path selection concerns the network layer. A stochastic algorithm is used for RRM and a linear programming one for the path selection. The EE and the overall system throughput are significantly improved, compared to state-of-the-art algorithms. However, several delay factors are introduced, due to the adopted cross-layer approach. Therefore, this scheme might be inappropriate when dealing with URLLC demands in emergency situations.

Another significant metric that originates from throughput is SE, which is the "clear" information that can be transmitted over a specific spectrum area in a wireless environment. In this context, the authors in [65] propose a resource allocation system, based on SE requirements. They make use of a hybrid-clustering game algorithm that mitigates co-tier and cross-tier interference. The clustering problem is solved using graph theory, and more specifically a maximum K-cut algorithm in the interference graph of the topology. Then, inside each cluster, resources are allocated to users by means of an auction game mechanism. According to the presented results, there are significant improvements, compared to state-of-the-art approaches, in terms of SE and throughput. However, we should mention that, in the above scheme, both macro and micro-cell users are treated as one entity. In this case, the QoS and QoE metrics are not taken into consideration.

In ultra-dense modern era networks, power consumption becomes a key issue. Thus, the metric of EE is used to measure the power consumption in the topology [66]. In this context, a complex scheme is proposed in [67] that jointly maximizes EE and SE. There are three different components in the proposed scheme. The first one is a system to balance the load between the serving BS and other BSs in the topology, along with handover management. The second one aims to manage inter and intra-cell interference and frequency reuse. Finally, the third one applies a proportional fairness (PF) allocation policy to guarantee fairness among users. A binary search algorithm implements the resource allocation, maximizing EE and SE. Therefore, this approach is beneficial for commercial use cases, due to the fast decision-making mechanism, leading to optimal solutions. However, the fully centralized nature of the algorithm might increase overhead, due to the increased round-trip time.

Another key issue in future networks is the limitation of usable resources to tackle the spectrum scarcity problem. Dynamic spectrum sharing is proposed as a novel method for the cooperation between 4G-LTE and 5G technologies, as different spectrum resources can be allocated, based on users' demands, establishing improved SE levels and
spectrum utilization. The authors in [68] proposed a dual bargaining game model to solve the spectrum sharing problem and guarantee effective real-time collaboration between LTE and 5G systems. Results indicated that this scheme improves total throughput and service failure by 5-10% compared to traditional approaches.

Furthermore, the increased traffic load from mobile devices, which causes the densification of wireless networks, empowered the deployment of revolutionary centralized alternatives to the classical cellular architectures, such as Cloud RAN (CRAN) and O-RAN. In CRAN architectures the baseband processing unit (BBU) is moved from the BSs onto a centralized cloud/edge BBU pool, while O-RAN intends to provide open air interfaces and separate user and control plane functions. The authors in [69] proposed a two-stage optimization algorithm for the joint secondary user selection, spectrum allocation and time scheduling problem of downlink transmission in CRAN. Results indicated that improved data rates, time scheduling and prioritization for big data transmissions can be achieved using the above scheme. Concerning O-RAN, the authors in [70] implemented a mixed-integer linear algorithm to solve the joint distributed unit and subcarrier allocation problem, with respect to energy and latency minimization for delay-sensitive communications. Results indicate that the proposed approach consumes less energy under a larger network size, compared to a disjoint scheme.

C. LIMITATIONS OF NON-ML APPROACHES
In the previous sub-section, significant non-ML approaches concerning RRM in 5G and B5G networks are presented, where various sub-optimal solutions are proposed, due to the multiparameter nature of the problem. However, focusing on the outcomes and results of those research efforts, several limitations can be witnessed. In most cases of LTE and early 5G networks [56], [58], [59], [64], the enactment of the RRM policy was based on perfect knowledge of specific parameters, such as the instantaneous CSI and QoS requirements of the active users. Thus, the optimal allocation problem, described in the above paragraphs, is solved through optimization procedures. However, it is also apparent from the problem formulation that, in practical wireless orientations, multiple difficulties may arise, thus making resource allocation a multidimensional problem. More specifically:
• Most of the non-ML techniques provide solutions which are not universal. Optimal solutions are highly correlated to the current circumstances in each network's topology, user demands and qualifications. Thus, RRM, in general, is a problem characterized by non-conventionality [71].
• The provided solutions may not be obtainable in real time. HetNets and IoT networks have high levels of time variability. An optimal solution in a time slot or interval is not by default optimal for the next time unit [63], [64].
• The wireless channel in 5G and B5G networks is defined by an extremely high propagation scheme, with users characterized by random or partially unknown mobility patterns. In these scenarios, the mathematical formulation of the problem is arduous and, in general, not easily defined [67].

According to these considerations, more efficient RRM solutions should be implemented from both a computational and a performance perspective. In this framework, ML-based resource allocation algorithms are proposed in the literature as an efficient way to deal with the abovementioned limitations. In the following paragraphs, after introducing the different types of ML, we present the state of research in the field and draw guidelines and considerations for future work.

III. ML ALGORITHMS IN 5G/B5G SYSTEMS FOR RRM OPTIMIZATION
A promising direction to tackle the challenges we highlighted in the previous sections is the deployment of ML [72], [73] in order to formulate a data-driven framework in wireless communications' RRM. AI/ML technologies are and will be used extensively in the 5G/B5G communications era, both in the CN and the RAN part of the 5G (6G) environment. In this direction, network slicing and traffic management, which enable improved network performance and reliability, are two representative problem cases of AI-assisted solutions [23]. However, the reported research in the field has mainly focused on the CN, in order to deal with the routing problem or to propose efficient network slicing implementations. In general, fewer research efforts are reported on traffic control or RAN. Moreover, for traffic control, until now the reported research has only focused on the network layer, with only a few research reports on the application of AI technologies to the physical, application or semantic layer.

In the following subparagraphs, the related research concerning the use of AI/ML in RRM is presented, classified in terms of type of learning and architecture (centralized vs distributed). The performance of the used models is also discussed, and conclusions are drawn upon them.

Finally, in order to present and discuss the existing literature concerning the use of ML in resource allocation in 5G and B5G networks, we first introduce in sub-section III.A the classification of ML algorithms, in terms of the type of data they process (labeled or unlabeled), as well as in terms of the corresponding mechanisms (see also Fig. 4).

A. TYPES OF MACHINE LEARNING
Supervised learning is based on a dataset with values accompanied by their respective labels. These labels can be produced either by humans or automatically by computation [23]. A common practice to deal with the dataset is to split it into a training and a test set, where the first one is used for model training. In other words, a mapping between the inputs and the labels is produced. The most indicative use cases of supervised learning are classification or regression problems. The latter term refers to the prediction of a target numerical value, given a set of features/attributes, also called predictors, through an estimation function. In linear regression the estimation function is linear, while in logistic
regression it is a common sigmoid. Classification refers to the prediction of a class label, by using classified example data as input. The basic difference, compared to regression techniques, is that the model displays the probability that a certain value belongs to a given class [73]. The system is trained by multiple examples of a class, along with their labels, in order to learn how to classify new instances. The ML techniques/algorithms that are mostly used in RRM-related problems are briefly presented below and will be reported again in section IV, where the corresponding literature is analyzed in detail.
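
The difference between a linear and a sigmoid estimation function, as described above, can be illustrated with a minimal Python sketch. The feature names (SNIR in dB, allocated bandwidth in MHz), the weights and the bias are hypothetical values chosen only for illustration; they are not taken from any of the surveyed works.

```python
import math


def linear_estimate(features, weights, bias):
    """Linear regression estimate: weighted sum of the predictors plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def logistic_estimate(features, weights, bias):
    """Logistic regression estimate: the same weighted sum passed through a sigmoid,
    so the output lies in (0, 1) and can be read as a class probability."""
    z = linear_estimate(features, weights, bias)
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, weights, bias, threshold=0.5):
    """Turn the probability into a class label (1 or 0) using a decision threshold."""
    return 1 if logistic_estimate(features, weights, bias) >= threshold else 0


if __name__ == "__main__":
    # Hypothetical predictors: [SNIR in dB, allocated bandwidth in MHz].
    sample = [10.0, 20.0]
    # Hypothetical throughput regressor (numerical target).
    print(linear_estimate(sample, weights=[0.5, 0.25], bias=1.0))
    # Hypothetical "QoS satisfied" classifier (probability, then label).
    print(logistic_estimate(sample, weights=[0.3, -0.1], bias=-0.5))
    print(classify(sample, weights=[0.3, -0.1], bias=-0.5))
```

In practice the weights would be fitted on the training set and evaluated on the test set; here they are fixed by hand to keep the sketch short.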
A k-NN algorithm classifies an instance by comparing the labels of its k nearest neighbors; the instance is then assigned the most common label among them [74], [75]. On the other hand, Support Vector Machines (SVMs) are used for both classification and regression. Data are plotted as points in an n-dimensional space, where n is the number of features of the dataset, and classified by finding the hyperplane which differentiates the problem's classes in an optimal way [76]. Decision trees can be used either for regression or classification purposes. However, traditional decision tree approaches record high variance levels, due to their sensitivity to training data. To mitigate this problem, alternative approaches are implemented. For instance, bagging tree classifiers use bootstrap sampling to generate reliable results [77]. A major category of supervised learning techniques is artificial neural networks (ANNs). These learning algorithms are inspired by the brain, in order to simulate, predict or store information. Their basic building units are neurons and the connections between them, which formulate the model. ANNs are used both in regression and classification problems.
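
As an illustration of the k-NN rule described above, the following minimal Python sketch classifies a query point by majority vote among its k nearest training points. The data set (SNIR in dB and normalized cell load, labeled "good" or "poor" link quality) is entirely hypothetical, and Euclidean distance on unscaled features is an assumption made for brevity; real features would normally be normalized first.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Return the most common label among the k training points closest to query."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k nearest neighbors and take a majority vote.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training set: (SNIR in dB, normalized cell load).
POINTS = [(20.0, 0.2), (18.0, 0.3), (19.0, 0.1), (3.0, 0.9), (5.0, 0.8), (4.0, 0.7)]
LABELS = ["good", "good", "good", "poor", "poor", "poor"]

if __name__ == "__main__":
    print(knn_classify(POINTS, LABELS, (17.0, 0.25), k=3))
    print(knn_classify(POINTS, LABELS, (6.0, 0.85), k=3))
```

The choice of k trades off sensitivity to noise (small k) against blurring of class boundaries (large k); an odd k avoids ties in two-class problems.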
Furthermore, overfitting and underfitting should be checked each time a model is formed, to avoid introducing errors that leave the model unable to properly capture the attributes of the tested dataset. Underfitting occurs when the model is not able to obtain a low error on the training set [78]. This means that the model cannot describe all the characteristics in the dataset. On the other hand, overfitting takes place when a significant difference between the errors in training and implementation (training set vis-a-vis test set) is detected [79]. This means that the model describes more characteristics than the actual ones.

Unsupervised Learning differs from supervised learning (see Fig. 4b), as the model itself tries to identify the common characteristics of the dataset [23], [79]. Moreover, labels are not included in the dataset, as the system tries to find them without external help. However, the concept of training and test data remains the same. The only input to an unsupervised model is the number of clusters or characteristics to be mined. By the term clusters, we refer to the distinctive groups into which the dataset is divided.

Finally, Reinforcement Learning uses a learning entity, often called an agent. Agents act as representatives of the system in its interaction with the environment. The feedback that the agent obtains from this interaction is called a reward (positive case) or a penalty (negative case). In that way, the agent creates a policy to set up its own learning scheme and decide which actions to choose in a certain situation. The aim of the RL task is to maximize the reward over time [78].

FIGURE 4. Types of learning: (a) Supervised learning, (b) Unsupervised learning, (c) Reinforcement learning.

B. SUPERVISED LEARNING
The authors in [80] consider a SON topology. A 5G network simulator is proposed, along with a pathloss model, using metrics such as SNIR and throughput (LTE KPIs), in order to deal with the problem of dynamic frequency and bandwidth allocation in these topologies. The system is tested in several frequencies and bandwidths. In order to set the RRM policy and predict the KPIs, several ML methods, such as bagging trees, boosted trees, SVMs and linear regressors, are evaluated. Bagging tree prediction shows the best overall performance. The main feature of this method is that it uses bootstrap sampling in deep decision trees, in order to reduce the variance of the model and classify data correctly to predict the network's KPIs. According to the derived results, the decision tree learning-based method reaches 95% of the optimal network performance. Finally, the authors highlight the
necessity for a joint consideration of networks' KPIs and ML performance metrics.

Working also on KPI prediction, the authors in [81] design a predictive model for the overall users' demand. Then, they use an ML-based supervised classifier to allocate the network resources dynamically (Network Resource Allocator). The employed metrics are bandwidth, latency, jitter times, QoS and QoE. The decision process for data traffic and allocated subcarriers is defined by QoS and QoE. The learning procedure is based on previously gathered experience from offline measurements. Thus, the proposed Network Resource Allocator empowers an automated, flexible and elastic network. The models are employed in the network's controller in order to change the network topology for better traffic management, by removing the unused parts of the network to release its unused resources (i.e., subcarriers, unused links, etc.).

In m-MIMO systems, hundreds of antennas are used for detection, resource allocation and channel estimation (via the channel coefficient matrix). In [82], an SVM scheme is proposed for the estimation of the Gaussian channel's noise level and pathloss prediction in urban outdoor environments. The general form of the problem has t transmitting MIMO antennas and r receiving ones. The model predicts the channel noise statistics, according to which the allocation and multi-tier QoS scheme will act for each independent user or users' category. Three kernel techniques are investigated (Polynomial, Gaussian and Laplacian) and compared to the Okumura-Hatta pathloss model and an ML-based ANN one. The Laplacian SVM shows the best performance, with respect to both pathloss prediction and computational complexity. The overall satisfactory performance of the SVM approach is due to the use of multi-dimensional representations in feature extraction, leading, thus, to reduced training time and increased capacity. The ANN's performance is similar to that of the SVM approach, although it needs longer training times, as multiple initializations are required.

Considering DL approaches, Liu et al. propose in [83] an ANN algorithm for channel learning, to mine undiscovered channel information data from a 5G network. They use location features and CSI and they produce channel samples from 5G simulators, which are later used as training data for the model. The channel ANN estimation algorithm calculates unseen aspects of the channel approximation and resource allocation scheme. The prediction accuracy improves, compared to traditional k-NN classifiers. It remains, though, limited to a level of 75%, but could be further increased by approximately 3%, if geographical information is used in the dataset.

Zhang et al. [84] build a deep NN (DNN)-based framework for user, subchannel and power control in NOMA mmWave networks. The solution of the user association problem is given by the Lagrange dual decomposition. The subchannel and PRB allocation is given by a semi-supervised learning algorithm, while the power allocation is given by a DNN model. The use of the described joint ML-based component (for user, subcarrier and power control) delimits the entire decision-making policy in terms of RRM. According to the presented results, the EE of the system is significantly improved, while the resource allocation reaches optimal levels (98% accuracy).

Guerra-Gómez et al. [85] propose a dynamic resource management scheme, based on the prediction of the total system's capacity. They use three different ML algorithms: SVM, DNN, and LSTM. According to the presented results, the scheme can perfectly reduce the underutilized resources; however, QoS levels are not optimized. Therefore, the authors propose two novel strategies. The first one considers data pre-filtering and results in an additional 2% minimization of unallocated resources. The second one considers error shifting and leads to an additional 3% reduction in unallocated resources. However, the achieved QoS levels form a barrier in this approach.

The authors in [86] consider the problem of optimal and automatic BS selection in LTE and 5G environments. They propose two ML-based classification solutions to satisfy QoS requirements; the first one uses SVMs and the second one Random Forest. Both approaches are compared to a non-ML BS selection approach. The results indicate that the ML-based BS selections can improve throughput and decrease outage probability and delay. Specifically, for a 50-user topology, ML approaches achieve 23.21% higher throughput levels, 70% lower packet loss ratio and 48% lower delays compared with a non-ML approach.

In the same framework, Butt et al. [87] investigate the UE positioning problem in 5G networks. The authors compare a decision tree classifier with two DNN solutions. The first one uses training data from the serving cell and performs better in terms of accuracy, while the second one uses transformed data from the cell and its neighboring ones. In general, the DNN solutions show an overall near-optimal performance, in terms of accurate positioning of UEs. In fact, the 2-hidden layer DNN showed a positioning error in the range of 1-1.5 m, after appropriate feature selection.

C. UNSUPERVISED LEARNING
Song et al. [88] produce a realistic 5G V2V network simulator, with the presence of RNs. A k-Means clustering algorithm is responsible for implementing BS or RN selection, user allocation and serving policy. User positioning and RN distribution in the topology are performed via ML, in a way that the serving device, BS or RN, is optimally selected. However, the model calculates every 2D distance from the observation point (in that case the UE) to the borders of each cluster and not to the cluster center. Thus, the overall communication environment parameters are not taken into consideration. Moreover, since the proposed k-Means algorithm is a generic unsupervised ML method for clustering, its performance can be affected if UEs have a complex spatial distribution or clustering is performed in different topologies. However, the authors intend to further improve and configure the algorithm, to define a more efficient selection strategy.

The authors in [89] propose a data-based resource allocation scheme, where an ML technique of affinity propagation
is used. In general, this approach uses graph theory to perform clustering. The basic advantage of the proposed algorithm is that it does not require the number of clusters as input. In this way, knowledge and behavior extraction can be made even under complex scenarios. The authors conclude that the data-driven nature of the RRM policy improves both the system's EE and throughput, although, in some cases, the QoS levels are not the desired ones.

Wang et al. propose in [90] an asynchronous resource allocation scheme, based on aggregation graph NNs (Agg-GNN). In this approach, every BS or RN aggregates information from its active neighbors with a certain delay. Thus, both the underlying network structure and the system's asynchrony are incorporated. According to the presented results, this approach outperforms heuristic ones, in terms of the total system's capacity. The presented simulations, though, used only a small number of active UEs in the topology. Probably, in more complex environments, GNNs' training time might increase, and, thus, performance might deteriorate.

In [91], the authors propose an integrated scheme for resource management in NOMA environments. The first stage of the algorithm refers to the users' grouping and subcarrier allocation, while the second one to the power control. UEs are grouped via the k-Means method, while subcarrier allocation and cluster definition are calculated using the F-test method [92]. Power assignment is performed for the allocated subcarriers, by formulating a convex optimization problem. The presented results indicate that the proposed approach reduces electromagnetic exposure and increases the total served users. Although in this approach single antenna configurations are used, both in the BSs and the UEs, the authors are aiming to extend their work to MIMO systems.

D. REINFORCEMENT LEARNING
Alnwaimi et al. used RL in [93] to increase spectrum accessibility in FCs. The proposed scheme identifies the available spectrum opportunities; then, it selects subchannels, so that they operate avoiding intra/inter-tier interference and meet certain QoS requirements. A key aspect of this approach is that the considered method reaches optimal levels, in terms of sub-carrier allocation, even in tiny cell topologies. The basic contribution of this approach is the reduced convergence time and the fast decision-making procedure. However, these come at the cost of reduced accuracy, which is now limited to 75%.

In [94], an RL-based algorithm chooses the frequency channel and determines whether to change its location in the presence of jamming and strong interference. A Q-learning algorithm determines the above decision, while a deep CNN accelerates the channel feature extraction. The scheme operates extremely well for huge channel numbers, in terms of interference mitigation, and increases SNR levels compared to a simple Q-learning system (without CNN).

The authors in [95] propose a deep RL framework for power control in 5G HetNets. The problem is formulated aiming to minimize the difference between the mobile users' allocated and requested throughput, by adjusting the transmitted power of the macro-BS or RN. According to the presented results, the proposed approach reaches optimal levels of users' satisfaction, based on achieved throughput, compared to traditional water-filling [96] and weighted minimum mean squared error (WMMSE) approaches [97]. However, as expected, the difference between user demands and allocated throughput increases as the user requirements grow.

The authors in [98] propose a distributed multi-agent deep RL (MARL) framework for joint user and power allocation in a dense wireless network. The data are generated by real measurements and backhaul delays. The results, via simulations in dense wireless networks, indicate that the scheme achieves a tradeoff between sum-rate and 5th percentile rate, similar to centralized scheduling algorithms. The authors intend to verify the performance of the RL scheme in real-world scenarios in the future.

The authors in [32] use QoS as the basic metric in an ML-based resource allocation scheme. An RL (Q-learning) algorithm is used for the radio access technology (RAT), while the actual RRM is developed employing the Monte Carlo tree search (MCTS)-based Q-learning algorithm. The authors prove that optimization is achieved after a reasonable number of searches and that it outperforms other scheduling methods, with respect to the system throughput and resource utilization. However, the computational complexity is increased, due to the exhaustive use of the MCTS method. This could be a disadvantage in real case scenarios.

Moreover, RL methods are utilized [33] in order to minimize the total transmission power in HetNets, while jointly satisfying the bit rate requirements of different UEs. Every UE can be connected to one of the available BSs or to another UE, which acts as an RN. The authors use Q-learning in the decision-making procedure. The proposed algorithm reaches optimal levels, in terms of the resource allocation. In addition, the decentralized nature of the algorithm constitutes a very promising approach with future extensions, as it uses specific UEs as BS/RNs.

RL methods have also been used in 5G satellite communications to efficiently perform RRM-related tasks. More specifically, the authors in [99] propose an intelligent RL wireless channel allocation algorithm for 5G m-MIMO High Altitude Platform Station (HAPS) networks, based on Q-learning and back-propagation NNs. The entire network is trained using the Q-learning model, while CSI is collected in the platform through real-time agent interaction with the environment, thus updating the Q-algorithm using a back-propagation NN. Results indicated that, even if the number of agents is very high, the channel allocation accuracy levels remain high (over 75%).

E. DISTRIBUTED TECHNOLOGIES AND ML
As already pointed out in the previous sub-sections, an important bottleneck in 5G networks is data overload, in conjunction with the limited storage and computational power of UEs and BSs. A recently proposed solution is to use distributed structures for processing reasons (Fig. 5). In wireless
networks, this is mostly achieved via MEC architectures, where cloud, edge and mobile processing cooperate [9]. MEC and ML are inextricably related concepts. MEC, being a distributed approach, uses ML tools in heterogeneous topologies (such as 5G and 6G networks) to obtain CSI up to the network's edges, in order to define the resource allocation policy in each case. The goal of MEC is to minimize the computation time, by allocating the traffic to different processing units.

FIGURE 5. MEC implementation.

In that case, as described in [100], the processing time with MEC is lower than the corresponding processing time without MEC. If user n sends a computation task j to a MEC device m, then the total MEC latency is given by the transmission time of task j from user n to the processing unit m, plus the user delay to process the task, plus the execution time in the MEC device [101].

Focusing on MEC technologies in RRM, the authors in [101] present the state-of-the-art on the employment of MEC networks, focusing on architecture, caching, computation and use of ML-based schemes. In general, caching refers to the temporary storage of content (CSI in RRM-related tasks) in centralized or decentralized databases, for future access. The reasoning behind such storage is that an instance (i.e., a D2D communication in RRM) that has occurred once is very likely to occur again in the future. In MEC systems, these techniques are commonly used for decision making and allocation of available resources. For example, the authors in [102] reach a 10-11% lower

beamforming and caching issues can be jointly encountered. Related works in this field use DL models, such as ANNs, for accurate computations. Such efforts are described in [104] and [105], considering decentralized hybrid beamforming in 5G next generation node BSs (gNodeBs). The proposed novel techniques (CNN frameworks in both [104] and [105]) outperformed state-of-the-art optimization-based and greedy-based algorithms, both in terms of SE and computational complexity.

A synergy of MEC and ML is also achieved through federated learning (FL). In contrast to centralized ML methods, where local data (from UEs in 5G/B5G networks) are uploaded to a centralized server, and also in contrast to classical distributed approaches, where data are uniformly distributed among the edge devices, FL schemes use local data to train a global model, through multiple training iterations across interconnected edge devices (UEs), in order to achieve the desired global accuracy. Then, local updates, generated by each interconnected device, are aggregated to a cloud or a MEC server (in BSs) (Fig. 6). The required accuracy is achieved by multiple communication rounds between the server and the edge devices, which train the model with their local datasets. Thus, the total training time is a key aspect in FL model design [106], [107]. The main reason that renders FL implementation an efficient method in distributed computation problems is the privacy and security that is achieved through the local training of the model and the secure aggregation to the server entities. However, in RRM-related tasks, active UEs or edge devices have different processing power, antenna characteristics and mobility patterns, leading, thus, to heterogeneity in local datasets. More specifically, the data generated in each UE contain different labels and/or features and are not of the same volume. Such data are called non-independent and identically distributed (non-IID) [106]. Therefore, the purpose of implementing FL schemes in RRM (i.e., resource allocation, latency minimization) is, also, to address the aforementioned heterogeneity and, in that way, improve the accuracy of the global model [108].

In this framework, the authors in [108] propose a UE scheduling method in an FL-assisted wireless network, based on the joint quality of channel and learning optimization. When wireless resources are limited, this method improves the overall training time, compared to traditional ones. However, the model's accuracy decreases in an environment with powerful resources, due to data overload.

To deal with the problem of training latency in different topologies of the network, the authors in [109] consider joint optimization for user selection, frequency and transmit power
latency and improvements in QoE, compared to non-caching allocation, using the Majorize – Minimization algorithm and
schemes. The authors in [103] propose an efficient content phase shifting, by employing semidefinite relaxation and
caching policy for edge using dynamic ML predictions. The Gaussian randomization, to reduce the training time of the
proposed Long-Short-term Memory approach provided 30% FL wireless system.
higher caching ratio, than conventional approaches. Concerning distributed computation and MEC employ-
MEC and ML are combined in complex optimization ment in 5G/B5G networks, the classical hierarchical structure
problems, as well. In this context, resource allocation, of a cellular network is proposed to change in order to become

VOLUME 10, 2022 83519


I. A. Bartsiokas et al.: ML-Based Radio Resource Management in 5G and Beyond Networks: A Survey

more flexible and decentralized. In this framework, O-RAN and CRAN architectures, analyzed in Section II, are about to efficiently satisfy the joint requirements of increased throughput levels with respect to QoS and QoE standards, and also the concept of low-energy green networks. With respect to the aforementioned considerations, the authors in [110] proposed a deep Q-learning framework in CRAN to maximize EE subject to the constraints analyzed in Section II-A. As previously stated, the Q-learning method uses past learning experience to predict future effects and make reward/penalty decisions. However, sometimes action overestimation generates lower probability limits for the maximum Q-value. With the use of a double Q-learning model, the target Q-value generation led to bigger energy savings, whereas numerical evaluation indicated a 22% reduction and an improvement in EE at the same rate. Considering an O-RAN architecture, the authors in [111] propose an RL-based RRM solution and deploy it in the ecosystem. The O-RAN Distributed Unit periodically sends reports to the O-RAN Interface, and a dynamic per-flow resource allocation strategy is employed to set the modulation and coding scheme according to KPI requirements.
FIGURE 6. FL in 5G/B5G networks.

F. SUMMARY – COMMENTS
Table 2 summarizes the usage of ML in 5G/B5G RRM problems and groups accordingly the research papers presented in sub-sections A to D.
As already stated in Section III and verified by Table 2, Supervised Learning techniques are mainly used for prediction purposes. Indeed, various network KPIs (throughput, SNIR, pathloss) can be effectively predicted, in order to empower allocation strategies [80]–[82]. DL methods, due to their ability to mine deep data and label associations through multiple complex hidden layers (ANNs, DNNs, CNNs), are mainly used in user, subcarrier, power allocation and CSI prediction tasks [86], [87]. The multiparameter nature of the RRM problem and the complex channel feature associations render DL approaches the most efficient way to deal with the total RRM problem [83]–[85].
On the other hand, Unsupervised Learning focuses, in general, on clustering: the corresponding models are efficient in user grouping, BS or RN selection and QoS levels formulation, concerning RRM tasks [88]–[92].
RL models, like DL ones, are more efficient in dealing with the NP-hard problem of the overall resource allocation. In this framework, RL approaches, such as Q-learning, are proposed by researchers in joint user, subcarrier allocation and energy consumption minimization problems [93]–[99].
Finally, MEC and FL methods, which refer to the most recent evolution in the field, are proposed to face the challenging issues of training time minimization, latency minimization and computational resources optimization [102]–[111].
From the above analysis and Table 2, a categorization of the best performing ML algorithms for each RRM-related sub-problem is visible. As presented in Section II, the NP-hardness of the joint subcarrier allocation and power control with respect to QoS and QoE constraints has led recent research efforts to deploy more intelligent solutions (DL, RL, FL methods), which have the ability to communicate with the cellular environment and change their predictions and decisions based on the current conditions. However, the existence of big data in transmission systems and wireless networks necessitates the utilization of classical ML approaches, such as supervised ones, specifically in order to tackle problems where the knowledge of a KPI and/or CSI is vital for low latency responses and fast decision making (e.g., for coding and/or modulation scheme selection in each timeslot).
Despite the growing activity on ML usage in resource allocation, the existence of several limitations and open issues, which will be analyzed in Section V, motivates further research.

IV. SIMULATIONS AND COMPARISON
In this section, the performance of various ML algorithms for KPI prediction is presented. The investigated ML algorithms have been selected based on two criteria. The first criterion is their ability to satisfactorily solve the KPI prediction problem. This means that we have selected algorithms with performance scores over 75%. The second criterion is the usage of these algorithms in RRM-related KPI prediction tasks in 5G/B5G networks, according to the literature presented in the previous sections (i.e., [79]–[81], [89]–[91]). More specifically, using the Lumos5G dataset [112], the problem of throughput prediction is investigated (Lumos5G features are summarized in Table 3). The dataset contains 68,118 observations of 19 features, concerning UEs' location and mobility parameters, such as longitude, latitude, UE speed and direction, UE-BS distance and corresponding angles, as well as network related ones, such as network status
(connected or not), CSI parameters (Received Signal Strength Indicator – RSSI, Reference Signal Received Power – RSRP, Reference Signal Received Quality – RSRQ, SNIR), and signal strength, derived by real-world experiments and statistical analysis. The measured downlink throughput acts as the response variable. Throughput prediction is formulated either as a classification or as a regression problem. On the one hand, classification refers to the prediction of the throughput level received by each active UE, given the dataset features. An effective solution of this problem can be valuable in a variety of RRM-related tasks, such as the definition of modulation levels.
FIGURE 7. DNN's architecture.
TABLE 3. Lumos5G features.
On the other hand, regression refers to the prediction of the actual expected value of the metric (throughput in our case). The information gathered by the regression task can be valuable in RRM decision tasks, such as subcarrier and/or power allocation, via the prediction of the values for the next timeslots.
Considering throughput prediction as a classification problem, two different approaches are considered in our analysis. The first one concerns three preselected throughput levels (3 classes):
• Level 0 – low throughput: from 0 to 300 Mbps,
• Level 1 – medium throughput: from 300 to 500 Mbps, and
• Level 2 – high throughput: above 500 Mbps.
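As an illustration of the three-level labelling above, the following minimal Python sketch maps a measured downlink throughput to its level index. It is not the code used in our simulations. The function names, the example values and the rule that a value lying exactly on a limit (300 or 500 Mbps) is assigned to the lower level are assumptions made for this illustration, since the level definitions do not state how the limits themselves are handled.

```python
from bisect import bisect_left
from collections import Counter

# Upper limits (Mbps) of every level except the last one, as listed above.
THREE_CLASS_EDGES = (300.0, 500.0)


def throughput_level(mbps, edges=THREE_CLASS_EDGES):
    """Return the 0-based level index for a downlink throughput in Mbps.

    Assumption: a value exactly equal to a limit falls in the lower level.
    """
    if mbps < 0:
        raise ValueError("throughput must be non-negative")
    # bisect_left counts how many limits are strictly below the value.
    return bisect_left(edges, mbps)


def level_counts(samples, edges=THREE_CLASS_EDGES):
    """Count the samples per level, e.g. to check how balanced the classes are."""
    counts = Counter(throughput_level(s, edges) for s in samples)
    return [counts.get(level, 0) for level in range(len(edges) + 1)]


# Hypothetical throughput values for illustration only (not taken from Lumos5G).
example = [12.5, 299.0, 300.0, 410.0, 500.0, 750.0, 1200.0]
labels = [throughput_level(x) for x in example]
```

Since the limits are passed as a parameter, a different set of levels can be obtained without changing the function, for instance a single limit for a two-level labelling.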
However, due to the small amount of data in the second class of the previous approach, we also consider an alternative approach with two preselected throughput levels (2 classes):
• Level 0 – low throughput: from 0 to 300 Mbps,
• Level 1 – medium throughput: above 300 Mbps.
The level limits presented above, in both the 2-class and the 3-class approach, were derived from extensive statistical analysis of the dataset, with the goal of including a satisfactory number of samples in each investigated class. Thus, we examine four distinct ML-based algorithms:
• FFNN: A Feedforward NN with 100 hidden layers, rectified linear unit (ReLU) activation function and optimized hyperparameters,
• k-NN: A k-NN-based classifier using 2 neighbors and the Chebyshev distance criterion,
• SVMs: Two SVM models, one using a polynomial and another using a Gaussian kernel, and
• DNN: A Deep NN with a feature input layer (using the 19 features of the dataset) and z-score normalization, a fully connected layer with a 19 × 50 weight matrix and a 50-element vector output, a 50-channel batch-normalization layer, a ReLU layer with a 50-element vector output, a second fully connected layer with 3 or 2 neurons (3-class and 2-class problem, respectively), a 50 × 3 (3-class problem) or 50 × 2 (2-class problem) weight matrix and a 3-element/2-element vector output and, finally, a soft-maximization layer with a 3-element/2-element vector output. The overall DNN's structure for the 3-class problem is shown in Fig. 7. The DNN's structure for the 2-class problem is similar and differs only in the size of the last two layers (fully connected layer 2, soft-max layer).
In both of the abovementioned approaches, an 80%-20% training-test set split has been used, as well as a 10-fold cross validation procedure. The performance of the abovementioned classifiers is evaluated using the accuracy and F1-score metrics. Accuracy is the number of correct predictions divided by the total number of observations. In other words, accuracy is the sum of True Positive (TP) and True Negative (TN) predictions, divided by the total number of predictions (TP + TN + False Positive (FP) + False Negative (FN)). Then, the F1-score is given by the following formula:
$$F_1 = 2 \cdot \frac{\dfrac{TP}{TP+FP} \cdot \dfrac{TP}{TP+FN}}{\dfrac{TP}{TP+FP} + \dfrac{TP}{TP+FN}} \tag{6}$$
Table 4 summarizes the performance of the above models in the classification task (with two or three classes), based on classification accuracy and F1-score. The k-NN-based approach outperforms all the other approaches, achieving the best overall accuracy (0.87 and 0.90 with three and two classes, respectively). In general, supervised learning algorithms (such as k-NN) are the most appropriate ones for the prediction of network KPIs, as drawn from the existing
TABLE 2. Research work on ML techniques in 5G/B5G RRM.

literature, analyzed in subsection E of section III. However, concerning F1-score, the DNN has the best performance (0.81) in the 3-class problem, while k-NN has the best performance (0.90) in the 2-class one. As stated in previous paragraphs, DL algorithms, due to their multiple hidden layer architecture, capture unseen aspects of the dataset and, thus, their performance is satisfactory in the classification task. In this case, the preselected classes are imbalanced. Therefore, the F1 metric is more reliable, because it accounts explicitly for FP and FN, while accuracy rewards only TP and TN. It is also visible from Table 4 that, using only two classes, both accuracy and F1-score are improved. Moreover, with respect to the training time of each ML model, we observe that k-NN outperforms the other approaches, while the DNN approach reaches almost the same performance levels. Thus, these two ML methods are the most appropriate for the investigated problem from both a performance and a training time perspective. On the other hand, the FFNN approach has a significantly longer training time, even though its accuracy almost coincides with that of the best-performing algorithm.
TABLE 4. ML Classification algorithms comparison.
Figs. 8, 9 depict the comparison of selected state-of-the-art throughput classification approaches [113]–[115], with the previously presented evaluation analysis included as well. For each of the works [113]–[115], we pick the best performing ML algorithm, and we do the same for our evaluation approach, as far as the 3-class throughput prediction problem is concerned (i.e., the k-NN algorithm, see Table 4). As is apparent, our evaluation approach is consistent with similar approaches in other recent works [113]–[115].
Considering throughput prediction as a regression problem, the following algorithms are examined:
• Linear regression: A multi-linear regression model, using all 19 dataset features except throughput, which is the response variable,
• Binary Decision tree: A Gaussian binary decision tree designed for regression purposes, using auto-optimized hyperparameters,
• SVMs: Two SVM models, one using a polynomial and another using a Gaussian kernel,
• NN: A Feed Forward neural network with 100 hidden layers, a feature input layer with the 22 features of the dataset and z-score normalization, a 50×50 fully connected layer, a 50-channel batch-normalization layer, a ReLU layer, a soft-maximization layer and a regression layer, and
• LSTM: An LSTM neural network with a sequence input layer for the 22 features of the dataset, an LSTM layer with 125 hidden units, a fully connected layer and a regression layer.
Similarly to the investigation of the problem as a classification one, an 80%-20% training-test set split is used, as well as a 10-fold cross validation procedure. The performance of the abovementioned ML models is evaluated using the mean absolute error (MAE) and RMSE metrics. MAE is defined as the mean absolute difference between the actual and the predicted values of
the response variable (throughput), while RMSE is defined as the square root of the mean squared difference between the actual and predicted values.
TABLE 5. ML Regression algorithms comparison.
FIGURE 8. Classification models comparison: accuracy.
FIGURE 9. Classification models comparison: F1-score.
FIGURE 10. Regression models: MAE.
Table 5 and Figs. 10, 11 summarize the performance of the above models in the regression task, based on MAE and RMSE. The two best performing ML-based approaches are the Binary Tree regressor and the LSTM regressor, achieving the best overall MAE and RMSE performance (MAE/RMSE of 162/257 and 150/250, respectively). As in the previous case (classification problem), supervised and Deep learning algorithms are the most appropriate ones for the prediction of network KPIs as a regression problem. In fact, decision tree algorithms and linear regressors are designed for regression purposes. However, the NN model's performance is also highlighted, as it is the next best in both metrics (237 and 328, respectively).
Fig. 12 depicts the comparison of the state-of-the-art throughput prediction approach in [113] with our previously presented evaluation analysis for the regression problem. We pick the best performing regression ML algorithm of [114], and we do the same for our evaluation approach (i.e., the LSTM regressor, see Table 5). The comparison is conducted using RMSE as the metric. As is apparent, our evaluation approach is consistent with the approaches in other recent works [114].
To conclude, we observe that, in general, both our approaches and other recent works on the KPI prediction problem for 5G/B5G networks propose Supervised or DL models as the most appropriate tools for this type of problem, treated either as a classification or a regression one. On the one hand, supervised learning models (k-NN, SVMs, Random Forest) seem to have the best performance concerning training time. On the other hand, DL models (DNNs, LSTM) perform best when it comes to performance metrics, such as accuracy and F1-score for classification purposes or RMSE and MAE for regression ones.

V. DISCUSSION AND OPEN ISSUES
As already stated, the allocation of the available network resources is a multi-objective problem, due to the diverse nature of users' requirements, hardware evolution and demand for continuous connectivity. Despite the research progress presented in section IV, some open questions and practical challenges persist, requiring even more effort in the field of ML-based RRM, to reach its full potential. The critical issues that should be taken into consideration are highlighted below and summarized in Table 6.
1) 5G and B5G networks utilize ML-based algorithms to face the growing number of usage scenarios in access management. Therefore, ML performance metrics (such as RMSE for regression problems, accuracy for classification ones, etc.) should be examined
TABLE 6. Open issues and potential solutions concerning ML employment in RRM.

along with the network metrics (e.g., total network throughput, QoE, etc.) [80], [88]. Some approaches (e.g., [79], [85]) focus only on improving the ML metrics, without also evaluating the network metrics.
FIGURE 11. Regression models: RMSE.
FIGURE 12. Regression models comparison: RMSE.
2) Throughout this manuscript, we have presented the critical role that AI/ML plays in wireless networks and in IoT and heterogeneous topologies, in general. However, researchers should not overlook some practical limitations that exist in the implementation process of ML-based RRM strategies, i.e., when developing the corresponding ML model. More specifically:
• 5G datasets unavailability and/or poor quality: A key procedure for building ML models is the validation and training stage. 5G full deployment throughout the world was set for 2020, before the COVID-19 pandemic. Hence, 5G data from implemented networks have only recently started to be produced. The AI/ML models that have been produced until now use synthetic or incomplete data from past network generations [101]. Another aspect that also affects data quality is the fact that, in general, wireless network data are characterized by noise and inaccuracy. In fact, even well-established wireless network datasets, such as DeepMIMO [116], exhibit quality issues in a variety of RRM-related problems. We should also keep in mind that, due to the high-interference environment, huge datasets, including numerous features and observations, are required anyway. All the data-related limitations presented in this paragraph prevent ML models from reaching high levels of accuracy; lack of input leads to sub-optimal or non-optimal solutions. This consideration applies to every ML-based model, regardless of the type of learning. Supervised, unsupervised, reinforcement and deep learning approaches alike yield insufficient results when the quality of the input data is moderate.
• Learning difficulties due to channel complexity in multiuser environments: 5G wireless networks are characterized by multipath propagation in a high-interference environment. This, as stated previously, constitutes one of the reasons for the need for an enormous variety of features and channel observations in the construction of ML datasets for RRM (preferably Big Data). Hence, feature extraction for channel information becomes a demanding task. Linear models and generic algorithms (such as simple-tree models, regressions, etc.) are unable to provide optimal solutions concerning effective resource allocation. The approaches discussed in previous sections configure ML algorithms by varying hyperparameters and evaluate accuracy in the RRM sub-problems. In this context, performance and model selection policies are vital in ML-based approaches. Researchers should have deep knowledge of the ML models, pre-trained or not, so that they become able to correctly
evaluate them [117]. Concerning the complexity of the channel and the growing users' demand in 5G/B5G networks, DL methods are proposed as the most efficient ones.
• Computational complexity: In terms of accuracy, the AI/ML models discussed in previous sections have improved performance when used to solve complex problems based on network KPIs. Concerning the URLLC requirements and the demand for mass access to the medium in 5G/B5G networks, RRM decision making should be done with respect to computational complexity. However, the high-interference environment and random mobility patterns of UEs act in the opposite direction. Thus, ML techniques should succeed in proposing a trade-off between the solution's accuracy and computation requirements [82], [90], [98]. Even though DL solutions are proposed as the most efficient ones, they increase computational complexity by employing multiple hidden layers to yield accurate results. In this respect, distributed approaches using MEC architectures and FL-based algorithms should be considered. Taking also into account the requirement for energy efficient networks, researchers should keep the computational cost at tolerable levels [83], [88].
3) Power consumption rapidly increases in 5G, and will further increase in B5G networks, compared to previous generations, due to the users' growing demands for continuous access to enhanced services and applications. ML schemes, if effectively implemented, contribute to power savings, as they are expected to lead eventually to fast and more accurate RRM decision-making. For further energy consumption mitigation, we should incorporate energy-efficient technologies during the models' training phase, where additional computational resources are needed. In this direction, Green AI techniques and distributed processing methods (such as MEC) should be further investigated, so that less energy-intensive solutions become feasible [38].

VI. CONCLUDING REMARKS
This article presents a state-of-the-art analysis concerning the deployment of ML-based approaches in the context of efficient RRM in 5G/B5G wireless networks. A categorization of these approaches, based on the type of learning, is provided, in order to point out which ML algorithms should be used in different RRM sub-problems (e.g., unsupervised clustering algorithms in RN selection, DNNs in subcarrier allocation and power management, etc.). Moreover, we emphasize the need for cooperation and coexistence between AI/ML-based RRM and distributed approaches, due to the multiparameter nature of the problem, by presenting MEC and FL as possible solutions, which improve a variety of network and user KPIs. Based on the above, we conclude that ML-enabled approaches can overcome limitations that existing (non-ML) approaches could not, such as non conventionality and real-time integration.
Furthermore, we highlight the open issues and limitations of ML-based RRM and, thus, propose guidelines for other research efforts in the field. In this context, we point out that the unavailability or poor quality of 5G datasets, the complex channel and high levels of interference, computational complexity and increased energy consumption are of utmost importance in the process of building AI/ML models.
Finally, in order to demonstrate the effectiveness of ML-based RRM and of ML algorithms in various RRM sub-problems, we investigate via simulations the problem of throughput prediction, treated either as a regression or as a classification one. According to the presented results, supervised learning approaches (k-NNs, decision trees, etc.) perform best in terms of training time, while DL ones perform best in terms of ML performance KPIs (accuracy, F1-score, RMSE, MAE). Results evaluation is consistent with other state-of-the-art approaches.

REFERENCES
[1] C.-X. Wang, F. Haider, X. Gao, X.-H. You, Y. Yang, D. Yuan, H. M. Aggoune, H. Haas, S. Fletcher, and E. Hepsaydir, "Cellular architecture and key technologies for 5G wireless communication networks," IEEE Commun. Mag., vol. 52, no. 2, pp. 30–122, Feb. 2014.
[2] M. Agiwal, H. Kwon, S. Park, and H. Jin, "A survey on 4G–5G dual connectivity: Road to 5G implementation," IEEE Access, vol. 9, pp. 16193–16210, 2021.
[3] Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update 2017–2022, Cisco, San Jose, CA, USA, Feb. 2019.
[4] M. Shafi, A. F. Molisch, P. J. Smith, T. Haustein, P. Zhu, P. De Silva, F. Tufvesson, A. Benjebbour, and G. Wunder, "5G: A tutorial overview of standards, trials, challenges, deployment, and practice," IEEE J. Sel. Areas Commun., vol. 35, no. 6, pp. 1201–1221, Jun. 2017.
[5] T. T. T. Le and S. Moh, "Comprehensive survey of radio resource allocation schemes for 5G V2X communications," IEEE Access, vol. 9, pp. 123117–123133, 2021.
[6] S. Penchala, D. K. Nayak, and B. Ramadevi, "Survey on massive MIMO system with underlaid D2D communication," in Intelligent System Design. Singapore: Springer, 2021, pp. 453–462.
[7] Y. Mehmood, N. Haider, M. Imran, A. Timm-Giel, and M. Guizani, "M2M communications in 5G: State-of-the-art architecture, recent advances, and research challenges," IEEE Commun. Mag., vol. 55, no. 9, pp. 194–201, Sep. 2017.
[8] Q.-V. Pham, F. Fang, V. N. Ha, M. J. Piran, M. Le, L. B. Le, H. Won-Joo, and Z. Ding, "A survey of multi-access edge computing in 5G and beyond: Fundamentals, technology integration, and state-of-the-art," IEEE Access, vol. 8, pp. 116974–117017, 2020.
[9] N. T. Le, M. A. Hossain, A. Islam, D.-Y. Kim, Y.-J. Choi, and Y. M. Jang, "Survey of promising technologies for 5G networks," Mobile Inf. Syst., vol. 2016, pp. 1–25, Oct. 2016.
[10] S. K. Goudos, P. I. Dallas, S. Chatziefthymiou, and S. Kyriazakos, "A survey of IoT key enabling and future technologies: 5G, mobile IoT, sematic web and applications," Wireless Pers. Commun., vol. 97, no. 2, pp. 1645–1675, Nov. 2017.
[11] R. Ali, Y. B. Zikria, A. K. Bashir, S. Garg, and H. S. Kim, "URLLC for 5G and beyond: Requirements, enabling incumbent technologies and network intelligence," IEEE Access, vol. 9, pp. 67064–67095, 2021.
[12] A. Osseiran, F. Boccardi, V. Braun, K. Kusume, P. Marsch, M. Maternia, O. Queseth, M. Schellmann, H. Schotten, H. Taoka, H. Tullberg, M. A. Uusitalo, B. Timus, and M. Fallgren, "Scenarios for 5G mobile and wireless communications: The vision of the METIS project," IEEE Commun. Mag., vol. 52, no. 5, pp. 26–35, May 2014.
[13] A. Gupta and R. K. Jha, "A survey of 5G network: Architecture and emerging technologies," IEEE Access, vol. 3, pp. 1206–1232, 2015.
[14] M. U. Khan, A. Garcia-Armada, and J. J. Escudero-Garzas, "Service-based network dimensioning for 5G networks assisted by real data," IEEE Access, vol. 8, pp. 129193–129212, 2020.
[15] R. Dilli, "Analysis of 5G wireless systems in FR1 and FR2 frequency bands," in Proc. 2nd Int. Conf. Innov. Mech. Ind. Appl. (ICIMIA), Mar. 2020, pp. 767–772, doi: 10.1109/ICIMIA48430.2020.9074973.
[16] F. A. Dicandia and S. Genovesi, "Exploitation of triangular lattice arrays for improved spectral efficiency in massive MIMO 5G systems," IEEE Access, vol. 9, pp. 17530–17543, 2021.
[17] Y. Niu, Y. Li, D. Jin, L. Su, and A. V. Vasilakos, "A survey of millimeter wave communications (mmWave) for 5G: Opportunities and challenges," Wireless Netw., vol. 21, no. 8, pp. 2657–2676, Nov. 2015.
[18] S. Wijethilaka and M. Liyanage, "Survey on network slicing for Internet of Things realization in 5G networks," IEEE Commun. Surveys Tuts., vol. 23, no. 2, pp. 957–994, 2nd Quart., 2021.
[19] Y. Xu, G. Gui, H. Gacanin, and F. Adachi, "A survey on resource allocation for 5G heterogeneous networks: Current research, future trends, and challenges," IEEE Commun. Surveys Tuts., vol. 23, no. 2, pp. 668–695, 2nd Quart., 2021.
[20] M. Hassan, M. Singh, and K. Hamid, "Survey on NOMA and spectrum sharing techniques in 5G," in Proc. IEEE Int. Conf. Smart Inf. Syst. Technol. (SIST), Apr. 2021, pp. 1–4, doi: 10.1109/SIST50301.2021.9465962.
[21] F. D. Calabrese, L. Wang, E. Ghadimi, G. Peters, L. Hanzo, and P. Soldati, "Learning radio resource management in RANs: Framework, opportunities, and challenges," IEEE Commun. Mag., vol. 56, no. 9, pp. 138–145, Sep. 2018.
[22] A. F. Molisch, V. V. Ratnam, S. Han, Z. Li, S. Le Hong Nguyen, L. Li, and K. Haneda, "Hybrid beamforming for massive MIMO: A survey," IEEE Commun. Mag., vol. 55, no. 9, pp. 134–141, Sep. 2017.
[23] Y. Fu, S. Wang, C. Wang, X. Hong, and S. McLaughlin, "Artificial intelligence to manage network traffic of 5G wireless networks," IEEE/ACM Trans. Netw., vol. 32, no. 6, pp. 58–64, Nov./Dec. 2018.
[24] R. Li, Z. Zhao, X. Zhou, G. Ding, Y. Chen, Z. Wang, and H. Zhang, "Intelligent 5G: When cellular networks meet artificial intelligence," IEEE Wireless Commun., vol. 24, no. 5, pp. 175–183, Oct. 2017.
[25] C.-X. Wang, M. D. Renzo, S. Stanczak, S. Wang, and E. G. Larsson, "Artificial intelligence enabled wireless networking for 5G and beyond: Recent advances and future challenges," IEEE Wireless Commun., vol. 27, no. 1, pp. 16–23, Feb. 2020.
[26] AI and ML – Enablers for Beyond 5G Networks, 5G PPP Technol. Board, Sophia Antipolis, France, May 11, 2021, doi: 10.5281/zenodo.4299895.
[27] C. Zhang, P. Patras, and H. Haddadi, "Deep learning in mobile and wireless networking: A survey," IEEE Commun. Surveys Tuts., vol. 21, no. 3, pp. 2224–2287, 3rd Quart., 2019.
[36] Y. Guo, F.-C. Zheng, J. Luo, and X. Wang, "Optimal resource allocation via machine learning in coordinated downlink multi-cell OFDM networks under high mobility," in Proc. IEEE 93rd Veh. Technol. Conf. (VTC-Spring), Helsinki, Finland, Apr. 2021, pp. 1–7, doi: 10.1109/VTC2021-Spring51267.2021.9448996.
[37] T. E. Bogale, X. Wang, and L. B. Le, "Machine intelligence techniques for next-generation context-aware wireless networks," ICT Discoveries J., Impact Artif. Intell. (AI) Commun. Netw. Services, ITU, Geneva, Switzerland, 2018, no. 1. [Online]. Available: https://www.itu.int/dms_pub/itu-s/opb/journal/SJOURNAL-ICTF.VOL1-2018-1-P13-PDF-E.pdf
[38] D. Lopez-Perez, "A survey on 5G energy efficiency: Massive MIMO, lean carrier design, sleep modes, and machine learning," 2021, arXiv:2101.11246.
[39] J. Kaur, M. A. Khan, M. Iftikhar, M. Imran, and Q. E. U. Haq, "Machine learning techniques for 5G and beyond," IEEE Access, vol. 9, pp. 23472–23488, 2021.
[40] A. Ly and Y.-D. Yao, "A review of deep learning in 5G research: Channel coding, massive MIMO, multiple access, resource allocation, and network security," IEEE Open J. Commun. Soc., vol. 2, pp. 396–408, 2021.
[41] S. Kayyali, "Resource management and quality of service provisioning in 5G cellular networks," 2020, arXiv:2008.09601.
[42] J. H. Joloudari, R. Alizadehsani, I. Nodehi, S. Mojrian, F. Fazl, S. K. Shirkharkolaie, H. M. D. Kabir, R.-S. Tan, and U. R. Acharya, "Resource allocation optimization using artificial intelligence methods in various computing paradigms: A review," 2022, arXiv:2203.12315.
[43] B. Brik, K. Boutiba, and A. Ksentini, "Deep learning for B5G open radio access network: Evolution, survey, case studies, and challenges," IEEE Open J. Commun. Soc., vol. 3, pp. 228–250, 2022.
[44] O. Nassef, W. Sun, H. Purmehdi, M. Tatipamula, and T. Mahmoodi, "A survey: Distributed machine learning for 5G and beyond," Comput. Netw., vol. 207, Apr. 2022, Art. no. 108820.
[45] F. Schaich, T. Wild, and R. Ahmed, "Subcarrier spacing – How to make use of this degree of freedom," in Proc. IEEE 83rd Veh. Technol. Conf. (VTC Spring), Nanjing, China, May 2016, pp. 1–6, doi: 10.1109/VTCSpring.2016.7504496.
[46] P.-H. Huang, Y. Gai, B. Krishnamachari, and A. Sridharan, "Subcarrier allocation in multiuser OFDM systems: Complexity and approximability," in Proc. IEEE Wireless Commun. Netw. Conf., Sydney, NSW, Australia, Apr. 2010, pp. 1–6, doi: 10.1109/WCNC.2010.5506244.
[47] S. Oulaourf, A. Haidine, and H. Ouahmane, "Review on using game theory in resource allocation for LTE/LTE-advanced," in Proc. Int. Conf. Adv. Commun. Syst. Inf. Secur. (ACOSIS), Marrakesh, Morocco, Oct. 2016, pp. 1–7, doi: 10.1109/ACOSIS.2016.7843946.
[48] W. Ejaz, S. K. Sharma, S. Saadat, M. Naeem, A. Anpalagan, and
[28] I. Alawe, A. Ksentini, Y. Hadjadj-Aoul, and P. Bertin, ‘‘Improving N. A. Chughtai, ‘‘A comprehensive survey on resource allocation for
traffic forecasting for 5G core network scalability: A machine learn- CRAN in 5G and beyond networks,’’ J. Netw. Comput. Appl., vol. 160,
ing approach,’’ IEEE/ACM Trans. Netw., vol. 32, no. 6, pp. 42–49, Jun. 2020, Art. no. 102638.
Nov./Dec. 2018. [49] P. K. Gkonis, M. A. Seimeni, N. P. Asimakis, D. I. Kaklamani, and
[29] R. Alvizu, S. Troia, G. Maier, and A. Pattavina, ‘‘Matheuristic with I. S. Venieris, ‘‘A new subcarrier allocation strategy for MIMO-OFDMA
machine-learning-based prediction for software-defined mobile metro- multicellular networks based on cooperative interference mitigation,’’ Sci.
core networks,’’ J. Opt. Commun. Netw., vol. 9, no. 9, pp. D19–D30, World J., vol. 2014, pp. 1–9, Jan. 2014, doi: 10.1155/2014/652968.
Sep. 2017. [50] 5G -Study on Channel Model for Frequencies From 0.5 to 100 GHz,
[30] M. H. Abidi, H. Alkhalefah, K. Moiduddin, M. Alazab, M. K. Mohammed, document ETSI TR 138 901 V17.1.0, 2020.
W. Ameen, and T. R. Gadekallu, ‘‘Optimal 5G network slicing using [51] J. Ghosh, V. Sharma, H. Haci, S. Singh, and I.-H. Ra, ‘‘Performance
machine learning and deep learning concepts,’’ Comput. Standards Inter- investigation of NOMA versus OMA techniques for mmWave mas-
faces, vol. 76, Jun. 2021, Art. no. 103518. sive MIMO communications,’’ IEEE Access, vol. 9, pp. 125300–125308,
[31] C. Benzaid and T. Taleb, ‘‘AI for beyond 5G networks: A cyber-security 2021.
defense or offense enabler?’’ IEEE Netw., vol. 34, no. 6, pp. 140–147, [52] V. N. Ha, L. B. Le, and N.-D. Dao, ‘‘Cooperative transmission in cloud
Nov./Dec. 2020. RAN considering fronthaul capacity and cloud processing constraints,’’
[32] M. Yan, G. Feng, J. Zhou, and S. Qin, ‘‘Smart multi-RAT access based on in Proc. IEEE Wireless Commun. Netw. Conf. (WCNC), Istanbul, Turkey,
multiagent reinforcement learning,’’ IEEE Trans. Veh. Technol., vol. 67, Apr. 2014, pp. 1862–1867, doi: 10.1109/WCNC.2014.6952553.
no. 5, pp. 4539–4551, May 2018. [53] Z. Wang, H. Li, H. Wang, and S. Ci, ‘‘Probability weighted based
[33] J. Pérez-Romero, J. Sánchez-González, R. Agustí, B. Lorenzo, and spectral resources allocation algorithm in Hetnet under Cloud-RAN
S. Glisic, ‘‘Power-efficient resource allocation in a heterogeneous network architecture,’’ in Proc. IEEE/CIC Int. Conf. Commun. China-
with cellular and D2D capabilities,’’ IEEE Trans. Veh. Technol., vol. 65, Workshops (CIC/ICCC), Shanghai, Chine, Aug. 2013, pp. 88–92,
no. 11, pp. 9272–9286, Nov. 2016. doi: 10.1109/ICCChinaW.2013.6670573.
[34] S. A. R. Naqvi, H. Pervaiz, S. A. Hassan, L. Musavian, Q. Ni, M. A. Imran, [54] W. U. Rehman, T. Salam, A. Almogren, K. Haseeb, I. U. Din, and
X. Ge, and R. Tafazolli, ‘‘Energy-aware radio resource management in S. H. Bouk, ‘‘Improved resource allocation in 5G MTC networks,’’ IEEE
D2D-enabled multi-tier HetNets,’’ IEEE Access, vol. 6, pp. 16610–16622, Access, vol. 8, pp. 49187–49197, 2020.
2018. [55] R. D. Mardian, M. Suryanegara, and K. Ramli, ‘‘Measuring quality
[35] S. Imtiaz, S. Schiessl, G. P. Koudouridis, and J. Gross, ‘‘Coordinates-based of service (QoS) and quality of experience (QoE) on 5G technology:
resource allocation through supervised machine learning,’’ IEEE Trans. A review,’’ in Proc. IEEE Int. Conf. Innov. Res. Develop. (ICIRD), Jakarta,
Cognit. Commun. Netw., vol. 7, no. 4, pp. 1347–1362, Dec. 2021. Indonesia, Jun. 2019, pp. 1–6, doi: 10.1109/ICIRD47319.2019.9074681.


[56] H. Beshley, M. Beshley, M. Medvetskyi, and J. Pyrih, ‘‘QoS-aware optimal [78] M. E. M. Cayamcela and W. Lim, ‘‘Artificial intelligence in 5G
radio resource allocation method for machine-type communications in 5G technology: A survey,’’ in Proc. Int. Conf. Inf. Commun. Technol.
LTE and beyond cellular networks,’’ Wireless Commun. Mobile Comput., Converg. (ICTC), Jeju-Si, South Korea, Oct. 2018, pp. 860–865, doi:
vol. 2021, pp. 1–18, May 2021. 10.1109/ICTC.2018.8539642.
[57] N. Wang, Z. Fei, and J. Kuang, ‘‘QoE-aware resource allocation for mixed [79] M. E. Morocho-Cayamcela, H. Lee, and W. Lim, ‘‘Machine learn-
traffics in heterogeneous networks based on Kuhn-Munkres algorithm,’’ ing for 5G/B5G mobile and wireless communications: Potential, limita-
in Proc. IEEE Int. Conf. Commun. Syst. (ICCS), Bangkok, Thailand, tions, and future directions,’’ IEEE Access, vol. 7, pp. 137184–137206,
Dec. 2016, pp. 1–6, doi: 10.1109/ICCS.2016.7833650. 2019.
[58] J. Jia, Y. Xu, Z. Du, J. Chen, Q. Wang, and X. Wang, ‘‘Joint resource [80] B. Bojović, E. Meshkova, N. Baldo, J. Riihijärvi, and M. Petrova,
allocation for QoE optimization in large-scale NOMA-enabled multi- ‘‘Machine learning-based dynamic frequency and bandwidth allocation in
cell networks,’’ Peer-Peer Netw. Appl., vol. 15, no. 1, pp. 689–702, self-organized LTE dense small cell deployments,’’ EURASIP J. Wireless
Jan. 2022. Commun. Netw., vol. 2016, no. 1, pp. 1–16, Dec. 2016.
[59] R. I. Ansari, H. Pervaiz, S. A. Hassan, C. Chrysostomou, M. A. Imran, [81] A. Martin, J. Egana, J. Florez, J. Montalban, I. G. Olaizola, M. Quartulli,
S. Mumtaz, and R. Tafazolli, ‘‘A new dimension to spectrum management R. Viola, and M. Zorrilla, ‘‘Network resource allocation system for QoE-
in IoT empowered 5G networks,’’ IEEE Netw., vol. 33, no. 4, pp. 186–193, aware delivery of media services in 5G networks,’’ IEEE Trans. Broadcast.,
Jul. 2019. vol. 64, no. 2, pp. 561–574, Jun. 2018.
[60] A. Celik, R. M. Radaydeh, F. S. Al-Qahtani, and M.-S. Alouini, ‘‘Joint [82] R. D. A. Timoteo, D. Cunha, and G. D. C. Cavalcanti, ‘‘A proposal for
interference management and resource allocation for device-to-device path loss prediction in urban environments using support vector regres-
(D2D) communications underlying downlink/uplink decoupled (DUDe) sion,’’ in Proc. Adv. Int. Conf. Telecommun., vol. 10. Paris, France, 2014,
heterogeneous networks,’’ in Proc. IEEE Int. Conf. Commun. (ICC), Paris, pp. 119–124.
France, May 2017, pp. 1–6, doi: 10.1109/ICC.2017.7996667. [83] J. Liu, R. Deng, S. Zhou, and Z. Niu, ‘‘Seeing the unobservable: Channel
[61] X. Qi, S. Khattak, A. Zaib, and I. Khan, ‘‘Energy efficient resource learning for wireless communication networks,’’ in Proc. IEEE Global
allocation for 5G heterogeneous networks using genetic algorithm,’’ IEEE Commun. Conf. (GLOBECOM), San Diego, CA, USA, Dec. 2015, pp. 1–6,
Access, vol. 9, pp. 160510–160520, 2021. doi: 10.1109/GLOCOM.2015.7417805.
[62] Y. Xu, Y. Hu, G. Li, and H. Zhang, ‘‘Robust resource allocation for het- [84] H. Zhang, H. Zhang, K. Long, and G. K. Karagiannidis, ‘‘Deep learning
erogeneous wireless network: A worst-case optimisation,’’ IET Commun., based radio resource management in NOMA networks: User association,
vol. 12, no. 9, pp. 1064–1071, Jun. 2018. subchannel and power allocation,’’ IEEE Trans. Netw. Sci. Eng., vol. 7,
[63] R. Liu, Q. Chen, G. Yu, and G. Y. Li, ‘‘Joint user association and resource no. 4, pp. 2406–2415, Oct. 2020.
allocation for multi-band millimeter-wave heterogeneous networks,’’ IEEE [85] R. Guerra-Gomez, S. Ruiz-Boque, M. Garcia-Lozano, and J. O. Bonafe,
Trans. Commun., vol. 67, no. 12, pp. 8502–8516, Dec. 2019. ‘‘Machine learning adaptive computational capacity prediction for
[64] P. Ji, J. Jia, and J. Chen, ‘‘Joint optimization on both routing and resource dynamic resource management in C-RAN,’’ IEEE Access, vol. 8,
allocation for millimeter wave cellular networks,’’ IEEE Access, vol. 7, pp. 89130–89142, 2020.
pp. 93631–93642, 2019. [86] D. Anand, M. A. Togou, and G.-M. Muntean, ‘‘A machine learning
[65] F. Ye, J. Dai, and Y. B. Li, ‘‘Hybrid-clustering game algorithm for resource solution for automatic network selection to enhance quality of service
allocation in macro-femto HetNet,’’ KSII Trans. Internet Inf. Syst., vol. 12, for video delivery,’’ in Proc. IEEE Int. Symp. Broadband Multimedia
no. 4, pp. 1638–1654, Apr. 2018. Syst. Broadcast. (BMSB), Chengdu, China, Aug. 2021, pp. 1–5, doi:
[66] M. Rahman, Y. Lee, and I. Koo, ‘‘Energy-efficient power allocation and 10.1109/BMSB53066.2021.9547176.
relay selection schemes for relay-assisted D2D communications in 5G [87] M. M. Butt, A. Pantelidou, and I. Z. Kovacs, ‘‘ML-assisted UE positioning:
wireless networks,’’ Sensors, vol. 18, no. 9, p. 2865, Aug. 2018. Performance analysis and 5G architecture enhancements,’’ IEEE Open
[67] B. Xie, Z. Zhang, R. Q. Hu, G. Wu, and A. Papathanassiou, ‘‘Joint spectral J. Veh. Technol., vol. 2, pp. 377–388, 2021.
efficiency and energy efficiency in FFR-based wireless heterogeneous [88] W. Song, F. Zeng, J. Hu, Z. Wang, and X. Mao, ‘‘An unsupervised-
networks,’’ IEEE Trans. Veh. Technol., vol. 67, no. 9, pp. 8154–8168, learning-based method for multi-hop wireless broadcast relay
Sep. 2018. selection in urban vehicular networks,’’ in Proc. IEEE Veh. Technol.
[68] S. Kim, ‘‘4G/5G coexistent dynamic spectrum sharing scheme based Conf. (VTC), Sydney, NSW, Australia, Jun. 2017, pp. 1–5, doi:
on dual bargaining game approach,’’ Comput. Commun., vol. 181, 10.1109/VTCSpring.2017.8108458.
pp. 215–223, Jan. 2022. [89] L.-C. Wang and S.-H. Cheng, ‘‘Data-driven resource management for ultra-
[69] M. Bigdeli, S. Farahmand, B. Abolhassani, and H. H. Nguyen, ‘‘Glob- dense small cells: An affinity propagation clustering approach,’’ IEEE
ally optimal resource allocation and time scheduling in downlink cog- Trans. Netw. Sci. Eng., vol. 6, no. 3, pp. 267–279, Jul. 2019.
nitive CRAN favoring big data requests,’’ IEEE Access, vol. 10, [90] Z. Wang, M. Eisen, and A. Ribeiro, ‘‘Unsupervised learning for asyn-
pp. 27504–27521, 2022. chronous resource allocation in ad-hoc wireless networks,’’ in Proc.
[70] T. Pamuklu, S. Mollahasani, and M. Erol-Kantarci, ‘‘Energy-efficient and IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), Toronto, ON,
delay-guaranteed joint resource allocation and DU selection in O-RAN,’’ Canada, Jun. 2021, pp. 8143–8147, doi: 10.1109/ICASSP39728.2021.
in Proc. IEEE 4th 5G World Forum (5GWF), Oct. 2021, pp. 99–104. 9414181.
[71] F. Hussain, S. A. Hassan, R. Hussain, and E. Hossain, ‘‘Machine learning [91] M. A. Jamshed, F. Heliot, and T. W. C. Brown, ‘‘Unsupervised learn-
for resource management in cellular and IoT networks: Potentials, current ing based emission-aware uplink resource allocation scheme for non-
solutions, and open challenges,’’ IEEE Commun. Surveys Tuts., vol. 22, orthogonal multiple access systems,’’ IEEE Trans. Veh. Technol., vol. 70,
no. 2, pp. 1251–1275, 2nd Quart., 2020. no. 8, pp. 7681–7691, Aug. 2021.
[72] T. M. Mitchell, ‘‘Does machine learning really work?’’ Artif. Intell. Mag., [92] T. M. Getu, W. Ajib, and R. Landry, ‘‘A simple F–test based spectrum
Assoc. Advancement Artif. Intell., vol. 18, no. 3, pp. 11–20, fall 1997. sensing technique for MIMO cognitive radio networks,’’ in Proc. 14th
[73] T. M. Mitchell, Machine Learning, 1st ed. New York, NY, USA: Int. Conf. Wireless Mobile Comput., Netw. Commun. (WiMob), Limassol,
McGraw-Hill, 1997. Cyprus, Oct. 2018, pp. 1–8, doi: 10.1109/WiMOB.2018.8589123.
[74] G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, ‘‘KNN model-based [93] G. Alnwaimi, S. Vahid, and K. Moessner, ‘‘Dynamic heterogeneous learn-
approach in classification,’’ in Proc. OTM Confederated Int. Conf. Berlin, ing games for opportunistic access in LTE-based macro/femtocell deploy-
Germany: Springer, 2003, pp. 986–996. ments,’’ IEEE Trans. Wireless Commun., vol. 14, no. 4, pp. 2294–2308,
[75] C. Jiang, H. Zhang, Y. Ren, Z. Han, K.-C. Chen, and L. Hanzo, ‘‘Machine Apr. 2015.
learning paradigms for next-generation wireless networks,’’ IEEE Wireless [94] G. Han, L. Xiao, and H. V. Poor, ‘‘Two-dimensional anti-jamming com-
Commun., vol. 24, no. 2, pp. 98–105, Apr. 2017. munication based on deep reinforcement learning,’’ in Proc. IEEE Int.
[76] Y. Tang, Y.-Q. Zhang, N. V. Chawla, and S. Krasser, ‘‘SVMs modeling Conf. Acoust., Speech Signal Process. (ICASSP), New Orleans, LA, USA,
for highly imbalanced classification,’’ IEEE Trans. Syst., Man, Cybern., B, Mar. 2017, pp. 2087–2091, doi: 10.1109/ICASSP.2017.7952524.
Cybern., vol. 39, no. 1, pp. 281–288, Feb. 2009. [95] A. S. G. Spantideas, C. Tsinos, and P. Trakadas, ‘‘Power control in 5G
[77] A. M. Prasad, L. R. Iverson, and A. Liaw, ‘‘Newer classification and heterogeneous cells considering user demands using deep reinforcement
regression tree techniques: Bagging and random forests for ecological learning,’’ in Proc. Int. Conf. Artif. Intell. Appl. Innov. (IFIP). Creta,
prediction,’’ Ecosystems, vol. 9, no. 2, pp. 181–199, 2006. Greece: Springer, 2021, pp. 95–105.


IOANNIS A. BARTSIOKAS was born in Athens, Greece, in 1997. He received the M.Eng. degree in electrical and computer engineering (ECE) from the National Technical University of Athens (NTUA), in 2020, where he is currently pursuing the Ph.D. degree with the School of Electrical and Computer Engineering. His research interests include wireless networks, radio resource management in 5G/B5G networks, machine learning and deep learning in wireless systems, adaptive modulation, and MIMO antennas. He is also a member of the Technical Chamber of Greece.

PANAGIOTIS K. GKONIS received the Diploma degree in electrical and computer engineering (ECE) and the Ph.D. degree from the ECE School, National Technical University of Athens (NTUA), in 2005 and 2009, respectively. He is an Assistant Professor with the Department of Digital Industry Technologies, National and Kapodistrian University of Athens (NKUA). He is an author/coauthor of more than 60 publications in the areas of wireless cellular networks and broadband communications. He has also participated in various national and European funded projects.

DIMITRA I. KAKLAMANI was born in Athens, Greece, in 1965. She received the Ph.D. degree in electrical and computer engineering (ECE) from the National Technical University of Athens (NTUA), in 1992. In April 1995, April 2000, October 2004, and February 2009, she was elected as a Lecturer, an Assistant Professor, an Associate Professor, and a Professor, respectively, with the School of ECE, NTUA. She has over 300 publications in the fields of software development for information transmission systems modeling, microwave networks, mobile and satellite communications, and has coordinated the NTUA activities in the framework of several EU and National projects in the same areas. She is the Editor of one international book by Springer-Verlag (2000) in applied CEM and a reviewer for several IEEE journals.

IAKOVOS S. VENIERIS has been a Professor with the School of Electrical and Computer Engineering, National Technical University of Athens (NTUA), since 1994, and the Director of Intelligent Communications and Broadband Networks (ICBNet) Laboratory. He has over 350 publications in the above areas and has received several national and international awards for academic achievement. He has participated in and has successfully led several national and international projects. His research interests include distributed systems, security and privacy, software and service engineering, agent technology, multimedia, mobile communications and machine learning, internetworking, signaling, resource scheduling and allocation for network management, modeling, performance evaluation, and queuing theory.
