A Hybrid Machine Learning Based Model for Congestion Prediction in Mobile Networks
2022 IEEE 33rd Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC) | 978-1-6654-8053-6/22/$31.00 ©2022 IEEE | DOI: 10.1109/PIMRC54779.2022.9977541
Abstract—Congestion avoidance in radio access networks considerably enhances the end-user Quality of Service (QoS). Congestion should be predicted in advance to allow Self Organizing Networks (SON) algorithms to perform appropriate parameter adjustments (such as handover parameters for mobility load balancing). For this purpose, a novel and efficient hybrid congestion prediction model is proposed in this paper. This hybrid learning model combines unsupervised and supervised learning algorithms. The unsupervised learning consists of a co-clustering algorithm based on the Latent Block Model (LBM) that groups similar cells according to the behaviour of their KPIs over time. Following the co-clustering step, a logistic regression approach is applied to each cluster to predict congestion and alert operators so that they can avoid congestion in mobile networks. The applicability of the hybrid model is validated on real data consisting of Key Performance Indicators (KPIs) collected periodically for 12 days in a live Long-Term Evolution (LTE) network. The proposed hybrid model has proven its efficiency in congestion prediction in terms of accuracy, precision, recall and F-measure.

Index Terms—Congestion prediction, Radio access network, Co-clustering model, Hybrid learning model, functional logistic regression.

I. INTRODUCTION

Due to the increase in traffic demand and the emergence of several new services and technologies, mobile networks have become highly complex, and their management poses growing challenges for network operators. Operators have to provide high quality of service (QoS) to customers while reducing operational costs. Automation, through the introduction of Self Organizing Networks (SON), has been widely adopted for radio access network management. In recent years, artificial intelligence has gained momentum, as it is considered the next step beyond automation.

This paper proposes a model that learns from historical Key Performance Indicator (KPI) measurements and predicts future congestion in mobile networks. This model alerts operators to future radio congestion and allows them to act proactively by avoiding congestion before it occurs, which significantly enhances the QoS. Despite the great interest in prediction techniques, few works have tackled congestion prediction in mobile networks. In [1], two supervised learning algorithms based on forecast machine learning models are used and evaluated to show their effectiveness in forecasting the average downlink throughput of LTE base stations. The results have shown that both models are able to forecast the network behavior. Nevertheless, cell congestion is defined based on the average download speed per user and does not take into account typical congestion criteria used for network optimization such as cell load, average number of active users and traffic volume. In our work, we consider congestion rules and thresholds used by operational teams to decide on the congestion state of a cell. In [2], the authors developed unsupervised clustering approaches for features extracted from KPIs to group cells that show similar performance and, consequently, to identify the groups that perform below the desired threshold. However, only two groups of cells are identified using different clustering algorithms: one group consisting of the cells with the highest performance and the other containing the worst performing cells. The clustering approach should be adapted to form clusters with more specific behaviours or performance. Other recent work also predicts congestion in mobile networks [3]. The authors considered different supervised learning techniques for congestion prediction: linear regression, logistic regression and random forest. The performance of the different approaches is compared in terms of computational complexity and prediction accuracy. Logistic regression using functional data presented the best trade-off between accuracy and complexity [3]. Therefore, we use the logistic regression approach to predict congestion, as it outperforms the state-of-the-art baselines and presents lower complexity compared with other methods such as deep learning [4].

In order to enhance the precision of the congestion prediction, we propose to precede the prediction model with an unsupervised co-clustering approach that forms clusters of cells with similar behaviours. The proposed hybrid model increases the precision of the congestion prediction model based on logistic regression. First, an unsupervised co-clustering approach is used to form clusters of cells and time intervals where KPIs are highly correlated. Since each cluster contains similar behaviours and, consequently, highly correlated KPIs, we apply the logistic regression approach used in [3] to each cluster. This results in good congestion prediction performance with relatively low complexity. The congestion criterion also corresponds to field engineering rules used by network operational teams. The added value of the co-clustering approach compared with simple supervised ap-
It means that a subset of rows exhibits similar behavior across
a subset of columns and vice versa. Unlike clustering
techniques such as K-means, which can ignore the functional nature
of the data (which can reduce clustering performance) and
are limited to clustering along one dimension [10],
we use the LBM co-clustering model based on the work in
[10] to form clusters that group similar cells based on their
behaviors across a subset of times.
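To make the row/column block structure concrete, the following is a minimal sketch that assumes the co-clustering labels are already known. It does not implement the LBM or its estimation; the KPI matrix, the labels and the helper `block_means` are hypothetical and serve only to show that each (row cluster, column cluster) block is internally homogeneous.

```python
from collections import defaultdict


def block_means(matrix, row_labels, col_labels):
    """Average the entries of each (row cluster, column cluster) block."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            key = (row_labels[i], col_labels[j])
            sums[key] += value
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical data: rows are cells, columns are time intervals,
# entries are one KPI (for instance a load percentage).
kpi = [
    [80, 82, 20, 22],
    [78, 84, 18, 24],
    [30, 32, 60, 62],
]
row_labels = [0, 0, 1]     # assumed: cells 0 and 1 behave alike
col_labels = [0, 0, 1, 1]  # assumed: intervals 0-1 and 2-3 behave alike

means = block_means(kpi, row_labels, col_labels)
for block in sorted(means):
    print(block, means[block])
```

A one-dimensional clustering of the rows alone would describe each cell by a single profile over all intervals, whereas the block view summarises each group of cells separately for each group of time intervals.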
A. Future congestion prediction using only the logistic regression approach
Congestion prediction is done using the labeled dataset
of measurements described above. Logistic regression is
adopted to predict congestion based on past heterogeneous
KPI measurements for different prediction horizons h, as
shown in Fig. 3. Results presented in Fig. 3 show the
TABLE II
DESCRIPTION OF DATASET BEFORE AND AFTER FUNCTIONAL LBM CO-CLUSTERING APPROACH.
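As an illustration of the horizon-based setup used in this subsection, the sketch below pairs a window of past KPI values with the congestion label observed h steps later. It is only an assumed way of building training samples for a classifier such as logistic regression: the function `make_samples`, the window length, the load values and the threshold of 80 are hypothetical and are not taken from the paper.

```python
def make_samples(kpi_series, congested, window, horizon):
    """Pair each window of past KPI values with the label `horizon` steps later.

    kpi_series: KPI values, one per time step
    congested:  0/1 congestion labels, one per time step
    window:     number of past steps used as features
    horizon:    number of steps ahead to predict (h)
    """
    samples = []
    last_start = len(kpi_series) - window - horizon
    for start in range(last_start + 1):
        end = start + window  # first index after the feature window
        features = kpi_series[start:end]
        label = congested[end - 1 + horizon]  # label h steps after the window
        samples.append((features, label))
    return samples


# Hypothetical load values and a hypothetical congestion rule (load >= 80).
load = [40, 55, 70, 85, 90, 60]
congested = [1 if value >= 80 else 0 for value in load]

samples = make_samples(load, congested, window=3, horizon=2)
for features, label in samples:
    print(features, "->", label)
```

Increasing `horizon` leaves the features unchanged but moves the label further into the future, which is how one classifier design can be evaluated for several prediction horizons.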
shown in Table III, cluster 5 is the group of cells that are never congested; the cells in this cluster are eliminated from the prediction model. Table III shows a very low precision in detecting congestion for a 30 min prediction horizon in the case of clusters with low congestion rate. Therefore, it is better to eliminate these cells from the prediction model. To illustrate

Clusters with high congestion rate represent 70% of the total number of congestion cases in the initial dataset. Clusters with high congestion rate can be analysed as one group named HC-cluster, and the performance obtained by the prediction model in terms of different metrics (recall, precision and F1) is improved, as illustrated in Fig. 5. Fig. 5 shows that the
TABLE III
PERFORMANCE FOR CLUSTERS WITH LOW CONGESTION RATE FOR 30 MIN PREDICTION HORIZON.
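The per-cluster metrics discussed here (precision, recall and F1) follow from the counts of a confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not results from the paper. They only illustrate how a cluster with few true congestion events can yield more false alarms than correct alerts.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts.

    tp: congestion predicted and observed
    fp: congestion predicted but not observed (false alarm)
    fn: congestion observed but not predicted
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for a cluster with a low congestion rate.
precision, recall, f1 = precision_recall_f1(tp=2, fp=8, fn=2)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

A precision below 0.5 means that most alerts are false alarms, which is the situation in which acting on the predictions costs more than handling congestion reactively.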
Fig. 6. Hybrid proposed model result summary.

rate, Average congestion rate and Low congestion rate or without congestion. For the high congestion rate category, which represents 70% of congestion cases in the initial dataset, the performance of the prediction model is improved. For the average congestion rate category, the prediction model keeps the same performance. For the low congestion rate category, which represents less than 2% of the initial dataset's congestion, the cells that belong to this category are eliminated from the prediction model. In fact, for this kind of cell it is impossible to predict congestion, and the model generates wrong alarms (false positive congestion) that require the operator to correct uncongested cells. So, we recommend not applying a prediction model to these cells, since it will generate many more false positives than true positives. Thus, the correction of problems will be carried out in a reactive manner. Finally, the hybrid model proposed in this paper succeeds in regrouping cells based on their behaviors. This model can be used as a preventive maintenance tool that makes it possible to predict failures and avoid degradation of the quality of service of radio networks. For the last class of clusters, we give a recommendation that prevents the mobile network operator from intervening to correct false alarms. Note that when the operational team corrects nonexistent congestion on a cell, the users of the supposedly congested cell are forced to perform a handover. Hence, the quality of service for these users deteriorates and the risk of call drops increases.

To illustrate the advantages of eliminating cells with low congestion rate from the prediction model, we analyse the confusion matrix for two different clusters with low congestion rate, which confirms that the prediction model is not able to predict their congestion. However, the prediction model requires the operator to correct uncongested cells. This analysis shows that the FunLBM co-clustering approach allows us to isolate in clusters the cells whose uncertain abnormalities cannot be predicted and to exclude them from the prediction model.

V. CONCLUSION

This paper proposes a hybrid congestion prediction model based on supervised and unsupervised approaches to enhance congestion prediction in LTE networks by observing key performance indicators. This model consists of four essential steps: 1) a smoothing step to transform discrete data into functional data, 2) an unsupervised co-clustering approach based on LBM to regroup cells with similar behaviors, 3) a supervised logistic regression classification model for congestion applied to the clusters and 4) prediction of future congestion in mobile networks. The model enhances congestion prediction performance in mobile networks, as shown through an application to real data from LTE networks. Besides, the supervised logistic regression approach applied to clusters of cells with similar behaviors obtained by the co-clustering approach improves the quality of congestion prediction in mobile networks compared with the logistic regression approach applied to heterogeneous cells with different behaviors. This work suggests several directions for future research. The problem of congestion prediction in radio access networks could be further studied using deep neural networks, which could be compared with the hybrid model proposed in this article. Another improvement for our study is to introduce other KPIs, such as latency, that can affect the QoS in mobile networks. Moreover, applying the model to data extracted from other cities and other periods of time would be interesting to validate the flexibility of the proposed hybrid model.

REFERENCES

[1] P. Torres, P. Marques, H. Marques, R. Dionísio, T. Alves, L. Pereira and J. Ribeiro, “Data analytics for forecasting cell congestion on LTE networks,” in 2017 Network Traffic Measurement and Analysis Conference (TMA), Dublin, 2017, pp. 1–6.
[2] R. Santos, M. Sousa, P. Vieira, M. P. Queluz and A. Rodrigues, “An Unsupervised Learning Approach for Performance and Configuration Optimization of 4G Networks,” in 2019 IEEE Wireless Communications and Networking Conference (WCNC), Marrakesh, Morocco, 2019, pp. 1–6.
[3] I. Hadj-Kacem, S. Ben Jemaa, S. Allio and Y. Ben Slimen, “Anomaly prediction in mobile networks: A data driven approach for machine learning algorithm selection,” in 2020 IEEE/IFIP Network Operations and Management Symposium (NOMS 2020), Budapest, Hungary, 2020, pp. 1–7.
[4] K. Ghosh, C. Bellinger, R. Corizzo, B. Krawczyk and N. Japkowicz, “On the combined effect of class imbalance and concept complexity in deep learning,” in 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 2021, pp. 4859–4868.
[5] B. M. Coronado, U. Mori, A. Mendiburu and J. Miguel-Alonso, “Survey of Network Intrusion Detection Methods from the Perspective of the Knowledge Discovery in Databases Process,” IEEE Transactions on Network and Service Management, 2020.
[6] S. Kyu Kwak and J. Hae Kim, “Statistical data preparation: management of missing values and outliers,” Korean journal of anesthesiology, vol. 70, no. 4, pp. 407–411, 2017.
[7] Y. Ben Slimen, S. Allio and J. Jacques, “Anomaly Prevision in Radio Access Networks Using Functional Data Analysis,” in GLOBECOM 2017 - 2017 IEEE Global Communications Conference, Singapore, 2017, pp. 1–6.
[8] Y. Ben Slimen, S. Allio, J. Jacques, “Model-based co-clustering for functional data,” Neurocomputing, vol. 291, pp. 97–108, 2018.
[9] G. Govaert and M. Nadif, Co-Clustering: Models, Algorithms and Applications, Wiley-ISTE, 2013.
[10] C. Bouveyron, L. Bozzi, J. Jacques and F. Xavier Jollois, “The Functional Latent Block Model for the Co-Clustering of Electricity Consumption Curves,” Journal of the Royal Statistical Society: Series C Applied Statistics, Wiley, In press, vol. 67, no. 4, pp. 897–915, 2018.
[11] B. Juba, H. S. Le, “Precision-Recall versus Accuracy and the Role of Large Data Sets,” The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019.