Performance assessment of time series forecasting models for simple network management protocol-based hypervisor data
Corresponding Author:
Rendy Munadi
School of Computing, Faculty of Informatic, Telkom University
40257 Buah Batu, Bandung, Indonesia
Email: [email protected]
1. INTRODUCTION
The field of cloud computing has experienced rapid growth, with numerous virtual machines now
operating within physical server environments [1]. As demand for virtualization technology increases,
effective time-series forecasting methods are essential, particularly for managing simple network management
protocol (SNMP)-based hypervisor data [2]. Such data supports tracking and managing performance across
virtual instances deployed in the cloud. However, time-series forecasting has certain weaknesses when dealing
with 'non-stationary' data, as it often struggles to model dynamic and volatile workloads. This challenge limits
the adaptability of forecasting methods when faced with unpredictable, fluctuating workloads, especially in
SNMP-based hypervisor contexts, where these limitations pose significant obstacles to system administrators
seeking to optimise resource management in cloud data centers [3].
Accurate workload prediction in cloud computing is crucial for optimising resource allocation and
operational efficiency. Forecasting methods are generally divided into two main categories: those based on
fundamental time-series dynamics and those using machine learning and artificial neural networks [4].
A range of methods has been developed to address workload prediction challenges, including classical
statistical approaches such as autoregressive (AR), moving average (MA), and the autoregressive integrated
moving average (ARIMA) model [5], as well as more sophisticated models such as seasonal autoregressive
integrated moving average (SARIMA) [6], [7]. Additionally, contemporary techniques like multiple linear
regression (MLR), ridge regression (RR) [8], and adaptive neuro-fuzzy inference systems (ANFIS) [9] have
also been applied. Recent research by Kumar and Singh [10] has indicated that auto ARIMA is a promising
approach, offering high prediction accuracy for web server workloads and demonstrating its potential to
proactively optimise resource allocation. However, traditional models such as auto ARIMA often struggle
with the complexity and dynamics of cloud data centre environments, characterised by rapidly evolving data
variability. This limitation is particularly evident over longer forecasting horizons and during periods of
significant data fluctuation, underscoring the need for models that are more adaptive to changing conditions.
In recent years, automated machine learning (AutoML) frameworks, such as PyCaret, have emerged
as powerful tools to simplify model selection, evaluation, and optimisation. PyCaret is a Python library that
integrates a range of machine learning algorithms and time-series forecasting models by wrapping several
machine learning frameworks, such as scikit-learn, XGBoost, LightGBM, CatBoost, spaCy, Optuna, Hyperopt,
and Ray. Studies indicate that PyCaret’s capabilities
significantly enhance implementation, versatility, and customisation [11]. This tool has proven influential in
time-series analysis and forecasting across multiple contexts [12]. Its applications range from forecasting
trends in the COVID-19 pandemic [13] to data collection and formatting processes for various forecasting
projects [14]. However, its potential for hypervisor management in the cloud computing domain, using both
actual and synthetic SNMP-based hypervisor data, remains unexplored.
This research aims to evaluate the efficacy of 30 distinct time-series forecasting models using
PyCaret on a variety of authentic and synthetic datasets derived from SNMP-based hypervisor systems. The
objective is to leverage the PyCaret toolkit to identify optimal forecasting methodologies for predicting
hypervisor components accurately, specifically CPU utilization, memory utilization, and the number of disk
reads. The findings of this study will support organisations in making informed decisions regarding resource
allocation and overall operational efficiency in cloud environments. The primary contributions of this paper
are as follows: first, the application of time-series forecasting models to SNMP hypervisor data through
PyCaret, followed by analysis of the time series windows and forecasting models that yielded the best results;
second, an evaluation of resource management approaches in light of the integrated forecasting measures.
2. METHOD
This study employs a machine learning approach to evaluate the efficacy of time-series forecasting
models on SNMP-based hypervisor data. The research followed the stages illustrated in Figure 1, which
also shows the essential steps for using PyCaret to evaluate the models. The framework outlines the
procedures for acquiring load data from synthetic and real-time databases and provides an overview of the
PyCaret interface for model evaluation. This systematic workflow enables a practical evaluation and
comparison of the performance of multiple forecasting models. This section discusses the dataset,
experimental design, and model evaluation employed in this study.
2.1. Dataset
This research employed a time series dataset, as detailed in Table 1. The dataset is divided into two
principal categories: synthetic data, which is simulated, and real-time data, which is obtained directly
from the source. This distinction is crucial for accurately interpreting the data analyzed in the study and
verifying the forecasting models [15].
‒ Synthetic: the scenario-based system generated a dataset containing CPU load, memory usage, and small
computer system interface (SCSI) disk bytes as variables collected at different times.
‒ Real-time: the available datasets contain CPU utilization, memory usage, and SCSI disk bytes
collected over one month, one week, and one day.
Synthetic data tends to yield higher prediction accuracy because it is generated to follow a desired
pattern. In contrast, data from real-time sources typically yields lower forecasting accuracy, as numerous
external factors beyond an individual's control can influence its behaviour. The dataset has the following
characteristics. First, CPU utilization of the monitored device is evaluated by examining the CPU idle
value; a low CPU idle value may indicate that the device is operating under a high load. Second, the memory
load of the monitored device is assessed through the memory-free value; a low value likewise indicates that
a high load may be placed on the device. Third, SCSI disk bytes measure the number of bytes written to the
SCSI disk on the monitored device; a high byte count may suggest that the device is busy.
Performance assessment of time series forecasting models for simple network … (Yuggo Afrianto)
1152 ISSN: 2252-8938
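The way the three indicators above are read can be sketched in a few lines of Python. This is only an illustration: the function names and the 80% thresholds are hypothetical and do not come from the study, and the inputs are assumed to be values already retrieved from the SNMP agent.

```python
def cpu_utilization(cpu_idle_percent: float) -> float:
    """CPU utilization (%) derived from the CPU idle percentage."""
    return 100.0 - cpu_idle_percent


def memory_utilization(mem_free: float, mem_total: float) -> float:
    """Memory utilization (%) derived from free and total memory (same unit)."""
    return 100.0 * (1.0 - mem_free / mem_total)


def is_high_load(cpu_idle_percent: float, mem_free: float, mem_total: float,
                 cpu_threshold: float = 80.0, mem_threshold: float = 80.0) -> bool:
    """Flag a device as loaded when either indicator crosses its threshold.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    return (cpu_utilization(cpu_idle_percent) >= cpu_threshold
            or memory_utilization(mem_free, mem_total) >= mem_threshold)
```

A low idle value or a low free-memory value therefore maps directly to a high utilization figure, which is the quantity the forecasting models are later asked to predict.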
The time-series analysis of SNMP data is described by the variable $Y$, a collection of workload values
over time t, where $y_t$ is the workload at time t. The function ƒ represents the analysis of previous
workload values used to estimate the outcome of future events. This relationship can be expressed as (1):

$$\hat{y}_{t+1} = f(y_t, y_{t-1}, \ldots, y_1) \qquad (1)$$

where $\hat{y}_{t+1}$ is the predicted workload at the next time step (t+1), ƒ is a function that uses the
previous workload values to estimate the future workload, and $y_t, y_{t-1}, \ldots, y_1$ are the workload
values at the current and previous time steps.
This approach is pertinent in the context of SNMP, where data collection is continuous. It is
noteworthy for its capacity to illustrate the evolution of the network system and the monitored devices with
greater clarity. However, it is important to recognise a potential limitation, namely that the predicted values
($\hat{y}_{t+1}$) must be validated against the actual values. This can be achieved by measuring the forecast error ($e_{t+1}$).
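As a minimal sketch of this validation step, the forecast error can be taken as the difference between the actual and the predicted value, $e_{t+1} = y_{t+1} - \hat{y}_{t+1}$. This sign convention is the common one and is assumed here, since the text does not state it. The function `forecast_next` below is a hypothetical stand-in for ƒ (a mean over the last few observations), chosen only to make the example concrete.

```python
from statistics import fmean


def forecast_next(history, window=3):
    """Hypothetical f: predict the next value as the mean of the last `window` values."""
    return fmean(history[-window:])


def forecast_error(actual, predicted):
    """Forecast error e_{t+1} = y_{t+1} - y_hat_{t+1} (assumed sign convention)."""
    return actual - predicted


def rolling_errors(series, window=3):
    """One-step-ahead errors obtained by forecasting each point from its own past."""
    errors = []
    for t in range(window, len(series)):
        y_hat = forecast_next(series[:t], window)
        errors.append(forecast_error(series[t], y_hat))
    return errors
```

Applied to a steadily rising workload, the errors stay positive, which shows that this particular ƒ lags behind a trend.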
This research utilises the same dataset employed by Kumar and Singh [10] so that the findings can be
compared with results reported in the literature. It comprises real-world web server traces, namely the
number of hypertext transfer protocol (HTTP) requests for the NASA, Calgary, and Saskatchewan servers,
which are used to predict web server workload. The experiments are performed with prediction window sizes
(PWS) of 5, 10, 20, 30, and 60 minutes.
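One plausible way to turn a raw request trace into a workload series for a given window size is to count the requests that fall into each consecutive window. The sketch below assumes the trace is available as a list of request timestamps in seconds; the function name and the input layout are illustrative and are not taken from the original experiments.

```python
from collections import Counter


def requests_per_window(timestamps_s, window_minutes):
    """Count requests in consecutive windows of `window_minutes`, starting at the first request."""
    if not timestamps_s:
        return []
    window_s = window_minutes * 60
    start = min(timestamps_s)
    # Map every timestamp to the index of the window it falls into.
    counts = Counter(int((ts - start) // window_s) for ts in timestamps_s)
    n_windows = max(counts) + 1
    # Windows without any request are kept as explicit zeros.
    return [counts.get(i, 0) for i in range(n_windows)]
```

Calling the function once per window size (5, 10, 20, 30, and 60 minutes) yields one series per experiment, each shorter and smoother than the previous one.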
2.3. Evaluation
2.3.1. Time series forecasting models
PyCaret is a machine learning framework that requires no additional components and is straightforward
to use, which makes it accessible to those new to the discipline [17]. The framework assists users at each
stage of the machine learning process, from data preparation to model analysis and execution. The primary
objective of analysing time-series data is to identify trends, which are captured by pertinent statistics
and characteristics of the data being handled. There are various methods of forecasting time series, and
this research utilised 30 models from the PyCaret library, as presented in Table 3.
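A comparison of this kind can be sketched with PyCaret's time-series module. The snippet is an assumption about how such a run might look, not the authors' script: the function name, the forecast horizon, and the fixed seed are illustrative, and the number of models that PyCaret actually ranks depends on the installed version and optional dependencies.

```python
def compare_forecasters(series, horizon=12, sort_metric="RMSE"):
    """Rank the time-series models available in PyCaret on one workload series.

    `series` is expected to be a pandas Series with a regular time index,
    for example CPU idle samples taken every five minutes.
    """
    # Imported lazily so that this module can be loaded where PyCaret is not installed.
    from pycaret.time_series import TSForecastingExperiment

    exp = TSForecastingExperiment()
    exp.setup(data=series, fh=horizon, session_id=42)  # session_id fixes the random seed
    best_model = exp.compare_models(sort=sort_metric)  # cross-validates and ranks the models
    leaderboard = exp.pull()                           # score grid of the last call as a DataFrame
    return best_model, leaderboard
```

The returned leaderboard holds one row per model with its error metrics and training time, which corresponds to the kind of ranking reported in the result tables.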
PyCaret has been applied in a wide range of studies. These include diabetes classification and
prediction [18], intrusion detection system performance analysis on the UNSW-NB15 dataset [19],
hyperparameter tuning for image classification [20], and AutoML implementation on the PowerBI
application [21]. PyCaret was also used in the evaluation of the predictive power of multiple regression
models for groundwater contamination [22], intrusion detection using supervised and unsupervised learning
methods on the CICIDS 2017 dataset [23], machine learning-based network-based intrusion detection system
(NIDS) analysis and modelling for IoT networks [24], and URL detection for phishing websites [25]. These
studies demonstrate the versatility and capabilities of PyCaret in various fields and contexts.
‒ ADF transformed: data transformed for the augmented Dickey-Fuller (ADF) test.
‒ Stationarity: conclusion whether the data is stationary.
‒ p-value: the p-value for the ADF test.
‒ ADF transformed test statistics: transformed ADF test statistics.
‒ Critical value: critical value for the ADF test at a certain significance level.
‒ Trend: whether there is a trend in the data.
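The logic behind the stationarity columns above can be illustrated with a simplified Dickey-Fuller regression written in plain Python. This is a sketch under stated assumptions: it omits the lagged difference terms of the augmented test, it compares the statistic with the asymptotic 5% critical value for a regression with a constant (about -2.86) instead of computing a p-value, and the two demonstration series are synthetic. In practice a library routine such as `adfuller` from statsmodels would normally be used.

```python
import math
import random

DF_CRITICAL_5PCT = -2.86  # asymptotic 5% critical value, regression with constant


def dickey_fuller_stat(series):
    """t-statistic of gamma in: (y_t - y_{t-1}) = alpha + gamma * y_{t-1} + error."""
    x = series[:-1]
    d = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(x)
    mean_x = sum(x) / n
    mean_d = sum(d) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxd = sum((xi - mean_x) * (di - mean_d) for xi, di in zip(x, d))
    gamma = sxd / sxx
    alpha = mean_d - gamma * mean_x
    ssr = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, d))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / std_err


def looks_stationary(series, critical=DF_CRITICAL_5PCT):
    """Reject the unit-root hypothesis when the statistic is below the critical value."""
    return dickey_fuller_stat(series) < critical


# Synthetic demonstration data: pure noise versus noise around a rising trend.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
trending = [0.5 * t + e for t, e in enumerate(noise)]
```

A strongly negative statistic, as obtained for the noise series, corresponds to a "stationary" conclusion, whereas the trending series stays above the critical value and would be reported as non-stationary with a trend.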
followed by 'linear w/ cond. Deseasonalize and detrending' and 'ridge w/ cond. Deseasonalize and detrending'
ranked 2nd and 3rd respectively.
The results in Table 6 show that, on the daily dataset, the grand means forecaster model has the lowest
RMSE and is ranked 4th, indicating good prediction performance. On the monthly dataset, the naive
forecaster model has the lowest RMSE but is ranked lower. On the synthetic dataset, AdaBoost w/ cond.
deseasonalize and detrending ranked first with the best performance. This information is essential for
selecting the best model for forecasting the availability of unused CPU capacity, which helps to plan and
manage system resources efficiently. Overall, the daily dataset with the grand means forecaster model,
which has lower RMSE, MASE, and MAE and a rank close to the synthetic ranking value, shows better CPU
forecasting performance.
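To make the two baseline models and the three metrics concrete, the following sketch implements them in plain Python. It assumes the usual definitions: the naive forecaster repeats the last observed value, the grand means forecaster repeats the mean of the training data, and MASE scales the MAE by the in-sample MAE of the one-step naive forecast. The function names and the toy numbers are illustrative only.

```python
import math


def naive_forecast(train, horizon):
    """Repeat the last observed value for every future step."""
    return [train[-1]] * horizon


def grand_means_forecast(train, horizon):
    """Repeat the mean of all training values for every future step."""
    mean = sum(train) / len(train)
    return [mean] * horizon


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mase(actual, predicted, train):
    """MAE scaled by the in-sample MAE of the one-step naive forecast."""
    scale = sum(abs(b - a) for a, b in zip(train[:-1], train[1:])) / (len(train) - 1)
    return mae(actual, predicted) / scale


def rank_by_rmse(train, test, forecasters):
    """Return (name, RMSE, MAE, MASE) rows sorted from best to worst RMSE."""
    rows = []
    for name, forecaster in forecasters.items():
        predicted = forecaster(train, len(test))
        rows.append((name, rmse(test, predicted), mae(test, predicted),
                     mase(test, predicted, train)))
    return sorted(rows, key=lambda row: row[1])


# Toy example: a short rising series split into training and test parts.
train, test = [10, 12, 14, 16], [18, 20]
ranking = rank_by_rmse(train, test, {"naive": naive_forecast,
                                     "grand_means": grand_means_forecast})
```

On a rising series the naive forecaster wins because it starts from the most recent level, while the grand mean is pulled towards older, lower values; on a series that fluctuates around a stable level the order can reverse.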
Overall, the daily dataset and the naive forecaster model performed best. This model achieved the lowest
rank value among all tested models, indicating a smaller prediction error and more accurate performance.
The analysis of this table provides a deeper understanding of which models are most effective in predicting
disk read rates at various data frequencies. However, the synthetic dataset had the highest rank value
because its prediction residuals were large compared with those of the other datasets, as shown in
Figure 5.
Our results align with existing research on frameworks for comparing the accuracy of time-series
forecasting methods. Similar to our findings, a study by Hyndman and Athanasopoulos [30] found that
gradient boosting, which is often used in machine learning competitions, achieved high accuracy, although
it was lower than that of the naive method, a simple prediction baseline. This reinforces the notion that
the ability of gradient boosting to handle complex relationships within data can be advantageous for
resource load prediction.
The gradient boosting technique was proposed as an applied gradient boosting machine, particularly
for regression and classification trees. It is rooted in the concept of “boosting”, which combines the
predictions of weak learners through additive training to develop a strong learner [31]. Gradient boosting
with conditional deseasonalization and detrending is a specific application of gradient boosting that
removes the seasonal and trend components from the time-series data before model training [32].
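The additive training idea can be sketched with decision stumps as weak learners and squared error as the loss, in which case each new stump is fitted to the current residuals. This is a didactic, single-feature sketch and not the implementation used by PyCaret; the function names, the learning rate, and the number of rounds are illustrative assumptions. For a time series, the feature could be the previous observation and the target the current one.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimises squared error on the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Additive training: every stump corrects the residuals left by the previous ones."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for x, p in zip(xs, predictions)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Toy data with a level shift that a single mean cannot describe.
xs, ys = [1, 2, 3, 4, 5, 6], [1, 1, 1, 5, 5, 5]
model = fit_gradient_boosting(xs, ys)
```

Each stump alone is a weak predictor, but the shrunken sum of many stumps approaches the two levels in the toy data, which is the "weak learners to strong learner" effect described above.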
The findings of this study further indicate that PyCaret, as an AutoML framework, simplifies the
process of selecting optimal models and tuning hyperparameters, essential for practical use in cloud data
centres. This contrasts with the manual approach in previous studies, which required model selection and
testing to be conducted individually. With AutoML capabilities, processing time and computational resources
can be optimised, enabling practitioners to efficiently select the best model without repeated adjustments to
model parameters, as discussed in the research by Westergaard et al. [12]. This information is essential for
selecting a model that is accurate in prediction and efficient in using computing resources.
Testing on the Calgary server dataset involved the same models and settings, as shown in Table 11.
For Auto ARIMA, the RMSE and MAE values increase with longer prediction windows, though the model achieves
a low MASE value of 1.51 at the 10-minute window, indicating optimal medium-term fit. The gradient boosting
model with deseasonalisation and detrending achieves lower RMSE and MAE at the shortest prediction
window of 5 minutes, underscoring its high accuracy. Its RMSE and MAE values increase slightly over
longer windows, while remaining competitive.
A comparative analysis using the Saskatchewan server dataset similarly assessed the efficacy of Auto
ARIMA and gradient boosting across four prediction windows, as shown in Table 12. For Auto ARIMA, the RMSE
and MAE values rise with window length, though MASE remains stable around 1.9, indicating consistent
performance. Gradient boosting with deseasonalisation and detrending outperforms Auto ARIMA across
almost all prediction windows, achieving significant accuracy improvements in the 5- and 60-minute windows.
In terms of overall accuracy and efficiency, gradient boosting consistently outperforms Auto
ARIMA across all datasets and prediction windows, demonstrating a superior ability to capture complex
patterns and manage variability in web server workloads. Although Auto ARIMA performs adequately in
medium-term predictions, its effectiveness diminishes in longer timeframes, as indicated by rising RMSE and
MAE values, revealing limitations in adapting to dynamic changes. Gradient boosting, on the other hand,
demonstrated greater scalability and adaptability at varying time scales, maintaining lower error metrics for
both short- and long-term predictions. Its ability to integrate deseasonalisation and detrending enhances
predictive accuracy, making it well-suited for datasets with inherent seasonality and trends. These findings
suggest that future research could focus on refining gradient boosting models further, exploring hybrid
methods, and optimising model complexity to balance accuracy with computational efficiency.
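The deseasonalisation and detrending step discussed above can be sketched as follows: a linear trend is fitted and removed, the mean of each seasonal position is removed from what is left, and forecasts made on the remainder are mapped back by adding both components again. This is a simplified, additive illustration; the "conditional" check of whether a trend or seasonality is present at all is omitted, and the function names and the seasonal period are assumptions.

```python
def fit_linear_trend(series):
    """Ordinary least squares fit of y = intercept + slope * t."""
    n = len(series)
    mean_t = (n - 1) / 2
    mean_y = sum(series) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    slope = sxy / sxx
    return mean_y - slope * mean_t, slope


def decompose(series, period):
    """Split a series into linear trend, per-position seasonal means, and remainder."""
    intercept, slope = fit_linear_trend(series)
    detrended = [y - (intercept + slope * t) for t, y in enumerate(series)]
    seasonal = [sum(detrended[i::period]) / len(detrended[i::period])
                for i in range(period)]
    remainder = [d - seasonal[t % period] for t, d in enumerate(detrended)]
    return (intercept, slope), seasonal, remainder


def recompose(trend, seasonal, remainder_values, start):
    """Add trend and seasonality back to values on the remainder scale.

    `start` is the time index of the first value, e.g. len(series) for a forecast.
    """
    intercept, slope = trend
    period = len(seasonal)
    return [r + intercept + slope * (start + h) + seasonal[(start + h) % period]
            for h, r in enumerate(remainder_values)]


# Toy series with an upward trend and a period-2 pattern.
series = [1, 1, 5, 5, 9, 9, 13, 13]
trend, seasonal, remainder = decompose(series, period=2)
rebuilt = recompose(trend, seasonal, remainder, start=0)
```

A regression model such as gradient boosting would then be trained on `remainder` only, and its forecasts passed through `recompose` with `start=len(series)` to return to the original scale.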
4. CONCLUSION
This study presents a performance assessment of time-series forecasting models for SNMP-based
hypervisor data using PyCaret. We evaluated the models on two types of datasets:
real-time data collected from a physical server environment running multiple virtual machines and synthetic
data created based on a specific workload scenario. We compared 30 different time-series forecasting models
using three evaluation metrics, namely RMSE, MASE, and MAE, together with the computation time required.
The evaluation identified the daily data as the best dataset, with the best forecasting model depending on
the resource load on the hypervisor. The naive forecaster model was the best for forecasting CPU
performance, with a runtime of 1.71 seconds, and for disk read, with a runtime of 1.90 seconds. Meanwhile,
gradient boosting w/ cond. deseasonalize and detrending was the best for the memory dataset, with a
processing time of 0.13. Based on the test results with the NASA, Calgary, and Saskatchewan server
datasets, the comparison between Auto ARIMA and gradient boosting with deseasonalize and detrending shows
that, while both models have their merits, gradient boosting with deseasonalize and detrending provides a
more robust solution for forecasting in cloud data centre environments, adapting well to the complexities
and dynamics of real-world server workloads. In the future, researchers can concentrate on enhancing and
confirming the effectiveness of our approach by using more extensive and varied datasets. The approach can
also be integrated with resource management in decision support systems for use in real-time live
migration settings in hypervisor clusters. Our research has the potential to improve the quality and
efficiency of resource management and maintenance in data centres, which can ultimately enhance the
quality of service (QoS) and service level agreements (SLA).
ACKNOWLEDGEMENTS
The authors gratefully acknowledge the financial support from DRTPM Kemdikbudristek and
Telkom University for this research under the Doctoral Dissertation Research Program scheme 2024, under
Research Contract Number 106/E5/PG.02.00.PL/2024; 043/SP2H/RT-MONO/LL4/2024; 034/LIT07/PPM-
LIT/2024.
REFERENCES
[1] M. C. Martachinnicieneait, “SNMP for cloud environment energy eficiency,” Research Square, pp. 1-27, 2021, doi:
10.21203/rs.3.rs-720622/v1.
[2] S. B. Shaw, C. Kumar, and A. K. Singh, “Use of time-series based forecasting technique for balancing load and reducing
consumption of energy in a cloud data center,” in 2017 International Conference on Intelligent Computing and Control (I2C2),
IEEE, 2017, pp. 1–6, doi: 10.1109/I2C2.2017.8321782.
[3] S. Steiner, “Harnessing data to make better-informed decisions,” Scientia, 2022, doi: 10.33548/SCIENTIA768.
[4] S. Jadon, A. Patankar, and J. K. Milczek, “Challenges and approaches to time-series forecasting for traffic prediction at data
centers,” in 2021 International Conference on Smart Applications, Communications and Networking, SmartNets 2021, IEEE,
2021, pp. 1–8, doi: 10.1109/SmartNets50376.2021.9555422.
[5] H. Jdi and N. Falih, “Comparison of time series temperature prediction with auto-regressive integrated moving average and
recurrent neural network,” International Journal of Electrical and Computer Engineering, vol. 14, no. 2, pp. 1770–1778, Apr.
2024, doi: 10.11591/ijece.v14i2.pp1770-1778.
[6] V. Savchenko et al., “Network traffic forecasting based on the canonical expansion of a random process,” Eastern-European
Journal of Enterprise Technologies, vol. 3, no. 2–93, pp. 60–69, 2018, doi: 10.15587/1729-4061.2018.131471.
[7] W. Yoo and A. Sim, “Time-series forecast modeling on high-bandwidth network measurements,” Journal of Grid Computing,
vol. 14, no. 3, pp. 463–476, 2016, doi: 10.1007/s10723-016-9368-9.
[8] S. N. Wahyuni, E. Sediono, I. Sembiring, and N. N. Khanom, “Comparative analysis of time series prediction model for
forecasting covid-19 trend,” Indonesian Journal of Electrical Engineering and Computer Science, vol. 28, no. 1, pp. 600–610,
Oct. 2022, doi: 10.11591/ijeecs.v28.i1.pp600-610.
[9] P. Shastri, B. R. Dawadi, and S. R. Joshi, “Intelligent approach to switch replacement planning for internet service provider
networks,” Sustainable Futures, vol. 2, 2020, doi: 10.1016/j.sftr.2020.100036.
[10] J. Kumar and A. K. Singh, “Performance assessment of time series forecasting models for cloud datacenter networks’ workload
prediction,” Wireless Personal Communications, vol. 116, no. 3, pp. 1949–1969, 2021, doi: 10.1007/s11277-020-07773-6.
[11] M. Ali, “PyCaret: an open source, low-code machine learning library in python,” ACS Omega, vol. 6, p. 6791−6797, 2021.
[12] G. Westergaard, U. Erden, O. A. Mateo, S. M. Lampo, T. C. Akinci, and O. Topsakal, “Time series forecasting utilizing
automated machine learning (autoML): a comparative analysis study on diverse datasets,” Information, vol. 15, no. 1, 2024, doi:
10.3390/info15010039.
[13] S. Maurya and S. Singh, “Time series analysis of the covid-19 datasets,” in 2020 IEEE International Conference for Innovation in
Technology, INOCON 2020, IEEE, 2020, pp. 1–6, doi: 10.1109/INOCON50539.2020.9298390.
[14] J. Siebert, J. Groß, and C. Schroth, “A systematic review of packages for time series analysis †,” Engineering Proceedings, vol. 5,
no. 1, 2021, doi: 10.3390/engproc2021005022.
[15] V. Cerqueira, L. Torgo, and I. Mozetič, “Evaluating time series forecasting models: an empirical study on performance estimation
methods,” Machine Learning, vol. 109, pp. 1997–2028, 2020, doi: 10.1007/s10994-020-05910-7.
[16] T. Lindgren and O. Steinert, “Low dimensional synthetic data generation for improving data driven prognostic models,” in 2022
IEEE International Conference on Prognostics and Health Management, ICPHM 2022, IEEE, 2022, pp. 173–182, doi:
10.1109/ICPHM53196.2022.9815660.
[17] Y. K. Phua, T. Fujigaya, and K. Kato, “Predicting the anion conductivities and alkaline stabilities of anion conducting membrane
polymeric materials: development of explainable machine learning models,” Science and Technology of Advanced Materials, vol.
24, no. 1, 2023, doi: 10.1080/14686996.2023.2261833.
[18] P. Whig, K. Gupta, N. Jiwani, H. Jupalle, S. Kouser, and N. Alam, “A novel method for diabetes classification and prediction
with pycaret,” Microsystem Technologies, vol. 29, no. 10, pp. 1479–1487, 2023, doi: 10.1007/s00542-023-05473-2.
[19] Abdullah, F. B. Iqbal, S. Biswas, and R. Urba, “Performance analysis of intrusion detection systems using the pycaret machine
learning library on the unsw-nb15 dataset,” B.Sc. Thesis, Department of Computer Science and Engineering, Brac University,
Dhaka, Bangladesh, 2021.
[20] K. Arai, J. Shimazoe, and M. Oda, “Method for hyperparameter tuning of image classification with pycaret,” International Journal
of Advanced Computer Science and Applications, vol. 14, no. 9, pp. 276–282, 2023, doi: 10.14569/IJACSA.2023.0140930.
[21] D. J. C. Sihombing, J. U. Dexius, J. Manurung, M. Aritonang, and H. S. Adinata, “Design and analysis of automated machine
learning (autoML) in powerbi application using pycaret,” in 2022 International Conference of Science and Information
Technology in Smart Administration, ICSINTESA 2022, 2022, pp. 89–94, doi: 10.1109/ICSINTESA56431.2022.10041543.
[22] T. Huynh, H. Mazumdar, H. Gohel, H. Emerson, and D. Kaplan, “Evaluating the predictive power of multiple regression models
for groundwater contamination using pycaret,” in WM2023 Conference, Arizona, USA: WM Symposia, Inc., 2023, pp. 1–15.
[23] S. Krsteski, M. Tashkovska, B. Sazdov, L. Radojichikj, A. Cholakoska, and D. Efnusheva, “Intrusion detection with supervised
and unsupervised learning using pycaret over cicids 2017 dataset,” in Artificial Intelligence Application in Networks and Systems,
2023, pp. 125–132, doi: 10.1007/978-3-031-35314-7_12.
[24] M. Karanfilovska, T. Kochovska, Z. Todorov, A. Cholakoska, G. Jakimovski, and D. Efnusheva, “Analysis and modelling of a
ml-based nids for iot networks,” in Procedia Computer Science, 2022, pp. 187–195, doi: 10.1016/j.procs.2022.08.023.
[25] P. Rani, “PyCaret based url detection of phishing websites,” Turkish Journal of Computer and Mathematics Education
(TURCOMAT), vol. 11, no. 1, pp. 908–915, 2020, doi: 10.17762/turcomat.v11i1.13589.
[26] J. E. Monogan, “Time series analysis,” in Political Analysis Using R, Springer, Cham, 2015, pp. 157–186, doi: 10.1007/978-3-
319-23446-5_9.
[27] M. Mahan, C. Chorn, and A. P. Georgopoulos, “White noise test: detecting autocorrelation and nonstationarities in long time series
after arima modeling,” in Proceedings of the 14th Python in Science Conference, 2015, doi: 10.25080/Majora-7b98e3ed-00f.
[28] S. N. Rao, G. Shobha, S. Prabhu, and N. Deepamala, “Time series forecasting methods suitable for prediction of cpu usage,” in
CSITSS 2019 - 2019 4th International Conference on Computational Systems and Information Technology for Sustainable
BIOGRAPHIES OF AUTHORS
Prof. Rendy Munadi has been a lecturer at the Faculty of Electrical Engineering
at Telkom University (formerly STT Telkom), Bandung, Indonesia, since 1993. At the master's
level he teaches wireless sensor networks, data and protocol networks, broadband
networks, and internet of things; at the undergraduate level he teaches traffic
engineering, future new network, and seminar proposals. His current research concerns
the manufacture of a hemoglobin measuring device using an internet of things (IoT)-based
machine learning method, in partnership with Hasan Sadikin Hospital. His university
service since 1998 includes Head of Academic Administration (formerly STT Telkom),
Head of the Telecommunication Engineering Study Program in 2005-2006, and Vice
Chancellor for Academic Affairs in 2006-2010. Since 2018, he has served as an assessor
for FTE lecturer certification. Currently, he is a senior lecturer in the network cyber
management (NCM) expertise group. He can be contacted at
email: [email protected].