Cloud computing has become a cornerstone of modern infrastructure, providing scalable and
flexible solutions for businesses. However, managing cloud resources efficiently while
maintaining cost-effectiveness remains a significant challenge. This project aims to optimize
cloud resource allocation using machine learning models and AWS infrastructure. By
leveraging Python-based data analysis and machine learning techniques, such as Isolation
Forest for anomaly detection, the study identifies inefficiencies in cloud resource usage and
suggests optimizations to enhance performance.
The dataset used for analysis consists of cloud resource metrics, including CPU usage,
memory consumption, disk throughput, and network activity. These metrics were
preprocessed through outlier detection and feature scaling to ensure data quality. The models
were trained to predict anomalies and resource demands, providing real-time insights into
cloud infrastructure usage. AWS services, including EC2, S3, and CloudWatch, were
employed for continuous monitoring and data storage, creating a robust framework for
managing cloud resources in real time.
The machine learning models demonstrated high accuracy in detecting anomalies, which
contributed to more efficient cloud resource management. The integration with AWS allowed
for seamless scalability and automation of resource provisioning, reducing operational costs
while improving system performance. The project's contributions include an enhanced
approach to cloud resource management, leveraging AI for anomaly detection and real-time
monitoring, with potential applications for large-scale cloud environments.
Introduction
However, as cloud environments become more complex and workloads fluctuate, managing
resources efficiently becomes increasingly challenging. Cloud service providers (CSPs) must
ensure that their resources, such as CPU, memory, and storage, are adequately allocated to
avoid resource wastage and minimize operational costs. Resource allocation is a critical
function in cloud computing, affecting system performance, energy consumption, and user
satisfaction. Traditional resource allocation strategies often rely on static provisioning, which
can lead to either over-provisioning or under-provisioning of resources, both of which have
negative financial and performance consequences (Al-Asaly, Hassan, & Alsanad, 2019).
In recent years, the integration of artificial intelligence (AI) and machine learning (ML)
techniques, particularly reinforcement learning (RL), has shown great promise in optimizing
resource allocation. Reinforcement learning algorithms, such as Q-learning and Deep Q-
Networks (DQN), enable cloud systems to make intelligent decisions about resource
distribution based on workload patterns and predicted future demands. By learning from past
actions and continuously improving their strategies, RL-based approaches can significantly
enhance cloud resource efficiency (Chen et al., 2021). This study explores how intelligent
resource allocation can be enhanced using these AI techniques, focusing on improving
performance, efficiency, and cost-effectiveness in cloud environments.
Cloud computing has changed the way organizations and individuals store, manage, and process data. It offers an agile and elastic computing platform in which resources are allocated on demand for immediate use, saving organizations from making huge investments in expensive hardware and infrastructure. The ability to access computing resources immediately from anywhere in the world makes cloud computing a prime constituent of today's digital economy. According to Belgacem et al. (2022), cloud computing environments, in particular IaaS, are attractive because they give users powerful computation, storage, and networking, making them important for both small businesses and large enterprises.
However, as cloud environments grow more complex and their workloads vary, managing them efficiently becomes ever more challenging. To keep wastage of resources such as CPU, memory, and storage at bay, a CSP must manage its resources carefully so that operational costs stay as low as possible. Resource allocation is a core function in cloud computing, directly affecting system performance, energy consumption, and user satisfaction. Traditional resource allocation relies mainly on static provisioning, which leads to over-provisioning or under-provisioning that is not only financially costly but also degrades system performance (Al-Asaly, Hassan, & Alsanad, 2019). Recent years have also seen the mainstreaming of AI and ML techniques, such as reinforcement learning, for optimizing resource allocation. For example, Q-learning and Deep Q-Network techniques enable cloud systems to make smart decisions about resource distribution based on workload patterns and predicted future demands. RL-based approaches can significantly enhance the efficiency of cloud resources by learning from past experiences and improving their strategies accordingly (Chen et al., 2021). This research evaluates how these AI methods can make intelligent resource allocation more effective and improve the performance, efficiency, and cost-effectiveness of cloud computing.
1. Exploring how reinforcement learning algorithms, such as Q-learning and DQN, can dynamically and effectively allocate cloud resources based on real-time workload patterns.
2. Developing predictive models using neural networks, such as Long Short-Term Memory (LSTM) and Convolutional Neural Networks (CNN), for resource demand prediction and resource distribution optimization in cloud systems.
1. How effective are reinforcement learning algorithms, such as Q-learning and DQN, in
optimizing resource allocation in cloud computing environments?
2. Can neural networks, particularly LSTM and CNN models, accurately predict
resource demand in cloud systems?
4. What are the key factors that influence the success of AI-driven resource allocation in
cloud environments?
Hypotheses:
H2: Neural networks (LSTM, CNN) can accurately forecast workload patterns and optimally distribute resources to improve the performance of the cloud system.
H4: The success of AI-driven resource allocation relies on workload variation, the
scalability of the system, and adaptation in real-time.
Data Privacy: The data that has been accrued for this study, including both publicly
available datasets and cloud environments, shall be anonymized to ensure privacy
protection of the users. No personal or sensitive information shall be included in the
dataset, and all analysis will be conducted on aggregated data.
Literature Survey
2.1 Introduction
Cloud computing represents a fundamental transformation in the way resources are managed and distributed over the internet. However, with increasing complexity in cloud workloads and dynamic changes in resource requirements, the challenge for CSPs becomes substantial. This has led to the development of intelligent resource allocation strategies, with wide adoption of machine learning (ML) and AI techniques, to overcome traditional static allocation methods, which generally fail to meet the dynamic requirements of scalable applications in cloud environments. This chapter therefore reviews the literature on resource allocation strategies in cloud computing, focusing on challenges, AI-based solutions, and novel methodologies for optimizing resource usage in cloud environments.
Methodology
3.1 Introduction
This chapter describes how resource allocation in the cloud can be improved using machine learning and reinforcement learning models. The strategy integrates multiple algorithms: K-Means for clustering and Isolation Forest for anomaly detection, together with deep learning techniques such as LSTM and CNN models. It also explains the experimentation in simulated scenarios using OpenAI Gym and in real scenarios on Amazon Web Services (AWS). On this foundation, the approach advances the optimization of cloud resource allocation and provisioning, not only through workload demand prediction and anomaly detection but also through real-time resource provisioning.
CPU cores
These features provide a comprehensive view of cloud resource usage and help train the
machine learning models to make informed decisions on resource allocation.
Imputation techniques replace missing data and inconsistent records in the dataset. Records with large spans of missing values are dropped, while smaller gaps are filled with the column mean or median. This procedure guarantees that no critical information is lost and that the overall integrity of the dataset is maintained.
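The drop-then-impute step above can be sketched as follows; the column names and values are illustrative stand-ins, not the project's actual schema:

```python
import numpy as np
import pandas as pd

# Illustrative metrics frame with gaps (column names are assumptions).
df = pd.DataFrame({
    "cpu_usage":    [55.0, np.nan, 61.0, np.nan, 58.0],
    "memory_usage": [70.0, 72.0, np.nan, np.nan, 71.0],
})

# Drop records that are almost entirely missing (large gaps)...
df = df.dropna(thresh=len(df.columns) - 1)

# ...and fill the remaining small holes with each column's median.
df = df.fillna(df.median())
print(df)
```

With the sample data, the all-missing record is dropped and the remaining gaps are replaced by the column medians, so no complete record is lost.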
Z-score-based outlier detection is employed to identify and remove extreme values that
could distort the models' learning process. The formula used for detecting outliers is:
Z = (X − μ) / σ

Where:
Z is the Z-score,
X is the data point,
μ is the mean of the feature, and
σ is its standard deviation.
Outliers are defined as data points where the absolute Z-score exceeds a threshold of 3. These
outliers are removed to improve model robustness.
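A minimal sketch of this filtering step, using an invented CPU series with one extreme spike:

```python
import numpy as np

# 20 typical CPU readings plus one extreme spike (illustrative values).
cpu = np.array([50.0] * 10 + [52.0] * 10 + [500.0])

# Z-score of each point relative to the series mean and std.
z = (cpu - cpu.mean()) / cpu.std()

# Keep only points with |Z| <= 3, as in the preprocessing step above.
clean = cpu[np.abs(z) <= 3]
print(len(cpu) - len(clean), "outlier(s) removed")
```

The spike at 500 sits more than three standard deviations from the mean and is dropped, while the typical readings survive.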
3.3.3 Scaling and Normalization
Data scaling brings all features into the same range so that no single feature dominates the model's learning process. MinMaxScaler is used to normalize the data between 0 and 1:
Xscaled = (X − Xmin) / (Xmax − Xmin)
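The scaler can be applied as below; the input column is a toy example:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# A single toy feature column (illustrative values).
X = np.array([[10.0], [20.0], [30.0]])

scaler = MinMaxScaler()               # scales each feature to [0, 1]
X_scaled = scaler.fit_transform(X)
print(X_scaled.ravel())               # [0.0, 0.5, 1.0]
```

Each value maps to (X − Xmin) / (Xmax − Xmin), so the minimum becomes 0 and the maximum becomes 1.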
The heart of the project is the development of machine learning models to optimize dynamic cloud resource allocation. The models developed include:
Reinforcement learning (RL) enables the cloud system to learn optimal resource allocation
strategies based on feedback from the environment. Two primary RL approaches are used:
Q-Learning
Q-learning is a model-free RL algorithm that learns a policy by estimating Q-values for each
state-action pair. The Q-values are updated using the Bellman update rule:

Q(s, a) ← Q(s, a) + α [ r + γ max_{a′} Q(s′, a′) − Q(s, a) ]

Where:
Q(s, a) is the estimated value of taking action a in state s,
α is the learning rate,
r is the reward,
γ is the discount factor, and
s′ and a′ are the next state and the next action.
Q-learning helps the system allocate resources efficiently by learning from the feedback it
receives for each action.
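A compact tabular sketch of this update loop, on a toy three-state load model whose dynamics and rewards are invented for illustration (the project's actual environment is far richer):

```python
import random

# Toy environment: 3 load states, 2 actions (0 = scale down, 1 = scale up).
N_STATES, N_ACTIONS = 3, 2
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2          # learning rate, discount, exploration

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def step(state, action):
    # Invented reward: scaling up pays off in the high-load state 2,
    # scaling down pays off in the low-load state 0.
    reward = 1.0 if (state == 2 and action == 1) or (state == 0 and action == 0) else -0.1
    next_state = random.randrange(N_STATES)  # workload drifts randomly
    return next_state, reward

random.seed(0)
state = 0
for _ in range(5000):
    # Epsilon-greedy action selection.
    if random.random() < EPS:
        action = random.randrange(N_ACTIONS)
    else:
        action = max(range(N_ACTIONS), key=lambda a: Q[state][a])
    next_state, r = step(state, action)
    # Bellman update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    Q[state][action] += ALPHA * (r + GAMMA * max(Q[next_state]) - Q[state][action])
    state = next_state

print("best action in high-load state:", max(range(N_ACTIONS), key=lambda a: Q[2][a]))
```

After training, the learned Q-table prefers scaling up under high load and scaling down under low load, mirroring how feedback shapes the allocation policy.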
The project uses predictive analytics through neural network models to forecast workload fluctuations and adjust resource allocation proactively.
LSTM is a model that captures temporal dependencies in sequential data, making it well suited to predicting workload peaks in upcoming periods from historical resource usage patterns. LSTM cells contain input gates, forget gates, and output gates that regulate the flow of information while retaining the important data over long periods.
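Before an LSTM can learn these temporal dependencies, the usage history has to be cut into fixed-length input windows with next-step targets. A minimal sketch of that preparation step (the series values and window size are invented; an actual Keras/PyTorch LSTM layer would consume these windows reshaped to (samples, timesteps, features)):

```python
import numpy as np

def make_windows(series, window):
    """Turn a 1-D usage series into (samples, window) inputs and next-step targets."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])
    return X, y

# Hypothetical hourly CPU readings (illustrative values).
cpu = [40, 42, 45, 60, 80, 75, 50, 44]
X, y = make_windows(cpu, window=3)
print(X.shape, y.shape)   # (5, 3) (5,)
```

Each row of X holds three consecutive hours and the corresponding y entry is the hour that follows, which is exactly the supervised framing the forecasting models train on.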
CNNs are used to discover spatial features in the data, such as correlations between CPU and memory usage. The CNN architecture consists of convolutional layers that extract features and pooling layers that reduce dimensionality.
The K-Means clustering algorithm is used to find patterns in the resource usage data. It is an unsupervised learning algorithm that clusters data points on the basis of their similarity, allowing the system to group similar resource usage patterns together. The Elbow method is used to find the appropriate number of clusters.
The K-Means algorithm minimizes the within-cluster sum of squares (WCSS), defined as:
WCSS = ∑_{i=1}^{k} ∑_{x∈Cᵢ} ∥ x − μᵢ ∥²

Where:
x is a data point,
Cᵢ is the i-th cluster,
μᵢ is the centroid of cluster Cᵢ, and
k is the number of clusters.
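The Elbow method can be sketched by fitting K-Means for increasing k and tracking the WCSS (exposed by scikit-learn as `inertia_`); the two synthetic usage regimes below stand in for real metric data:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Two synthetic usage regimes: low-load and high-load points (illustrative).
low  = rng.normal(loc=[20, 30], scale=2, size=(50, 2))
high = rng.normal(loc=[80, 90], scale=2, size=(50, 2))
X = np.vstack([low, high])

# Elbow method: WCSS for increasing numbers of clusters.
wcss = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
        for k in range(1, 6)]
print([round(w) for w in wcss])   # WCSS drops sharply at k = 2
```

The sharp drop from k = 1 to k = 2, followed by only marginal gains, is the "elbow" that identifies the proper cluster count.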
The Isolation Forest algorithm can be used to identify anomalies in resource usage, which
might point to inefficient utilization of resources or potential systemic problems. The
algorithm applies unsupervised learning to isolate anomalies by recursively partitioning the
data. Anomalous data points are those that deviate substantially from the rest of the data and require fewer splits to isolate.
The anomaly score of each data point is determined by the average path length from the root of the tree to that point; shorter paths indicate stronger anomalies.
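A minimal sketch of this detector on synthetic metric data (the points and the `contamination` setting are illustrative, not the project's tuned values):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Mostly normal CPU/memory points plus a few extreme ones (synthetic).
normal = rng.normal(loc=[50, 60], scale=5, size=(200, 2))
spikes = np.array([[99, 99], [5, 95], [95, 5]])
X = np.vstack([normal, spikes])

clf = IsolationForest(contamination=0.02, random_state=0).fit(X)
labels = clf.predict(X)            # -1 = anomaly, 1 = normal
print((labels == -1).sum(), "anomalies flagged")
```

The extreme points need far fewer random splits to isolate, so they receive the shortest average path lengths and are the ones labeled -1.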
A custom cloud environment is designed using OpenAI Gym to simulate resource allocation scenarios. The environment mimics real-world cloud infrastructure conditions in which workloads vary dynamically and the RL agent must make real-time decisions to optimize resource allocation.
State Space: This includes CPU usage, memory usage, disk throughput, and network
bandwidth. These metrics demonstrate the state of the cloud system in real time and
are utilized by the RL model for decisions.
Action Space: The agent can scale resources up, scale them down, or keep provisioning unchanged. These actions affect the provisioning of cloud resources.
Reward Function: The agent receives positive rewards for optimal resource usage and negative rewards when over-provisioning or under-provisioning occurs.
The environment runs multiple episodes, where the RL agent learns through trial and error,
improving its ability to allocate resources efficiently over time.
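The state/action/reward design above can be sketched as a tiny environment following the Gym-style `reset`/`step` interface; the single "capacity vs. demand" state, the reward shape, and the dynamics are all invented simplifications of the custom environment described here:

```python
import random

class CloudEnv:
    """Minimal Gym-style sketch of the resource-allocation environment.
    States, actions, and rewards are simplified illustrations."""

    ACTIONS = ("scale_down", "hold", "scale_up")

    def reset(self):
        self.capacity = 5                    # provisioned units
        self.demand = random.randint(1, 10)  # current workload demand
        return (self.capacity, self.demand)

    def step(self, action):
        if action == 0:
            self.capacity = max(1, self.capacity - 1)
        elif action == 2:
            self.capacity += 1
        # Reward optimal usage; penalize over-provisioning (idle cost)
        # and under-provisioning (SLA risk) in proportion to the gap.
        gap = self.capacity - self.demand
        reward = 1.0 if gap == 0 else -abs(gap) * 0.1
        # Demand drifts between episodes' steps.
        self.demand = max(1, min(10, self.demand + random.choice((-1, 0, 1))))
        return (self.capacity, self.demand), reward, False, {}

random.seed(1)
env = CloudEnv()
state = env.reset()
total = 0.0
for _ in range(100):
    action = random.randrange(3)             # random policy for the demo
    state, reward, done, _ = env.step(action)
    total += reward
print("return under random policy:", round(total, 1))
```

A trained RL agent would replace the random policy here, and its return over episodes would rise as it learns to keep capacity tracking demand.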
3.5.2 AWS Cloud Platform Setup
After verification in the simulated environment, the models are deployed on AWS so that their performance can be tested on real cloud infrastructure.
Amazon EC2: Different types of resource demands are simulated, and EC2 instances are provisioned; the RL model dynamically adjusts the number and configuration of instances according to workload predictions.
Amazon S3: S3 is used for storing the experimental data and the output of the
models. It provides scalable storage for logs and resource metrics.
The models could then be tested in real time, validating that the developed algorithms handle real-time workload fluctuations and scale well.
Resource Utilization: This metric assesses how effectively the models utilize CPU, memory, and network resources. Higher utilization rates indicate that idle resources are minimized while the system's needs are still satisfied.
Response Time: The models are tested on how quickly they respond to changes in workload demand, so that resources scale in real time without causing service slowdowns.
These metrics provide a comprehensive assessment of the models' ability to optimize cloud
resource allocation in both simulated and real-world environments.
4.1 Introduction
Cloud computing has been offering scalable and flexible infrastructures for modern
applications. But effective control over cloud resources requires real-time monitoring and optimization for performance and cost. A fundamental aspect of this analysis is the correlation between cloud metrics such as CPU usage, memory usage, and network throughput, with the aim of enhancing performance while reducing costs. In this chapter, I will describe how I applied Python-based exploratory data
analysis (EDA) to recognize the usage patterns and anomalies of resources with the usage of
AWS services such as EC2 and CloudWatch in monitoring and controlling the cloud
infrastructure. The use of machine learning models with tools to monitor cloud infrastructure
identified inefficiencies and provided actionable insight towards optimizing cloud resources.
The initial EDA step analyzed cloud resource utilization as a function of time. Examining CPU, memory, disk throughput, and network throughput over time gave a qualitative picture of how these resources are consumed, revealing the general patterns and peak periods of resource consumption.
It contains several plots showing the usage of various cloud resources over time.
CPU usage (MHz), in blue, shows a mostly stable pattern with occasional spikes during very high workloads.
Memory usage (KB), displayed in green, fluctuates more than CPU utilization, meaning that memory demand rises and falls depending on the time of day.
Disk read throughput (KB/s) and disk write throughput (KB/s) are shown in red and purple, respectively; the lines indicate frequent spikes during significant periods of disk I/O activity.
To understand the interrelations between the various cloud resource metrics, I generated a
pair plot. A pair plot is useful in visualizing the pairwise relationships between variables, in
which any possible correlations and clusters may be identified.
The pair plot visualizes the pairwise relationships between the various cloud resource
metrics, including CPU usage, memory usage, disk throughput, and network throughput.
Off-diagonal scatter plots compare two metrics and show how they may interact. For instance, the scatter plot of CPU usage vs. network throughput helps show how these two metrics move together during high-demand periods.
Diagonal plots show the distribution of each individual metric, indicating the spread of values for each resource metric.
This visualization is significant because it can reveal trends in resource use that are not easily discerned from raw data. For example, the pair plot showed a positive correlation between CPU usage and network throughput, meaning that higher CPU usage often accompanies high data transfer.
Figure 4.3: Anomaly Detection for CPU, Memory, and Disk Throughput
This scatter plot highlights the anomalies detected in CPU usage, memory usage, and disk
throughput across the dataset.
Normal resource usage is represented by blue dots, whereas red dots show anomalous data points. These anomalies are deviations from average usage patterns and could indicate future performance bottlenecks or unusual behavior in the cloud environment.
The most dramatic anomalies appear in CPU usage and disk throughput, where some peak values in the dataset lie far outside the normal operating range.
Memory usage also exhibited some anomalies, though fewer compared to CPU and disk.
To get a clear view of how resources are interlinked, I used a correlation heatmap. This visualization makes it possible to determine whether strong or weak correlations exist between the metrics involved, indicating how different resources are interrelated in terms of usage.
Figure 4.4: Correlation Heatmap of Resource Metrics
This heatmap presents the correlation coefficients between key cloud metrics such as CPU
usage, memory usage, disk throughput, and network throughput.
There is a positive correlation between CPU usage and network throughput. This indicates that when the system is under high CPU workloads, it tends to transmit and receive data at a high rate simultaneously.
There is also a positive correlation between the rates of disk read throughput and disk
write throughput; this is also expected because most operations will simultaneously read
from and write to the disk.
There is a weak correlation between memory usage and network throughput, which also
implies that such metrics are more or less independent of each other in most cases.
This kind of correlation analysis is useful for finding potential resource optimizations. For example, a strong correlation between network activity and CPU usage suggests allocating more CPU resources when network traffic increases, so that the system does not become a bottleneck.
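The correlation matrix behind such a heatmap can be computed directly with pandas; the synthetic series below are invented stand-ins in which network throughput is built to co-move with CPU while memory stays independent:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 500
cpu = rng.normal(50, 10, n)
# Network throughput constructed to track CPU; memory kept independent
# (synthetic stand-ins for the project's metrics).
net = 2.0 * cpu + rng.normal(0, 5, n)
mem = rng.normal(60, 8, n)

df = pd.DataFrame({"cpu": cpu, "net": net, "mem": mem})
corr = df.corr()
print(corr.round(2))
```

Passing `corr` to a heatmap function (e.g. seaborn's `heatmap`) produces the figure described above; the strong cpu-net coefficient and near-zero cpu-mem coefficient mirror the relationships observed in the real data.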
4.3 Cloud Infrastructure Monitoring Using AWS
In this section, I explain how AWS services provide cloud infrastructure monitoring. AWS offers a variety of tools, such as EC2, S3, and CloudWatch, that allow real-time monitoring, scaling, and management of cloud resources. These tools were used to monitor performance within cloud instances, ensuring optimal utilization of the assigned resources while keeping backups of project data in a safe place.
AWS EC2 or Elastic Compute Cloud: It is the heart of my cloud computing architecture. It
provides virtual computing resources or instances that can be expanded and decreased in
response to requirements. The EC2 dashboard gives a unified view of all instances running
within the AWS environment.
This is the EC2 Dashboard, which summarizes all running instances, their status, health checks, and availability zones. It serves as the launching point for managing my cloud infrastructure: from this dashboard I can monitor instance performance, launch new instances, and terminate existing ones.
Setup for an EC2 instance is one of the most important tasks in creating any cloud
infrastructure. In fact, the instances may be configured with various types of virtual hardware
based on application requirements.
Figure 4.6: Launch Instance Configuration
This figure illustrates the steps involved in launching a new EC2 instance.
The instance type was selected as t2.micro, which is part of the AWS free tier and
suitable for low-demand workloads.
Security groups were configured to allow SSH access and Jupyter Notebook access
from specific IP addresses.
After the creation of an EC2 instance, the AWS tools are used for continuous monitoring. The
AWS EC2 instance dashboard offers real-time metrics including CPU usage, disk throughput,
and network performance.
Figure 4.7: EC2 Instance Dashboard
This figure depicts the instance dashboard through which live data pertaining to CPU,
network, and disk usage of an instance could be monitored. Monitoring the above metrics
helps understand the performance of an instance under different workloads and adjust the
resources accordingly.
Figure 4.8: EC2 Instance Details
This graphic shows detailed information about an EC2 instance, such as the instance ID, availability zone, public IP address, and security group settings. This information is important for diagnosing problems and configuring the instance correctly according to project requirements.
One of the key features of EC2 is the ability to connect to instances via SSH (Secure Shell).
This allows remote access to the instance for performing administrative tasks, running
applications, and monitoring performance.
This is the terminal view after connecting to the Ubuntu-based EC2 instance. Once securely connected, I installed the necessary libraries, configured the environment, and ran the Python scripts written for data analysis.
This graphic shows how to connect to the instance using EC2 Instance Connect, an interface that makes it easy to connect without a preconfigured SSH key pair.
Once inside the EC2 instance, I installed Jupyter Notebook for Python programming and data analysis. Jupyter Notebook provides an interactive environment for running Python code, visualizing data, and documenting analysis.
Figure 4.11: Installing Jupyter on EC2
This terminal output shows the installation of Jupyter Notebook on the EC2 instance, which
was then used for executing Python scripts and visualizing cloud resource data.
Figure 4.12: Jupyter Dashboard
This is the Jupyter Notebook Dashboard, which gave me access to all available notebooks so that I could run Python code directly on the EC2 instance. It was the main interface through which I conducted my data analysis tasks.
This is the CloudWatch dashboard, which monitors major metrics such as CPU, disk throughput, and network activity. These metrics help in optimizing resource usage so that the infrastructure stays efficient.
To ensure that project data is securely stored and accessible, I used Amazon S3 for data
backup. S3 provides durable, scalable, and low-cost object storage.
This figure illustrates the creation of an S3 bucket for storing project data. The bucket name
and region are specified to comply with geographical regulations and security requirements.
Figure 4.15: Default Encryption Settings
This is an example of encryption profiles applied to S3 by default to ensure that any data
uploaded here becomes encrypted, hence safeguarded against unauthorized access.
After uploading the project data to the S3 bucket, this figure demonstrates how files can be
securely stored and accessed from the cloud.
This is the terminal output of the backup: my Jupyter notebook files were uploaded to the S3 bucket. The entire project dataset is therefore continuously backed up and accessible at any time.
Figure 4.18: My Project Dashboard
Figure: The My Project dashboard on S3, where all project files, datasets, and notebooks are safely stored for later reference.
Cloud resources could be managed efficiently through this combination of Python-based analysis and AWS cloud monitoring. Python visualizations helped gain insight into resource usage patterns, while AWS services offered real-time monitoring and secure data storage. This combination provided a multifaceted understanding of cloud performance, improving both resource optimization and cost efficiency.
Conclusion
Cloud computing infrastructures have become vital, and they require efficient resource management and monitoring systems to reduce cost and optimize performance. This project analyzed cloud resource usage, detected anomalies, and optimized cloud performance through machine learning models developed in Python, with cloud infrastructure monitoring provided by AWS services including EC2, S3, and CloudWatch. In this chapter, I summarize the accomplishments of the project, review the quality of the dataset, evaluate model performance, and discuss contributions as well as future directions for improving cloud management.
2. Anomaly Detection: Using machine learning to recognize anomalies in the data, the project found examples of resource usage anomalies that, if trends continued, might eventually cause system crashes or performance issues. Isolation Forest performed well in highlighting anomalies across all metrics (CPU, memory, and disk throughput), supporting more reasoned decisions about scaling and optimization opportunities.
3. Cloud Infrastructure Monitoring: AWS services such as EC2, S3, and CloudWatch were integrated into the system to monitor resources and store data in real time. These services continually provided insight into the health and performance of the cloud service, enabling proactive use of resources.
4. Optimization of Cloud Resources: The project showed how machine learning and
cloud services could synergistically optimize resource utilization such that the cloud
resources would be scaled well in response to demand while saving costs.
Overall, the project successfully achieved its objectives of enhancing cloud resource
management through the combined use of machine learning models and AWS services.
1. Handling Missing Values: The dataset contained some missing values, which were handled through deletion and imputation techniques. This ensured that results were not skewed by missing data points, which would otherwise compromise the integrity of the machine learning models.
2. Outlier Detection and Removal: I used Z-score analysis to detect and eliminate outliers. These were extreme values lying far beyond the dataset mean that could have prevented the model from learning normal resource usage patterns. Their removal ensured that the training data was clean, improving the accuracy of anomaly detection.
4. Data Quality: The overall dataset was comprehensive and gave a strong base for the project. Preprocessing, including handling missing values, removing outliers, and feature scaling, further improved the quality of the dataset, ensuring reliable model training and analysis.
5.3 Model Training and Evaluation
The most important contributions of this project were developing and training machine learning models for anomaly detection and for optimizing cloud resource usage. The key model used was the Isolation Forest for anomaly detection. Training and evaluation were central to ascertaining the model's accuracy in real-world cloud environments.
1. Model Selection: The model selected for the anomaly detection is Isolation Forest as
it effectively identifies outliers and unusual data points. This model isolates anomalies
by randomly selecting features and splitting data points to construct a tree. Anomalies
are points that require fewer splits to be isolated, thus this model fits quite well with
cloud resource monitoring.
2. Training Process: The model was trained on the preprocessed dataset, which carried multiple cloud metrics such as CPU, memory, and network throughput. By training on these features, the model learned the normal behavior of cloud resources, enabling effective anomaly detection.
3. Evaluation Metrics: The standard metrics used to evaluate the model were precision, recall, accuracy, and F1-score. These metrics provided a comprehensive view of how accurately the model detected anomalies while avoiding false positives (wrongly flagging normal activity as anomalous) and false negatives (failing to catch actual anomalies). The model showed good precision and recall scores, meaning it was effective at identifying abnormal resource usage patterns.
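These four metrics can be computed with scikit-learn; the label vectors below are invented examples (1 = anomaly, 0 = normal), not the project's actual results:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical ground truth vs. model predictions (1 = anomaly, 0 = normal).
y_true = [0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
y_pred = [0, 0, 1, 1, 1, 0, 0, 0, 0, 1]

print("accuracy :", accuracy_score(y_true, y_pred))    # correct / total
print("precision:", precision_score(y_true, y_pred))   # TP / (TP + FP)
print("recall   :", recall_score(y_true, y_pred))      # TP / (TP + FN)
print("f1       :", f1_score(y_true, y_pred))          # harmonic mean of P and R
```

In this toy case there are 3 true positives, 1 false positive, and 1 false negative, giving precision and recall of 0.75 each; precision penalizes the false alarm, recall penalizes the missed anomaly.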
4. Real-Time Monitoring and Feedback: The model was integrated with AWS CloudWatch to receive real-time feedback about resource usage. When an anomaly was detected, an alert was generated so that any performance issue could be corrected promptly. This is essential for maintaining the efficiency of the cloud infrastructure.
Overall, the model training and evaluation were successful, demonstrating the ability to
detect anomalies in cloud resource usage and improve cloud performance through proactive
management.
5.4 Contributions and Future Directions
The contributions of this project have a high impact on the management and optimization of cloud resources. By combining machine learning techniques with real-time monitoring of cloud infrastructure, it demonstrated a scalable solution for detecting inefficiencies and ensuring cost-effective operation of the cloud.
3. Scalability and Automation: The machine learning models developed in this project can be scaled to larger datasets and can handle even more complex cloud environments. Additionally, because AWS services keep most cloud management tasks (such as scaling up and down, taking backups, and anomaly detection) automated, human intervention is drastically reduced.
4. Future Directions: There are several avenues for future work based on the findings
of this project. Future efforts could focus on:
[2]. Al-Asaly, M.S., Hassan, M.M. and Alsanad, A., 2019. A cognitive/intelligent resource
provisioning for cloud computing services: opportunities and challenges. Soft Computing, 23,
pp.9069-9081.
[3]. Alyas, T., Ghazal, T.M., Alfurhood, B.S., Issa, G.F., Thawabeh, O.A. and Abbas, Q.,
2023. Optimizing Resource Allocation Framework for Multi-Cloud Environment. Computers,
Materials & Continua, 75(2).
[4]. Belgacem, A., 2022. Dynamic resource allocation in cloud computing: analysis and
taxonomies. Computing, 104(3), pp.681-710.
[5]. Belgacem, A., Beghdad-Bey, K., Nacer, H. and Bouznad, S., 2020. Efficient dynamic
resource allocation method for cloud computing environment. Cluster Computing, 23(4),
pp.2871-2889.
[6]. Belgacem, A., Mahmoudi, S. and Kihl, M., 2022. Intelligent multi-agent
reinforcement learning model for resources allocation in cloud computing. Journal of King
Saud University-Computer and Information Sciences, 34(6), pp.2391-2404.
[7]. Beloglazov, A., Abawajy, J. and Buyya, R., 2012. Energy-aware resource allocation
heuristics for efficient management of data centers for cloud computing. Future Generation
Computer Systems, 28(5), pp.755-768.
[8]. Calheiros, R.N., Ranjan, R. and Buyya, R., 2011, September. Virtual machine
provisioning based on analytical performance and QoS in cloud computing environments. In
2011 International Conference on Parallel Processing (pp. 295-304). IEEE.
[9]. Chen, Z., Hu, J., Min, G., Luo, C. and El-Ghazawi, T., 2021. Adaptive and efficient
resource allocation in cloud datacenters using actor-critic deep reinforcement learning. IEEE
Transactions on Parallel and Distributed Systems, 33(8), pp.1911-1923.
[10]. Ebadi, M.E., Yu, W., Rahmani, K.R. and Hakimi, M., 2024. Resource Allocation in
The Cloud Environment with Supervised Machine learning for Effective Data Transmission.
Journal of Computer Science and Technology Studies, 6(3), pp.22-34.
[11]. Gai, K., Qiu, L., Zhao, H. and Qiu, M., 2016. Cost-aware multimedia data allocation
for heterogeneous memory using genetic algorithm in cloud computing. IEEE Transactions on
Cloud Computing, 8(4), pp.1212-1222.
[12]. Ghelani, D., 2024. Optimizing Resource Allocation: Artificial Intelligence Techniques
for Dynamic Task Scheduling in Cloud Computing Environments. International Journal of
Advanced Engineering Technologies and Innovations, 1(3), pp.132-156.
[13]. Goswami, M.J., 2020. Leveraging AI for Cost Efficiency and Optimized Cloud
Resource Management. International Journal of New Media Studies: International Peer
Reviewed Scholarly Indexed Journal, 7(1), pp.21-27.
[14]. Hameed, A., Khoshkbarforoushha, A., Ranjan, R., Jayaraman, P.P., Kolodziej, J.,
Balaji, P., Zeadally, S., Malluhi, Q.M., Tziritas, N., Vishnu, A. and Khan, S.U., 2016. A
survey and taxonomy on energy efficient resource allocation techniques for cloud computing
systems. Computing, 98, pp.751-774.
[15]. Hassan, K.M., Abdo, A. and Yakoub, A., 2022. Enhancement of health care services
based on cloud computing in IOT environment using hybrid swarm intelligence. IEEE
Access, 10, pp.105877-105886.
[16]. Kamble, T., Deokar, S., Wadne, V.S., Gadekar, D.P., Vanjari, H.B. and Mange, P.,
2023. Predictive Resource Allocation Strategies for Cloud Computing Environments Using
Machine Learning. Journal of Electrical Systems, 19(2).
[17]. Karamthulla, M.J., Malaiyappan, J.N.A. and Tillu, R., 2023. Optimizing Resource
Allocation in Cloud Infrastructure through AI Automation: A Comparative Study. Journal of
Knowledge Learning and Science Technology ISSN: 2959-6386 (online), 2(2), pp.315-326.
[18]. Madni, S.H.H., Abd Latiff, S.I.M., Coulibaly, Y. and Abdulhamid, S.I.M., 2016. An
appraisal of meta-heuristic resource allocation techniques for IaaS cloud.
[19]. Madni, S.H.H., Latiff, M.S.A., Coulibaly, Y. and Abdulhamid, S.I.M., 2017. Recent
advancements in resource allocation techniques for cloud computing environment: a
systematic review. Cluster Computing, 20, pp.2489-2533.
[20]. Mohamed, Y.A. and Mohamed, A.O., 2022, July. An Approach to Enhance Quality of
Services Aware Resource Allocation in Cloud Computing. In International Conference on
Information Systems and Intelligent Applications (pp. 623-637). Cham: Springer
International Publishing.
[21]. Naha, R.K., Garg, S., Chan, A. and Battula, S.K., 2020. Deadline-based dynamic
resource allocation and provisioning algorithms in fog-cloud environment. Future Generation
Computer Systems, 104, pp.131-141.
[22]. Nzanywayingoma, F. and Yang, Y., 2017. Efficient resource management techniques
in cloud computing environment: Review and discussion. Telkomnika, 15(4), pp.1917-1933.
[23]. Qawqzeh, Y., Alharbi, M.T., Jaradat, A. and Sattar, K.N.A., 2021. A review of swarm
intelligence algorithms deployment for scheduling and optimization in cloud computing
environments. PeerJ Computer Science, 7, p.e696.
[24]. Rajawat, A.S., Goyal, S.B., Kumar, M. and Malik, V., 2024. Adaptive resource
allocation and optimization in cloud environments: Leveraging machine learning for efficient
computing. In Applied Data Science and Smart Systems (pp. 499-508). CRC Press.
[25]. Sharkh, M.A., Jammal, M., Shami, A. and Ouda, A., 2013. Resource allocation in a
network-based cloud computing environment: design challenges. IEEE Communications
Magazine, 51(11), pp.46-52.
[26]. Sharma, S. and Rawat, P.S., 2024. Efficient resource allocation in cloud environment
using SHO-ANN-based hybrid approach. Sustainable Operations and Computers, 5, pp.141-
155.
[27]. Sharma, S., 2022. An Investigation into the Optimization of Resource Allocation in
Cloud Computing Environments Utilizing Artificial Intelligence Techniques. Journal of
Humanities and Applied Science Research, 5(1), pp.131-140.
[28]. Sheeba, A., Gupta, B., Malathi, L. and Saravanan, D., 2023. Swarm intelligence
optimization for resource allocation in cloud computing environments. ICTACT Journal on
Soft Computing, 13(4).
[29]. Shukur, H., Zeebaree, S., Zebari, R., Zeebaree, D., Ahmed, O. and Salih, A., 2020.
Cloud computing virtualization of resources allocation for distributed systems. Journal of
Applied Science and Technology Trends, 1(2), pp.98-105.
[30]. Sindhu, V. and Prakash, M., 2022. Energy-efficient task scheduling and resource
allocation for improving the performance of a cloud–fog environment. Symmetry, 14(11),
p.2340.
[31]. Sonkar, S.K. and Kharat, M.U., 2016, November. A review on resource allocation and
VM scheduling techniques and a model for efficient resource management in cloud
computing environment. In 2016 International Conference on ICT in Business Industry &
Government (ICTBIG) (pp. 1-7). IEEE.
[32]. Su, Y., Bai, Z. and Xie, D., 2021. The optimizing resource allocation and task
scheduling based on cloud computing and Ant Colony Optimization Algorithm. Journal of
Ambient Intelligence and Humanized Computing, pp.1-9.
[33]. Tang, H., Li, C., Bai, J., Tang, J. and Luo, Y., 2019. Dynamic resource allocation
strategy for latency-critical and computation-intensive applications in cloud–edge
environment. Computer Communications, 134, pp.70-82.
[34]. Thein, T., Myo, M.M., Parvin, S. and Gawanmeh, A., 2020. Reinforcement learning
based methodology for energy-efficient resource allocation in cloud data centers. Journal of
King Saud University-Computer and Information Sciences, 32(10), pp.1127-1139.
[35]. Tsai, J.T., Fang, J.C. and Chou, J.H., 2013. Optimized task scheduling and resource
allocation on cloud computing environment using improved differential evolution algorithm.
Computers & Operations Research, 40(12), pp.3045-3055.
[36]. Vinothina, V.V., Sridaran, R. and Ganapathi, P., 2012. A survey on resource allocation
strategies in cloud computing. International Journal of Advanced Computer Science and
Applications, 3(6).
[37]. Yakubu, I.Z., Aliyu, M., Musa, Z.A., Matinja, Z.I. and Adamu, I.M., 2021. Enhancing
cloud performance using task scheduling strategy based on resource ranking and resource
partitioning. International Journal of Information Technology, 13(2), pp.759-766.
[38]. Younis, M.F., 2024. Enhancing Cloud Resource Management Based on Intelligent
System. Baghdad Science Journal, 21(6), pp.2156-2156.
[40]. Zhao, M. and Wei, L., 2024. Optimizing Resource Allocation in Cloud Computing
Environments using AI. Asian American Research Letters Journal, 1(2).