Received August 22, 2020, accepted September 8, 2020, date of publication September 14, 2020, date of current version October 1, 2020.
Digital Object Identifier 10.1109/ACCESS.2020.3024113

Modeling an Optimized Approach for Load Balancing in Cloud
MUHAMMAD JUNAID 1, ADNAN SOHAIL 1, RAO NAVEED BIN RAIS 2,
ADEEL AHMED 3, (Graduate Student Member, IEEE), OSMAN KHALID 4,
IMRAN ALI KHAN 4, SYED SAJID HUSSAIN 4, AND NAVEED EJAZ 1
1 Department of Computing, Iqra University, Islamabad 46000, Pakistan
2 Department of Electrical and Computer Engineering, College of Engineering and Information Technology, Ajman University, Ajman, United Arab Emirates
3 Department of Computer Science, Quaid-i-Azam University, Islamabad 45320, Pakistan
4 Department of Computer Science, COMSATS University Islamabad, Abbottabad 22060, Pakistan

Corresponding author: Rao Naveed Bin Rais ([email protected])


This work was supported in part by Ajman University, United Arab Emirates, through the Deanship of Graduate Studies and
Research (DGSR).

ABSTRACT Despite significant infrastructure improvements, cloud computing still faces numerous challenges in terms of load balancing. Several techniques have been applied in the literature to improve load balancing efficiency. Recent research has shown that load balancing techniques based on metaheuristics provide better solutions for proper scheduling and allocation of resources in the cloud. However, most of the existing approaches consider only a single or a few QoS metrics and ignore many important factors. The performance efficiency of these approaches is further enhanced by merging them with machine learning techniques. Such hybrids combine the relative benefits of a load balancing algorithm with powerful machine learning models such as Support Vector Machines (SVM). In the cloud, data exists in huge volume and variety that requires extensive computation for its accessibility, and hence performance efficiency is a major concern. To address such concerns, we propose a load balancing algorithm, namely Data Files Type Formatting (DFTF), that utilizes a modified version of Cat Swarm Optimization (CSO) along with SVM. First, the proposed system classifies data in the cloud from diverse sources into various types, such as text, images, video, and audio, using one-to-many SVM classifiers. Then, the data is input to the modified load balancing algorithm CSO, which efficiently distributes the load on VMs. Simulation results compared to existing approaches showed improved performance in terms of throughput (7%), response time (8.2%), migration time (13%), energy consumption (8.5%), optimization time (9.7%), overhead time (6.2%), SLA violation (8.9%), and average execution time (9%). These results outperformed the existing baselines used in this research, namely CBSMKC, FSALB, PSO-BOOST, IACSO-SVM, CSO-DA, and GA-ACO.

INDEX TERMS Classification, cloud, SVM, load balancing, metaheuristics, virtual machine.

I. INTRODUCTION
Over the years, an increase in online applications has resulted in huge volumes of data accumulated daily. Generally, the data is classified into different types, such as audio, video, image, and text. Despite the significant evolution of cloud computing to handle such diverse data, it still faces numerous challenges in real-time processing and in load balancing of the resources employed to process mega volumes of data.
In the past few years, several load balancing approaches have been developed for cloud computing, such as [1]–[5]. For instance, in [1], the authors applied the Bin-packing algorithm for multi-capacity Bin-packing to improve task waiting time and the degree of imbalance on cloud resources. In a similar work [2], the authors used the Bin-packing algorithm for cost-aware and fragmentation-enabled consolidation of tasks to achieve minimum energy consumption. In [3], the authors used a dynamic clustering algorithm to improve throughput and execution time. A study by [4] applied a dynamic real clustering algorithm to achieve geographical load balancing in the cloud, resulting in better throughput and response time. In [5], the authors applied adaptive load balancing to achieve optimal resource provisioning, resulting in better resource utilization and throughput.

The associate editor coordinating the review of this manuscript and approving it for publication was Adnan Shahid.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
173208 VOLUME 8, 2020
M. Junaid et al.: Modeling an Optimized Approach for Load Balancing in Cloud

However, most of the traditional load balancing approaches suffer from high computational cost, high energy consumption, several overheads, and scalability and deadline constraints. In recent years, the research trend has shifted towards metaheuristics-based approaches for load balancing, as these techniques are better at addressing flexibility, multimodal optimization, efficient randomization, and discontinuous problems through intensification (exploitation) and diversification (exploration). The authors in [6] presented a metaheuristic approach for load balancing using modified PSO in which they minimized task overhead and maximized resource utilization over varying VMs and tasks. In [7], the authors combined ACO and PSO in a hybrid metaheuristic load balancer, ACOPS, which uses historical information to predict the future workload of the VMs in the cloud. This approach helps in reducing computational time while keeping optimum load balancing among VMs and tasks. It also helps in finding local and global best positions in the solutions with fast convergence, and hence performs better than many heuristic approaches. Similarly, in [8], the authors employed SVM with the ACO metaheuristic for cloud load balancing to achieve better throughput, SLA, migration, overhead, and optimization, but the approach lacks other critical factors such as energy consumption, response time, and execution time. Likewise, most of the existing metaheuristic techniques cover either one or a few optimization parameters and ignore other critical factors that can play a pivotal role in achieving multi-factor optimization [9]. Moreover, the issues faced in the cloud due to load balancing can be further minimized by combining metaheuristics with data mining techniques to solve complex optimization problems more efficiently [10]–[13]. Nowadays, Cloud Data Mining (CDM) is gaining popularity, in which machine learning models, such as supervised machine learning, are integrated with cloud load balancing approaches, resulting in new efficient algorithms [14], [15]. Similarly, the classification of multiple file types in the cloud can achieve improved load balancing with increased accuracy due to the pre-assignment and categorization of data for virtual machines with different resources. For instance, audio exists in various forms such as noise, speech, silence, and music, and its classification can achieve performance efficiency using deep learning algorithms such as the Convolutional Neural Network (CNN) [16]. Similarly, video datasets need proper categorization and automatic classification for quick retrieval and indexing. This helps in understanding the semantic gap to minimize computational complexity. Integrated metaheuristic algorithms such as ACO, ABC, and PSO are used in several ways to attain more accuracy in video classification using classifiers such as SVM, KNN, and NN [17]–[19]. It has been observed that a huge increase in text documents is making the extraction process quite complex. Text clustering is used in text mining to categorize text documents but cannot perform text feature selection [20]. Metaheuristic algorithms and classifiers such as GA, HS, PSO, NN, and SVM are widely used for text classification in high-dimensional space, providing high accuracies [21]–[24]. Image classification involves the selection of image feature subsets from a large feature space. Selecting optimum features is a complex process in image classification for load balancing, which is solved by several hybrid metaheuristic techniques that pair CSO, GA, ACO, or PSO with SVM, NN, or K-NN [25]–[28]. Despite several advantages, the aforementioned approaches have certain deficiencies, and therefore there is still a need for multi-factor optimal solutions for load balancing.
Our proposed work focuses on the development of a new load balancing algorithm named Data Files Type Formatting (DFTF) that combines SVM (a machine learning classifier) with a modified Cat Swarm Optimization (CSO) algorithm (a scheduling algorithm). The proposed DFTF algorithm considers multi-factor QoS metrics, such as energy consumption, response time, SLA violations, migration time, optimization time, execution time, throughput time, and overhead time, as performance evaluation measures. In this work, SVM is applied using one-to-many classification to generate the data class over a set of file formats, such as audio, video, text, and images, in the cloud environment. The classification process can reduce such complexities by performing offline preprocessing and making the data available in processed form [8]. This refined form of data, when applied to load balancing, can significantly improve scheduling using QoS parameters [29].
The original CSO is more suitable for a small population size with a minimum number of iterations and hence does not provide good solutions where the processing involves a large number of complex tasks [30], [31]. This drawback eventually leads CSO to fall into a local optimum, which takes more iterations in exploring the solution space and hence makes CSO computationally complex [32]. Therefore, we have modified the original CSO by introducing a new grouping phase that places the data files into four groups (audio, video, image, and text) taken from the SVM output, keeping in view the properties associated with each group. In doing so, the population of cats in the sub-groups is sorted and, later in the stage, the cat with the best fitness value in the local best solution is selected. Integrating the two approaches, SVM and CSO, into a single combined model thereby addresses their individual limitations and reinforces their combined benefits. Earlier data type approaches such as AWS and PostgreSQL focus only on data type classification and not on file format types [33]. In contrast, the proposed approach uses file format types for classification in the cloud environment and then feeds the resultant data class into the modified CSO algorithm for load balancing. This combination has outperformed the state-of-the-art metaheuristic load balancing algorithms used in this study, namely CBS-MKC [34], FSALB [35], PSO-BOOST [36], IACSO-SVM [37], CSO-DA [38], and GA-ACO [39].
The main objective of this research is to propose a new optimized metaheuristic algorithm, DFTF, that performs classification and load balancing effectively in a cloud.
This optimization model addresses the limitations of the earlier load balancing approaches through its multi-factor approach. Further, the contributions of this paper include:
• A new algorithm, DFTF, is developed based on SVM and modified CSO that provides better-optimized load balancing in the cloud environment.
• The classification of data file formats into audio, video, image, and text is performed in a cloud environment and shows improved confusion-matrix measures such as accuracy, precision, recall, and F-measure over state-of-the-art classifiers, helping to decrease computational complexity later in the scheduling phase.
• The proposed DFTF model has provided improved results for energy consumption, response time, SLA violations, migration time, optimization time, execution time, throughput time, and overhead time as performance evaluation measures.

The rest of this paper is organized as follows. A literature review is presented in Section 2, the proposed methodology is discussed in Section 3, the experimental setup is described in Section 4, results and analysis are discussed in Section 5, and conclusions are presented in Section 6.

II. LITERATURE REVIEW
Load balancing algorithms are classified as dynamic, static, or hybrid, depending on the machine state. They are also known as allocation and scheduling algorithms based on the features used during load balancing. Further, they are categorized as Cloud Data mining load balancing, VM load balancing, CPU-based load balancing, Task-based load balancing, Server-based load balancing, Network-based load balancing, and Standard Cloud load balancing based on their combination. Numerous studies discussed the limitations of load balancing algorithms in order to propose more effective methods. These studies lack discussion of some essential QoS metrics, such as migration duration, migration expense, service quality breach, task failure rate, algorithmic efficiency, percentage of load balancing measures, and level of balance. A load balancing algorithm must improve responsiveness, implementation cost, implementation time, throughput, fault sensitivity, migration duration, makespan, resource throughput, and usability. At the same time, energy consumption, carbon pollution, relocation costs, energy efficiency, and SLA need more consideration [40]–[42]. It has been observed that algorithm complexity, which reduces the efficiency of load balancing, is not given much consideration. The studies also concluded that several issues remain a huge challenge in load balancing that can be addressed in the future by implementing an adequate, effective, and robust load balancing algorithm. A decline in these dimensions leads to poor QoS at Cloud Service Centers and a decreased economy for CSPs. However, keeping QoS and economics in consideration, delivering optimized multi-factor QoS-based solutions is becoming a major challenge for CSPs. Some of these challenges, which mainly revolve around our developed QoS metrics, are discussed in the sections below.
Cloud computing is making significant contributions in extracting useful information when combined with data mining techniques. This combination, called cloud data mining (CDM), has eased information retrieval from a huge volume and variety of data with the help of load balancing approaches. Similarly, several real-time cloud data mining frameworks, algorithms, and services are available that provide information through a number of applications [43]. These real-time cloud data mining and load balancing applications include VM task classification for load balancing, feature extraction, anomaly and intrusion detection, open shop scheduling, attribute importance, spatial classifications, data analysis and satellite imagery [44], spectral and statistical data analysis [45], gene expression data mining and bioinformatics [46], geo-spatial analysis and geo-informatics, large-scale mining in big data and web mining [47], machine-learning applications [48], high-dimensional data mining [49], highly diversified and dense data mining in rule mining [46], the security of data in the cloud, clustering, datacenter resource optimization in the cloud, noise removal, the reactive power problem, face recognition, biomedical image processing, teaching-based learning, manufacturing design, the water resource problem, and routing optimization.
This research is mainly focused on classification, specifically the combination of a classifier with a load balancer in the cloud. So, most of the presented information revolves around classification and load balancing. In classification, data is assigned to appropriate classes using supervised machine learning techniques. Classification comprises many techniques such as SVM, decision tree, Bayesian classifier, NN, belief networks, and rule-based classifiers. Cloud data mining classification techniques include K-NN, Point data, NB, GA, and their hybrids such as NB with SVM, GA with SVM, ACO with SVM, ACO with NN, and GA with K-NN. These combinations help in getting the highest accuracies in classification and further reduce computational complexity in load balancing over a number of applications.
In [50], the authors proposed a hybrid metaheuristic algorithm called WOA-AEFS to solve the resource scheduling problem in cloud computing. This study has two scheduling approaches that consider makespan and cost. The algorithm outperformed other metaheuristic algorithms such as the original BAT and PSO but did not consider factors such as execution time and performance efficiency. The Extended BAT Algorithm (EBA) is suggested by [51]; it modifies three benchmark functions (Ackley, Hyperellipsoid, and Rosenbrock), resulting in better performance in searching for optimum solutions, fitness function, and convergence rate. The algorithm outperformed other metaheuristics, such as MMBO and MBO-FS, in the same cloud, but computational complexity is a major concern. A study by [52] suggested the Gravitational Search Algorithm (GSA) for load balancing in the cloud, where it proved strong convergence
over a set of iterations. The drawback of this algorithm is that it is computation-intensive, which affects its scalability. A study by [53] proposed the hybrid IRRO-CSO, in which IRRO is inspired by the perception of information flow in raven social behavior among members searching for food, while CSO is based on chicken behavior during the search for food. The algorithm has been validated against the CEC 2017 benchmark functions, showing better performance than BAT, PSO, and CSO. However, the performance, including any improvement in execution time, needs to be checked on large real-time datasets.
In [82], a modified Heterogeneous Earliest Finish Time (MHEFT) algorithm is proposed for dynamic load balancing in the cloud. The algorithm works well under a smaller number of tasks and did not consider other QoS metrics for performance evaluation. CEGA is a genetic-inspired balancing algorithm designed to meet deadline constraints while reducing the execution time of the tasks [83]. Results of CEGA have shown better performance on the same workflows, but the algorithm suffers from exponential time complexity. A variation of CSO called Improved Cat Swarm Optimization (ICSO) is proposed by [79]. Here, the first modification improves the tracing mode by changing the position and velocity equations. The other modification makes changes in such a way that local optima are prevented. However, the algorithm is only tested on benchmark functions with fewer tasks, and a range of QoS metrics is not considered. In a study by [84], the authors developed a hybrid metaheuristic algorithm, HBMMO, for workflow scheduling in cloud computing to improve throughput in the cloud. The algorithm takes a multi-objective approach covering the quantization of execution cost, throughput, and makespan. Despite the many factors discussed in the study, energy consumption is not considered. In [85], Weighted Wavelet SVM (WW-SVM) is suggested for estimating load sequences in a data center using a cloud computing environment. For parameter selection and optimization, PSO is used to make a final prediction. The proposed algorithm outperforms other baselines in terms of execution time, throughput, and error prediction. The algorithm considers only prediction and accuracy, while even simple multi-objective settings are not discussed.
An improved treatment of SLA violations for load balancing in the cloud, with the objective of minimum resource wastage, is discussed by [86]. The algorithm used optimum resources, with a lower failure rate in fulfilling tasks, low energy usage, and the least SLA violations, by up to 17%. The algorithm did not consider execution time and numerous scientific evaluations. Sharma et al. proposed an SLA agile-based VM to reduce response time [87]. This research used a ghost VM to reduce the VM creation time by approximately 12%. Static workloads are used, and performance metrics are not discussed. In [88], comprehensive comparisons of SLA violations in a cloud environment are performed for five metaheuristic algorithms in which QoS constraints and penalty costs are considered. Experiments on benchmark functions have shown the better performance of each metaheuristic in best, worst, average, and SD scenarios. However, multi-objective QoS metrics are not considered.
A dynamic VM migration algorithm, MMA, suitable for High-Performance Computing (HPC), is suggested by [89]. The MMA algorithm minimizes the load on overloaded machines and reduces communication costs. However, saturation can cause performance degradation and computational complexity. A study by [90] discussed a VM migration strategy that provides better scalability than other strategies. In this approach, a composite scoring function is used to find the host that has workload handling capability. At this stage, migration takes place and load is transferred to that host. However, the approach incurs high computation cost and overhead and lacks a multi-objective formulation. An improvement in reduced overhead using a metaheuristic automatic power-aware algorithm in VANETs is presented by [91]. This research uses PSO to achieve energy-efficient communication. However, a performance degradation of 8% is observed and the computation cost is high. The Biological Inspired Self Organized Autonomous Routing Protocol (BIOSARP) is proposed by [92] for the earliest searching of a neighbor using an optimal decision by ACO. This helps in reducing overload to a significant extent, but the algorithm has a relatively higher overhead cost.
A study by [93] discussed load balancing in multi-core clusters using frequent data mining in the cloud. In their work, SDFEM is proposed, which provides high mining performance in large, complex, real-time data analytic applications. They used a hybrid approach of OpenMP and MPI and tested their implementation on 12-core shared memory nodes. The results have shown a remarkable increase in performance, being much faster and more reliable. However, this combination has some complexities, especially computational and memory complexities. In [49], the authors suggested a pattern mining load balancing technique for high-dimensional data, PaMPa-HD, based on Map-reduce. The algorithm performed well in terms of robustness and execution time due to its inherent properties of better mining patterns and the least amount of transactions. However, there are a large number of items per transaction. The authors in [94] presented the metaheuristic EELBF Firefly load balancing algorithm in the cloud, which focuses on throughput and response time. Besides finding relational mining models, this algorithm provides better energy consumption by balancing workload across multiple VMs (considering less loaded and highly loaded VMs). The algorithm has been implemented in CloudSim 4.0 and compared with ACO, HBB, and WRRLB, and overall better performance is observed. A study by [95] presented an energy-efficient load balancing algorithm that uses the combination of the BWM and TOPSIS methodologies for a multi-objective mining approach. The selection of the most appropriate cloud scheduling solutions is performed in two steps, in which initially a decision criterion is defined, followed by BWM for assigning weights, and then TOPSIS is applied

TABLE 1. Cloud data mining using load balancing algorithms.

to measure the performance of each alternative. Experiments have shown that the proposed algorithm has attained better results in terms of makespan, energy consumption, and VM utilization over other baselines. However, this study did
not consider large-scale datacenters, which would demonstrate the scalability and reliability of the proposed solution over a larger number of tasks. In [96], the authors discussed a PSO-based load balancing algorithm used for resource allocation in cloud computing. The algorithm handles task initiation overload on VMs through optimized migration transfers to other VMs. As a result, the algorithm achieved reduced execution time and transfer time. However, this algorithm considers only a few tasks based on a single factor, thereby not addressing scalability issues. A study by [97] presented a new adaptive integrated approach based on best-worst decision making and the ranking method called VIKOR, which is used to define task priorities. This algorithm uses a compromise approach in which group benefits are maximized over individual losses. The algorithm provides better reliability by keeping all VMs in the process during runtime. Further, the algorithm achieves better throughput, reduced makespan, improved waiting time, more virtual machine (VM) utilization, and less VM usage cost when compared with other baselines. However, a maximum of 1000 tasks is considered for various QoS performance metrics, which means that scalability may be an issue when tasks are significantly increased along with VMs.
It is observed from the studies presented here that no comprehensive multi-factor approach has been adopted that optimizes the QoS metrics without affecting the quality of the solution.

III. PROPOSED METHODOLOGY
We combined SVM and CSO into a hybrid model called DFTF with the objective of improving load balancing and performance in the cloud environment. The architecture of the proposed DFTF approach is shown in Figure 1. This architecture is divided into two main modules: 'Data Classification based on SVM' and 'Load Balancing using CSO'. The input to the data classification module is the collection of diverse data in the form of video, text, audio, and images, which are stored in the cloud environment. The classification module takes the input data randomly and then performs the classification on these data using polynomial SVM. The output of this module is the partitioned data class. The second module performs load balancing using Cat Swarm Optimization (CSO). The performance analysis of the proposed model is then performed to achieve efficient load balancing by considering parameters such as execution time, number of migrations, optimization time, throughput time, and overhead time. The tools used in this research are CloudSim 4.0 and the Java environment.
Algorithm 1 describes the process of the proposed model called DFTF. In this algorithm, Lines 1 to 11 perform data categorization, which first classifies the type of data and then classifies the type of VMs using SVM and assigns the data to the particular class. Lines 12 to 32 perform load balancing using CSO and then output the scheduled data.

A. DATA CLASSIFICATION BASED ON SUPPORT VECTOR MACHINE
We collected the data from different cloud sources and then preprocessed the data to transform it as per our model requirements. The data collected from the cloud comprises video, audio, text, and images. These data sets are diverse and are of different sizes. In the proposed model, the SVM classifier first determines the type of data (audio, video, text, image) based on features and then classifies the data by assigning it to a particular class.
We have divided the VMs into four types of sets, namely AudioVM, VideoVM, TextVM, and ImageVM, based on input data. Each set of VMs has different processing and storage resources in a cloud environment. More precisely, each machine (VM) is assigned a task based on task requirements. For example, video tasks require 1000 floating-point operations and 16GB memory, audio tasks require 800 floating-point operations and 12GB memory, image tasks require 800 floating-point operations and 8GB memory, and textual tasks require 400 floating-point operations and 4GB memory. After that, the SVM classifier identifies the VM type (VideoVM, AudioVM, TextVM, or ImageVM) based on the requirements, size, and features of the tasks. Here the respective VM is assigned to each task. Hence, SVM is intended to classify data and match it to the most suitable class type and VM type.
For video data classification, we extracted feature vectors from sequences of 40 frames taken from four different video classes, giving a 40 × 4096 matrix in which each row holds the features of one frame (one frame per row), and we classified videos among these four classes. We preprocess a new video to limit its number of frames and then extract features from this video to classify it. Assume that we have four video classes (ci, i = 1, ..., 4). Each video has 40 frames (n = 1, ..., 40), and from each frame we extracted 4096 features ([1 × 4096]). Since each frame has enough information to predict the video class (ci), we used 40 frames from each video as training/test samples, which creates an input matrix of [160 × 4096] dimensions, with 160 samples of 4096 features each. Additionally, we have created an output vector [160 × 1] that contains the label of each class ci = i, where i = 1, ..., 4.
For audio data classification, four feature sets are evaluated for identifying five kinds of audio classes: classical music, popular music, crowd noise, speech, and simple noise. The feature sets include low-level signal properties, mel-frequency spectral coefficients [98], and two new sets based on perceptual models of hearing. For image classification, we have considered 256 × 256 pixels (65,536 pixels in total). We used each pixel as a feature in the SVM classifier.
For text classification, there are text documents of about 6GB, which are extracted in the form of unstructured text. We performed stemming and stop word removal and extracted the words in the form of features. We then used these features for text data classification using SVM.
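
The video sample layout described in the data classification subsection can be made concrete with a short sketch. It is only an illustration under stated assumptions: the feature values are random placeholders rather than real frame descriptors, one 40-frame sequence per class is assumed (which is what the 160-row total implies), and the names `build_video_dataset`, `NUM_CLASSES`, `FRAMES_PER_VIDEO`, and `FEATURES_PER_FRAME` are hypothetical. It shows how the per-frame rows stack into the [160 × 4096] input matrix and the [160 × 1] label vector.

```python
import random

NUM_CLASSES = 4            # video classes c_1..c_4 (from the text)
FRAMES_PER_VIDEO = 40      # frames kept per video (from the text)
FEATURES_PER_FRAME = 4096  # features extracted per frame (from the text)


def build_video_dataset(seed=0):
    """Stack per-frame feature rows and labels for all classes.

    Feature values are random placeholders (assumption); a real pipeline
    would insert the descriptors extracted from each frame instead.
    """
    rng = random.Random(seed)
    samples, labels = [], []
    for class_id in range(1, NUM_CLASSES + 1):
        for _ in range(FRAMES_PER_VIDEO):
            samples.append([rng.random() for _ in range(FEATURES_PER_FRAME)])
            labels.append(class_id)  # label c_i = i, one per frame
    return samples, labels


samples, labels = build_video_dataset()
print(len(samples), len(samples[0]))  # rows and columns of the input matrix
print(len(labels))                    # length of the label vector
```

Each frame becomes one training/test sample, so the row count is the number of classes multiplied by the frames per video.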

FIGURE 1. Proposed architecture of DFTF.
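
A minimal sketch of the hand-off between the two modules in Figure 1, assuming the classifier has already produced a data class for each task. The per-type resource figures are the ones quoted in the methodology text; the names `VmRequirement`, `VM_REQUIREMENTS`, and `route_task` are hypothetical, and the dictionary lookup merely stands in for the SVM-driven VM-type assignment, which is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmRequirement:
    vm_set: str     # name of the VM set that serves this data class
    flop: int       # floating-point operations quoted in the text
    memory_gb: int  # memory in GB quoted in the text


# Values taken from the text; the table structure itself is illustrative.
VM_REQUIREMENTS = {
    "video": VmRequirement("VideoVM", 1000, 16),
    "audio": VmRequirement("AudioVM", 800, 12),
    "image": VmRequirement("ImageVM", 800, 8),
    "text": VmRequirement("TextVM", 400, 4),
}


def route_task(data_class):
    """Map a classified task to its VM set and resource needs."""
    try:
        return VM_REQUIREMENTS[data_class]
    except KeyError:
        raise ValueError(f"unknown data class: {data_class!r}") from None


for data_class in ("video", "text"):
    req = route_task(data_class)
    print(data_class, "->", req.vm_set, req.flop, req.memory_gb)
```

Keeping the requirements in one table makes it easy to see that the classification output alone determines which VM set receives a task before the load balancer distributes tasks within that set.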

SVM works on the principle of linear classification with a special type of rule that generates classes with effective performance and is based on the quality of classification. The kernel trick can be used to construct a special kind of non-linear method using SVM. Two classic kernel functions are used in SVMs: the radial basis function kernel and the polynomial kernel. Here, u_i denotes a support vector, α_i the Lagrange multiplier, and u_j the label of the membership class (+1, −1), where n = 1, 2, 3, ..., N.

Equation (1) shows the polynomial function,
\[ POLY(u, v) = \left(u^{k} v + 1\right)^{s}, \tag{1} \]
where 's' is the polynomial degree.
The polynomial kernel function is used with SVMs and other kernel models, representing the similarity between features over the polynomials of the original variables. A polynomial kernel is defined as:
\[ K(x, x_i) = 1 + \sum (x \times x_i)^{d}. \tag{2} \]
Here, d = 1 corresponds to the linear kernel.
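
As a small illustration of the polynomial kernel, the sketch below evaluates a kernel of the common form (u · v + 1)^s. This assumes that the product term in Equation (1) denotes the inner product of the two feature vectors, which the text does not state explicitly; `poly_kernel` is a hypothetical helper name, not part of the proposed system.

```python
def poly_kernel(u, v, s=2):
    """Polynomial kernel (u . v + 1) ** s for equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))  # inner product (assumption)
    return (dot + 1) ** s


u = [1.0, 2.0]
v = [3.0, 4.0]
print(poly_kernel(u, v, s=1))  # degree 1: inner product plus one
print(poly_kernel(u, v, s=2))  # degree 2: squares the same quantity
```

With s = 1 the kernel is linear in the inputs, while higher degrees let the SVM separate classes that are not linearly separable in the original feature space.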


Algorithm 1 DFTF
Input: video, text, audio, image, N, number of virtual machines (VM), number of cats, max_iterations, SMP (seeking memory pool) = 5 to 10, MR: mixture ratio, Sz: size of the population, maximum_iter
Output: Data class, Scheduled data
1: for data classification do
2:   for each P(u, v) do
3:     Evaluate = SVM
4:     for each Classification accuracy ≠ 100 do
5:       Evaluate data accuracy
6:       if max_iterations ≠ N then
7:         perform data categorization and VM categorization
8:       end if
9:     end for
10:   end for
11: end for
12: for load balancing do
13:   Create N cats and divide them into four groups G, that is, G = {audio, video, text, image}
14:   Randomly initialize the velocity of each cat belonging to group G
15:   Evaluate initial fitness function Fi
16:   Csz = Create Cpop(sz, Fi) // Create cat population
      // Distribute the cats in seeking or tracing mode
17:   while k ≤ maximum_iter do
18:     for each i = 1 to Sz do
19:       if C[i] = Seekm then
20:         Sol = Apply Seekm(Cj)
21:       else
22:         Sol = Apply Tracem(Cj)
23:       end if
24:       Fbest = Sol best
25:       if C(F, k) detected then
26:         Csz = create Cpop[sz, F]
27:       else
28:         Csz = reset Cpop[Csz]
29:       end if
30:     end for
31:   end while
32:   return F (schedule data) // return best solution
33: Exit.

The output of the classification phase is in the form of classified tasks, which reduces the computational cost by avoiding the preprocessing steps of feature learning, feature extraction, data conversion, data transformation, and data classification at the scheduling phase.

B. LOAD BALANCING USING CAT SWARM OPTIMIZATION
We developed the network of VMs in the form of an undirected weighted graph as shown in Figure 2. The VMs' network can be represented as an undirected graph G = (V, E), where V represents the virtual machines (VMs) or nodes and E represents the undirected edges, each having a weight that shows the overload and underload intensity between two nodes. After the data classification, load balancing is performed using CSO. In the load balancing phase, these data are called tasks. Let us assume that:

VideoVM = {VM_1, VM_2, ..., VM_n},
AudioVM = {VM_1, VM_2, ..., VM_n},
TextVM = {VM_1, VM_2, ..., VM_n},
ImageVM = {VM_1, VM_2, ..., VM_n}

be the sets of virtual machines for video, audio, text, and image, respectively. Each set of machines is responsible for executing one task. Each task is executed for a period of maximum iterations and is evaluated using computational cost in the form of time. The mapping of tasks on virtual machines is computed using the SVM, where each machine is assigned a task based on the requirements, size, and features of the tasks.

FIGURE 2. The network of Virtual Machines (VMs).

Once the number of tasks and VMs are selected, the scheduling process is initiated. Initially, N instances are created and split into G groups. CSO considers the behavior of the cats in two modes: seeking mode and tracing mode. Swarm algorithms are widely accepted as they adapt the best-obtained solutions for searching the most similar neighbors (nodes). So, in this method, the cat behavior is considered for searching a solution space. Every cat has a position in d dimensions, with a different velocity used for each dimension. Every cat is evaluated using the fitness function; if the fitness values are not all equal, the probability is computed using equation (3); otherwise, the probability value defaults to 1. We used a Boolean flag variable to identify whether the cat is in seeking mode or tracing mode. The tracing mode is considered in terms of its fitness function, where the position of the cat is changed according to the fitness function. The fitness function of CSO can be obtained with the help of equation (3).

$$P_i = \frac{FS_i - FS_{max}}{FS_{max} - FS_{min}}, \quad \text{where } 0 < i < j, \tag{3}$$


where $P_i$ is the probability value associated with the position of the $i$th cat, $FS_i$ is the fitness of the $i$th cat, $FS_{max}$ is the maximum fitness value, and $FS_{min}$ is the minimum fitness value achieved so far.

TABLE 2. Parameter settings used for DFTF.
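The selection probability of equation (3) can be sketched as below. Two points are assumptions rather than statements from the paper: the absolute value is taken so that the result lies in [0, 1], following the usual CSO formulation for minimization, and the function name `seeking_probabilities` and the sample fitness values are hypothetical.

```python
def seeking_probabilities(fitness):
    """Probability per cat from its fitness; 1.0 for every cat if all fitness values are equal."""
    fs_max, fs_min = max(fitness), min(fitness)
    if fs_max == fs_min:
        # Default case described in the text: equal fitness, probability 1.
        return [1.0] * len(fitness)
    # abs() is an assumption; equation (3) as printed has no absolute value.
    return [abs(fs - fs_max) / (fs_max - fs_min) for fs in fitness]


probs = seeking_probabilities([2.0, 4.0, 6.0])
```

Under this reading, the cat with the lowest fitness value (the best one when minimizing cost) receives the highest probability.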
The tracing mode of the CSO method is described in terms of the movement of cats, which is based on their outstanding hunting skills. In tracing mode, cats move according to their velocities in each dimension and then update their positions accordingly. The updated velocities and positions of the cats are calculated using equations (4) and (5):

$$V_{i,d}(t) = V_{i,d} + r_i c_i \left(X_{best,d} - X_{i,d}\right), \quad d = 1, 2, \ldots, M, \tag{4}$$

$$X_{i,d}(t) = X_{i,d} + V_{i,d}. \tag{5}$$
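A minimal sketch of one tracing-mode move per equations (4) and (5) is given below, with the acceleration coefficient defaulting to 2.0 and the random factor drawn per dimension. The function name `trace_step`, the optional `v_max` clamp, the injectable `rng`, and the use of the freshly updated velocity in the position update are assumptions made for illustration and are not taken from the paper.

```python
import random


def trace_step(position, velocity, best, c=2.0, v_max=None, rng=random):
    """One tracing-mode move: update each velocity component, then the position."""
    new_position, new_velocity = [], []
    for x, v, x_best in zip(position, velocity, best):
        r = rng.random()                      # random factor r
        v_new = v + r * c * (x_best - x)      # equation (4)
        if v_max is not None:
            # Optional velocity clamp (assumed; not part of equations (4) and (5)).
            v_new = max(-v_max, min(v_max, v_new))
        new_velocity.append(v_new)
        new_position.append(x + v_new)        # equation (5), using the updated velocity
    return new_position, new_velocity


# A cat already sitting at the best position with zero velocity does not move.
pos, vel = trace_step([1.0, 2.0], [0.0, 0.0], [1.0, 2.0])
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which is convenient when comparing parameter settings.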


TABLE 3. Parameter settings used for DFTF in CloudSim.
Here, $X_{best,d}$ is the best position of a cat in d-dimensional space, $X_{i,d}$ is the position of cat $i$, $V_{i,d}$ is the velocity of cat $i$, $r_i$ is a random value in [0, 1], $c_i$ is the acceleration coefficient that extends the cat's velocity to move in the solution space, which is set to 2.0, and $t$ is the iteration number. The Mixture Ratio (MR) is used to combine the two modes of the algorithm, seeking mode and tracing mode, and determines the ratio of cats in each mode. We have set the control variable MR to 1%-3%, which determines whether a cat is in seeking or tracing mode. It means that at any instance, 10% to 30% of the cats are in tracing mode, and the rest of the cats are in seeking mode. Here, local search refers to tracing mode and global search refers to seeking mode. Cats spend most of their time in resting mode (seeking mode), so the MR value should be small to reflect their behavior in the real world. A control check is needed on the velocity of each cat in every dimension so that velocities stay within range and do not exceed the maximum. This control check is added through the inertia weight (w), for which the optimum value must be within [0.4, 0.9]. In our case, 0.7 gives the best results. For the first iteration, it is started at 0.9 and gradually decreased to 0.4. The optimum solution will be available somewhere between these values. The seeking mode of the CSO technique is composed of the following parameters: Seeking Memory Pool (SMP), Seeking Range of selected Dimension (SRD), Counts of Dimension Change (CDC), and Self-Position Consideration (SPC).

We have evaluated the DFTF model based on time complexity. As our model follows a combined approach, that is, the combination of SVM and CSO, we have used different parameters that are specified in evolutionary algorithms as well as classification algorithms. For CSO, these parameters are given in Table 2. For CloudSim, we used the parameter settings reported in Table 3. These parameter settings are chosen based on the convergence of the DFTF algorithm after conducting several experiments.

From Algorithm 1, the computational time for Lines 1 to 8 is $O(N^3 n)$, and Lines 9 to 13 take constant time $O(1)$. Lines 14 and 15 take $O(N^2)$. From experiments, it is observed that the seeking mode takes less time as compared to tracing mode; therefore, the computational complexity of Lines 16 to 26 is $O(N(ks))$, and Line 27 takes time $O(l)$. Thus, the overall time complexity of the proposed DFTF is $O(N^3 + N^3(n + ks + l))$, which expands to $O(N^3 + N^3 n + N^3 ks + N^3 l)$, factors as $O(N^3(1 + n + ks + l))$, and finally is $O(N^3)$ when $n$, $ks$, and $l$ are treated as constants.

IV. EXPERIMENTAL SETUP
In this section, various files in the form of datasets are presented, including audio, video, text, and images, which are taken from the UCI repository [99] and other sources. There is a total of 60,000 datasets, which are further divided equally as given in Table 4.

TABLE 4. Description about datasets.


TABLE 5. Statistics about training and test sets.

Further, the dataset files are split into training and testing sets with a ratio of 70:30, where 70% of the data files are training datasets and 30% are testing datasets, as mentioned in Table 5.

CloudSim 4.0 [100], as compared to other simulators, is widely used in conducting and implementing cloud-related research work. The simulator provides on-demand resources in virtualized form and has several advantages such as flexibility, performance, and ease of use. A data center is configured with the region, architecture, operating system, VM, memory, storage data transfer cost, and the number of physical hardware units. In our case, we have set 500 and 1000 VMs, respectively, during experiments along with 4096 and 8192 MB of RAM and 2 TB of memory. All simulations are performed on a desktop PC with the MS Windows 10 operating system, an Intel Quad-Core i7 2.6 GHz processor, 12 GB of RAM, and a 1 TB HDD.

The algorithms used in this research include CBS-MKC, FSALB, PSO-BOOST, IACSO-SVM, CSO-DA, GA-ACO, and the proposed DFTF. There are eight metrics on which DFTF is compared: energy consumption, response time, SLA violations, number of migrations, execution time, overhead time, throughput time, and optimization time. There is a total of 60,000 tasks on which evaluations are performed. All algorithms are implemented in CloudSim 4.0 as described in their respective research papers, with the same configuration and environmental settings to make the results reliable. Further, the results are statistically verified through a Student's t-test to check their reliability and to rule out that the values arose by chance.

V. RESULTS AND ANALYSIS
In this section, the proposed algorithm has been divided into two major parts. In the first part, the classification of the file datasets is done using the SVM classifier in the cloud. In the second part, the output of the SVM is fed into ICSO for load balancing in the cloud environment. To obtain better, faster, and more accurate results, we have used the One-vs-All classification approach, which initially classifies the file datasets by comparing them with all classes [101]. Overall, the output of the SVM falls into one of the four major data classes: audio, video, image, and text. The baselines used in this research include CBS-MKC, FSALB, PSO-BOOST, IACSO-SVM, CSO-DA, and GA-ACO.

CBS-MKC used credit-based scheduling with an emphasis on task categorization but lacks a multi-factor approach. FSALB largely focused on reducing communication delays experienced by machine learning users and hence improved response time, but it also lacks a multi-factor approach. PSO-BOOST considered the deadline constraint within a limited number of tasks and VMs and has shown improvements on a few parameters. IACSO-SVM worked on the classification accuracy of a limited number of tasks and datasets. CSO-DA emphasized response time, number of migrations, and execution time on fewer tasks and VMs. GA-ACO improved completion time, response time, and throughput under limited resources. The proposed algorithm DFTF not only addresses these limitations but also adopts a multi-factor approach.

Similarly, there are deep learning approaches that produce better results than traditional algorithms, but they take more time in training with a large number of datasets. Therefore, as per the requirement of our proposed work, we have selected One-vs-Many SVM, which has outperformed other classifiers in the first stage in terms of accuracy.

A. ACCURACY OF DFTF
Validation methods such as accuracy, precision, recall, and F-measure are used to check the accuracy of DFTF. Classifiers such as ACO-SVM [102], Bayes Net [103], J48 [103], and Multiclass [104] are used for comparative analysis, as shown in Table 6. The results of DFTF are presented as a comparative analysis over the other algorithms, in which DFTF has shown better performance in all validation methods.

The results of the classification algorithms are validated using the classification measures Accuracy, Precision, Recall, and F-Measure [105], reported in Table 6. The results of these classifiers range over [0, 1], with 1 being a fully accurate classification. The closer the value is to 1, the higher the accuracy of the classifier. From Figure 3, DFTF has attained better accuracy than the other classifiers.

TABLE 6. Comparative analysis of DFTF with classification techniques.

B. EVALUATION OF DFTF ON ENERGY CONSUMPTION
Energy consumption is calculated using the following proposed equation:

$$EC = \sum_{k=1}^{N} \sum_{N=1}^{n} \frac{TE(VM_N)}{E(T_k)}, \tag{6}$$

where $TE(VM_N)$ is the total energy consumed by $VM_N$, $N = 1, \ldots, n$, and $E(T_k)$ is the energy consumed for a particular task $T_k$.

FIGURE 3. Comparison of DFTF with other classifiers.

FIGURE 4. Performance of DFTF and Baselines on Energy Consumption on VMs (5-2000).

Energy consumption is depicted in Figure 4, in which comparative analysis is performed for all baselines. From the very start, a significant difference is seen, which continues till 2000 VMs. The number of tasks and other configuration parameters is set equal so that reliable results may be obtained. After a gradual increase in the number of tasks from 1000 to 2000, a few of the baselines such as CBS-MKC and CSO-DA start to consume slightly more energy, followed by a few other baselines. Likewise, the increase in VMs from 5 to 10, 50, 100, 500, 1000, and 2000 also results in an increase in energy consumption, which keeps growing with every additional VM. At 2000 VMs, most of the baselines suffer from huge energy consumption, leaving only DFTF with comparatively the least consumption of energy.

The progress of DFTF in this scenario remains quite smooth even with the addition of more VMs, which shows that DFTF is consistent. In other words, CBS-MKC has consumed 21% energy followed by CSO-DA with 16%, FSALB with 16%, IACO-SVM with 15%, GA-ACO with 12%, and PSO-Boost with 11%, while DFTF consumed only 9% energy. Preprocessing using classification in the cloud reduced the computational complexity, which resulted in the least energy consumption, and the faster convergence of CSO achieved the best solution in the minimum number of iterations, which also preserved energy.

C. EVALUATION OF DFTF ON RESPONSE TIME
Response time is computed using the following proposed equation:

$$RT = \sum_{t=1}^{N} \left[ CT_t - \left( TsT_t - TeT_t \right) \right], \tag{7}$$

where $t$ is the task, $RT$ is the response time, $CT_t$ is the computational time for task $t$, $TsT_t$ is the start time of the $t$th task, and $TeT_t$ is the end time of the $t$th task.

FIGURE 5. Performance of DFTF and Baselines on Response Time on VMs (5-2000).

Figure 5 represents the response time against a varying number of tasks and VMs for all algorithms. Comparative analysis of the algorithms has not revealed much difference in response time initially, as these algorithms produce output in almost equal time. Further, with the increase in VMs and tasks, a surge is seen, especially for the two baselines CBS-MKC and PSO-Boost, which take more response time after 1000 VMs. The other baselines such as IACO-SVM, GA-ACO, CSO-DA, and FSALB also show more response time, whereas the response time of DFTF remains stable throughout.

This shows that DFTF is scalable in a dynamic environment where the number of tasks and VMs is increasing. In other words, CBS-MKC has delivered its response in 18% of the total response time followed by PSO-Boost with 16%, IACO-SVM with 16%, CSO-DA with 15%, FSALB with 14%, and GA-ACO with 13%. However, DFTF has only taken 8% response time and has outperformed all baselines. The total response time is the sum of


all response times taken by all baselines. This is because of CSO's stronger convergence to a solution and achieving global minima in fewer iterations.
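The response-time formula of equation (7) and the percentage shares quoted above can be sketched as follows. The tuple layout `(ct, start, end)`, the function names `response_time` and `shares`, and all sample numbers are hypothetical and serve only to illustrate the arithmetic; they are not measurements from the paper.

```python
def response_time(tasks):
    """Equation (7): sum of CT_t - (TsT_t - TeT_t) over (ct, start, end) tuples."""
    return sum(ct - (start - end) for ct, start, end in tasks)


def shares(totals):
    """Each algorithm's total as a rounded percentage of the sum over all algorithms."""
    grand_total = sum(totals.values())
    return {name: round(100 * value / grand_total) for name, value in totals.items()}


# Hypothetical per-task records for one algorithm.
rt = response_time([(5, 0, 2), (3, 1, 4)])

# Hypothetical totals for two algorithms, converted to shares of the overall total.
percentages = shares({"A": 1.0, "B": 3.0})
```

Because the shares are computed against the sum over all compared algorithms, they add up to roughly 100% up to rounding, and a lower share indicates a faster algorithm.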

D. EVALUATION OF DFTF ON NUMBER OF SLA VIOLATIONS
SLA violation is calculated using the proposed equation (8):

$$SV = \sum_{i=1}^{N} \frac{T_{total}}{T_{vm_i}}, \tag{8}$$

where $T_{total}$ is the total CPU time and $T_{vm_i}$ is the time taken by the $i$th VM.

FIGURE 6. Performance of DFTF and Baselines on SLA Violations on VMs (5-2000).

In this case, an SLA violation occurs if a VM takes more time than the allotted CPU time. SLA violations against a varying number of tasks and VMs for all baselines are shown in Figure 6. It is observed that till 100 VMs, no major change in SLA violations is seen in the graph, which becomes more obvious at 500 VMs and above. Algorithms such as PSO-Boost, CBS-MKC, and FSALB have violated more SLAs as soon as tasks and VMs are increased. DFTF, in this case, has the fewest violations, only 8%, as compared to PSO-Boost with 19% violations followed by CBS-MKC with 17%, FSALB with 16%, CSO-DA with 16%, IACO-SVM with 11%, and GA-ACO with 11% SLA violations. The least number of SLA violations confirmed that DFTF has performed fewer migrations and avoided complexity.

E. EVALUATION OF DFTF ON MIGRATION TIME
Migration time is calculated using the following function:

$$M_t = \sum_{k=1}^{n} TS_k(VM_i, VM_{i+1}), \tag{9}$$

where $TS_k(VM_i, VM_{i+1}) = TS_k(VM_i \rightarrow VM_{i+1})$ is the scheduling time taken for allocating the $k$th data from the $i$th VM to the $(i+1)$th VM based on availability.

FIGURE 7. Performance of DFTF and Baselines on Migration time on VMs (5-2000).

Figure 7 shows the migration time taken by all baselines with varying tasks and VMs. More migrations take more time as compared to fewer migrations. Optimization algorithms make the least number of migrations because a single migration involves several processes, their communication, interrupts, addresses, and other factors. The VM migration policy is administered by the administrator, who defines a rule about when to trigger the VM migration from one host to another host. Generally, a threshold is defined which enables VM migrations while considering the computing capabilities of the host machines, to minimize the number of SLA violations and migrations.

Heuristic and metaheuristic algorithms combined as a hybrid approach work well for finding a solution in VM migration. For initial VM placements, the heuristic algorithm is suitable, whereas, for optimization during migration, the metaheuristic algorithm performs well. This helps in reducing cost and solution space. Figure 7 shows that IACO-SVM and CBS-MKC initially took more migration time due to many migrations, which improves after 50 VMs till 1000 VMs and increases further afterward. This shows the unstable behavior of these algorithms with varying VMs and tasks. In other words, CBS-MKC has taken 23% migration time followed by IACO-SVM with 22%, FSALB with 12%, PSO-Boost with 12%, CSO-DA with 12%, GA-ACO with 11%, and DFTF with only 8% migration time due to the least number of migrations, further supported by the least SLA violations.

F. EVALUATION OF DFTF ON OPTIMIZATION TIME
Optimization time is calculated using the following proposed equation:

$$OT = \sum_{i=0}^{I_{max}} T_i, \tag{10}$$

where $T_i$ is the time taken for the $i$th iteration and $\sum_{i=0}^{I_{max}} T_i$ is the total time over all iterations.
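Equation (10) sums the time of every iteration from 0 to $I_{max}$. A minimal timing harness along those lines is sketched below; the function name `measure_optimization_time` and the `step` callback are hypothetical, and wall-clock timing with `time.perf_counter` is an assumption about how the per-iteration time could be measured, not a description of the CloudSim setup used in the paper.

```python
import time


def measure_optimization_time(step, i_max):
    """Run step(i) for i = 0..i_max and return (total time, per-iteration times)."""
    durations = []
    for i in range(i_max + 1):            # iterations 0, 1, ..., i_max
        start = time.perf_counter()
        step(i)                           # one optimizer iteration (caller-supplied)
        durations.append(time.perf_counter() - start)
    return sum(durations), durations      # total corresponds to OT in equation (10)


# Example with a trivial placeholder iteration.
total, per_iteration = measure_optimization_time(lambda i: sum(range(1000)), 3)
```

Keeping the per-iteration list alongside the total makes it possible to see whether time is dominated by early or late iterations.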


FIGURE 8. Performance of DFTF and Baselines on Optimization Time on VMs (5-2000).

The optimization time, also known as convergence time, of all baselines is plotted in Figure 8. Only two algorithms, FSALB and CBS-MKC, show an exponential increase in optimization time, resulting in comparatively more unstable behavior. Algorithms such as IACSO-SVM, GA-ACO, and CSO-DA deviate a little in optimizing the tasks because these algorithms get trapped in local optima, whereas PSO-Boost and DFTF optimize quite fast and produce better results in the presence of other baselines. Overall, FSALB has taken the most time in getting optimized, which is 22%, followed by CBS-MKC with 17%, IACSO-SVM with 14%, CSO-DA with 12%, GA-ACO with 12%, PSO-Boost with 12%, and only 11% optimization time taken by DFTF. Cats move on a global scale to find the global best position, which prevents them from falling into local optima, so they tend to optimize the solution quite fast.

G. EVALUATION OF DFTF ON EXECUTION TIME
Execution time is calculated using the following proposed equation:

$$E_t = \sum_{k=1}^{N} T_t(T_k), \tag{11}$$

where $T_t(T_k)$ is the total time for executing the $k$th task.

The execution time of all baselines is shown in Figure 9. Here, DFTF initially performed extremely well and took the least time, and then started to rise when the number of VMs reached 50 because DFTF initially converges slowly. No huge improvement is observed in DFTF, but overall, comparatively better performance can be seen. Algorithms like CBS-MKC and GA-ACO deviate a lot from the very start and therefore take more execution time in almost all runs. FSALB remains quite good with every increase in VMs, whereas PSO-Boost and DFTF execute quite fast and produce better results in the presence of other baselines. Overall, CBS-MKC has taken the most execution time, that is 19%, followed by GA-ACO with 17%, FSALB with 16%, CSO-DA with 14%, IACSO-SVM with 13%, PSO-Boost with 12%, and only 9% execution time taken by DFTF.

It shows that the classification method using SVM plays an effective role in shortening the task execution time of DFTF and further establishes the stronger scheduling ability of the algorithm.

FIGURE 9. Performance of DFTF and Baselines on Execution Time on VMs (5-2000).

H. EVALUATION OF DFTF ON THROUGHPUT TIME
Throughput time is calculated using the following proposed equation:

$$TTP = \sum_{k=1}^{N} \frac{T_k}{Tp_k}, \tag{12}$$

where $T_k$ is the $k$th task and $Tp_k$ is the time period for completing the $k$th task.

The throughput time of all baselines is shown in Figure 10. Two algorithms, CBS-MKC and IACSO-SVM, have initially taken more time in providing throughput, which increases further up to 100 VMs because these algorithms could not quickly optimize. CSO-DA started better but surged after the addition of 500 VMs because more tasks add complexity to it.

However, FSALB, GA-ACO, and DFTF have shown good throughput performance. Overall, IACSO-SVM has taken the most throughput time, that is 19%, followed by CSO-DA with 18%, CBS-MKC with 16%, PSO-Boost with 13%, FSALB and GA-ACO with 12% each, and only 10% throughput time taken by DFTF. Stronger robustness by DFTF has resulted in generating solutions in minimum throughput time.

I. EVALUATION OF DFTF ON OVERHEAD TIME
Overhead time is calculated using the following proposed equation:

$$OHT = \sum_{i=1}^{N} \left( Tot_t(T_i) - E_t(T_i) \right), \tag{13}$$


where $Tot_t(T_i)$ is the total time required for executing the $i$th task and $E_t(T_i)$ is the execution time of the $i$th task.

The overhead time of all baselines is shown in Figure 11. Two algorithms, CSO-DA and PSO-BOOST, started with huge overhead time, which becomes stable at 50 VMs, but instability is again observed after 500 VMs and increases after every run. This is mainly because of their computational complexity. Overall, on average, CSO-DA has taken the most overhead time, that is 18%, followed by CBS-MKC and PSO-BOOST with 17% each, IACSO-SVM with 15%, PSO-Boost with 13%, FSALB and GA-ACO with 13% each, and only 7% overhead time taken by DFTF.

The minimal computational complexity, fewer iterations in finding global optima, low communication cost, low overhead, and better convergence have made DFTF a better choice over other baselines.

FIGURE 10. Performance of DFTF and Baselines on Throughput Time on VMs (5-2000).

FIGURE 11. Performance of DFTF and Baselines on Overhead Time on VMs (5-2000).

J. STATISTICAL ANALYSIS
We have checked the resulting values of all parameters and found that their distribution is normal. In that case, there is a need for a parametric test that involves two variables, because we take one baseline at a time and compare it with DFTF. In statistics, the suitable test for two variables with normal distribution is the Student's t-test. Table 7 reports the mean, standard deviation (SD), p-value, and t-value. The significance level is set to p < 0.05 [106]. At this stage, we define the hypotheses in the following manner:
H0: DFTF and other baselines have no difference.
H1: A significant difference exists between DFTF and other baselines.
We can see that the p-values in all cases are less than the significance level of 0.05, which means that a significant difference exists between the values of DFTF and the other baselines. So, we are right to reject the null hypothesis and accept the alternate hypothesis. Similarly, we can say that a significant difference exists in terms of energy consumption, response time, SLA violations, migration time, optimization time, execution time, throughput time, and overhead time.

K. RANKING BASELINES
Table 8 shows eight Quality of Service (QoS) metrics used in this study against seven baselines.

FIGURE 12. Comparative analysis of all baselines on various parameters.

It can be observed that certain baselines perform better in one scenario and average or worst in another scenario, but the proposed DFTF performed best among them, followed by PSO-BOOST, GA-ACO, FSALB, IACO-SVM, CSO-DA, and CBS-MKC, respectively. Figure 12 shows the averaged performance of all baselines in terms of energy efficiency, response time, SLA violations, migration time, and optimization time over varying tasks and VMs. It is shown that overall DFTF has outperformed the other baselines in all five performance metrics.
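The pairwise comparison described under Statistical Analysis, one baseline against DFTF at a time, can be sketched with a two-sample Student's t statistic. The sketch assumes independent samples with a pooled variance, which the paper does not state explicitly; the function name `student_t` and the sample values are hypothetical. Converting the statistic to a p-value would additionally require the t distribution (from a table or a statistics library), which is left out here.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Two-sample Student t statistic with pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled sample variance (assumes both groups share a common variance).
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical measurements for the proposed method and one baseline.
t_value, dof = student_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```

A large absolute t-value for the given degrees of freedom corresponds to a small p-value, which is the condition under which the null hypothesis H0 is rejected.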


TABLE 7. Statistical comparison of DFTF with other baselines.


TABLE 8. Ranking of all baselines.

VI. CONCLUSION
File type format classification has made significant contributions to cloud computing. We have proposed a DFTF approach that achieves better results in load balancing. In the conducted study, DFTF is developed in two steps. In the first step, file type classification is done for various formats such as video, audio, text, and images in a cloud environment, resulting in an appropriate data class. In our case, we have used four data classes into which the appropriate file format falls. A total of 60,000 datasets/data files are collected from different sources and placed in the cloud for classification. Classification is performed using the SVM one-to-many classification approach, which provides the best accuracy among other classifiers such as Multiclass, J48, Bayes Net, and ACO-SVM. In the second step, the resultant data class is fed into CSO, which performs load balancing in an efficient manner. In CSO, we have introduced the grouping phase, which divides the data files into four groups: audio, video, image, and text. The offline preprocessing in the cloud for classification helps in reducing the computational complexity and increases the efficiency of load balancing. Furthermore, the validation of DFTF is established through QoS evaluation metrics in terms of energy consumption, response time, SLA violations, migration time, execution time, throughput time, overhead time, and optimization time. Due to its hybrid nature, DFTF has taken the relative advantages of SVM and ICSO, which helps in achieving better performance in the presence of baselines such as CBS-MKC, FSALB, PSO-BOOST, IACSO-SVM, CSO-DA, and GA-ACO.

The proposed approach is a multi-factor approach that ultimately saves time, cost, and valuable resources. It also improves scalability and robustness in the cloud environment. In the future, we will perform load balancing in the cloud by considering other sensitive parameters like deadline constraints, priority-based scheduling, and task migrations using deep learning approaches.

REFERENCES
[1] M. Sheikhalishahi, R. M. Wallace, L. Grandinetti, J. L. Vazquez-Poletti, and F. Guerriero, "A multi-dimensional job scheduling," Future Gener. Comput. Syst., vol. 54, pp. 123–131, Jan. 2016.
[2] T. Carli, S. Henriot, J. Cohen, and J. Tomasik, "A packing problem approach to energy-aware load distribution in Clouds," Sustain. Comput., Inform. Syst., vol. 9, pp. 30–32, Mar. 2016.
[3] J. Zhao, K. Yang, X. Wei, Y. Ding, L. Hu, and G. Xu, "A heuristic clustering-based task deployment approach for load balancing using bayes theorem in cloud environment," IEEE Trans. Parallel Distrib. Syst., vol. 27, no. 2, pp. 305–316, Feb. 2016.
[4] A. Nadjaran Toosi, C. Qu, M. D. de Assunção, and R. Buyya, "Renewable-aware geographical load balancing of Web applications for sustainable data centers," J. Netw. Comput. Appl., vol. 83, pp. 155–168, Apr. 2017.
[5] S. S. Patil and A. N. Gopal, "Dynamic load balancing using periodically load collection with past experience policy on linux cluster system," Amer. J. Math. Comput., vol. 2, no. 2, pp. 60–75, 2017.
[6] S. Mohanty, P. K. Patra, M. Ray, and S. Mohapatra, "A novel meta-heuristic approach for load balancing in cloud computing," Int. J. Knowl.-Based Organizations, vol. 8, no. 1, pp. 29–49, Jan. 2018.
[7] K.-M. Cho, P.-W. Tsai, C.-W. Tsai, and C.-S. Yang, "A hybrid meta-heuristic algorithm for VM scheduling with load balancing in cloud computing," Neural Comput. Appl., vol. 26, no. 6, pp. 1297–1309, Aug. 2015.
[8] M. Junaid, A. Sohail, A. Ahmed, A. Baz, I. A. Khan, and H. Alhakami, "A hybrid model for load balancing in cloud using file type formatting," IEEE Access, vol. 8, pp. 118135–118155, 2020.
[9] L. Heilig, R. Buyya, and S. Voß, "Location-aware brokering for consumers in multi-cloud computing environments," J. Netw. Comput. Appl., vol. 95, pp. 79–93, Oct. 2017.
[10] A. Kaur, B. Kaur, and D. Singh, "Comparative analysis of metaheuristics based load balancing optimization in cloud environment," in Smart and Innovative Trends in Next Generation Computing Technologies (Communications in Computer and Information Science), vol. 827. Singapore: Springer, 2018, pp. 30–46.
[11] P. Kumar and R. Kumar, "Issues and challenges of load balancing techniques in cloud computing: A survey," ACM Comput. Surv., vol. 51, no. 6, p. 120, 2019.
[12] W. Gai, C. Qu, J. Liu, and J. Zhang, "A novel hybrid meta-heuristic algorithm for optimization problems," Syst. Sci. Control Eng., vol. 6, no. 3, pp. 64–73, Sep. 2018.
[13] P. Kaur and P. D. Kaur, "Efficient and enhanced load balancing algorithms in cloud computing," Int. J. Grid Distrib. Comput., vol. 8, no. 2, pp. 9–14, Apr. 2015.
[14] C. Gomez, A. Shami, and X. Wang, "Machine learning aided scheme for load balancing in dense IoT networks," Sensors, vol. 18, no. 11, p. 3779, Nov. 2018.
[15] X. Sui, D. Liu, L. Li, H. Wang, and H. Yang, "Virtual machine scheduling strategy based on machine learning algorithms for load balancing," EURASIP J. Wireless Commun. Netw., vol. 2019, no. 1, p. 160, Dec. 2019.
[16] B. Tang, Y. Li, X. Li, L. Xu, Y. Yan, and Q. Yang, "Deep CNN framework for environmental sound classification using weighting filters," in Proc. IEEE Int. Conf. Mechatronics Autom. (ICMA), Tianjin, China, Aug. 2019, pp. 2297–2302.
[17] A. Zakaria, R. Rizal, and O. Dwi, "Particle swarm optimization and support vector machine for vehicle type classification in video stream," Int. J. Comput. Appl., vol. 182, no. 18, pp. 9–13, Sep. 2018.
[18] Y. F. Huang and S. H. Wang, "Movie genre classification using SVM with audio and video features," in Active Media Technology (Lecture Notes in Computer Science), vol. 7669, R. Huang, A. A. Ghorbani, G. Pasi, T. Yamaguchi, N. Y. Yen, and B. Jin, Eds. Berlin, Germany: Springer, 2012.


[19] K. M. Salama and A. M. Abdelbar, ‘‘Learning neural network structures [41] S. Kumar Mishra, S. Bibhudatta, and P. P. Parida, ‘‘Load balancing in
with ant colony algorithms,’’ Swarm Intell., vol. 9, no. 4, pp. 229–265, cloud computing: A big picture,’’ J. King Saud Univ.-Comput. Inf. Sci.,
Dec. 2015. vol. 32, no. 2, pp. 149–158, 2020.
[20] L. Jiao and L. Feng, ‘‘Text classification based on ant colony optimiza- [42] R. Shaikh and M. Sasikumar, ‘‘Data classification for achieving secu-
tion,’’ in Proc. 3rd Int. Conf. Inf. Comput., Jun. 2010, pp. 229–232. rity in cloud computing,’’ Procedia Comput. Sci., vol. 45, pp. 493–498,
[21] Q. Wang, R. Peng, J. Wang, Y. Xie, and Y. Zhou, ‘‘Research on text 2015.
classification method of LDA- SVM based on PSO optimization,’’ [43] H. B. Barua and K. C. Mondal, ‘‘A comprehensive survey on cloud
in Proc. Chin. Autom. Congr. (CAC), Hangzhou, China, Nov. 2019, data mining (CDM) frameworks and algorithms,’’ ACM Comput. Surv.,
pp. 1974–1978. vol. 52, no. 5, pp. 1–62, Oct. 2019.
[22] P. Adriana, L. Veronica, P. R. Pasquale, and I. Sidhu, ‘‘A genetic algorithm [44] H. Song and J. G. Lee, ‘‘RP-DBSCAN: A superfast parallel DBSCAN
for text classification rule induction,’’ in Proc. Joint Eur. Conf. Mach. algorithm based on random partitioning,’’ in Proc. Int. Conf. Manage.
Learn. Knowl. Discovery Databases. Berlin, Germany: Springer, 2008, Data, Houston, TX, 2018, pp. 1173–1187.
pp. 188–203. [45] R. Jin, C. Kou, R. Liu, and Y. Li, ‘‘Efficient parallel spectral cluster-
[23] H. Hasanpour, R. Ghavamizadeh Meibodi, and K. Navi, ‘‘Improving rule- ing algorithm design for large data sets under cloud computing envi-
based classification using harmony search,’’ PeerJ Comput. Sci., vol. 5, ronment,’’ J. Cloud Comput., Adv., Syst. Appl., vol. 2, no. 1, p. 18,
p. e188, Nov. 2019. 2013.
[24] F. Yigit and O. K. Baykan, ‘‘A new feature selection method for text [46] J. Chen, K. Li, Z. Tang, K. Bilal, S. Yu, C. Weng, and K. Li, ‘‘A parallel
categorization based on information gain and particle swarm optimiza- random forest algorithm for big data in a spark cloud computing environ-
tion,’’ in Proc. IEEE 3rd Int. Conf. Cloud Comput. Intell. Syst., Nov. 2014, ment,’’ IEEE Trans. Parallel Distrib. Syst., vol. 28, no. 4, pp. 919–933,
pp. 523–529. Apr. 2017.
[25] H. Peng, C. Ying, S. Tan, B. Hu, and Z. Sun, ‘‘An improved feature [47] F. Ozgur Catak and M. Erdal Balaban, ‘‘CloudSVM: Training an
selection algorithm based on ant colony optimization,’’ IEEE Access, SVM classifier in cloud computing systems,’’ in Proc. Joint Int. Conf.
vol. 6, pp. 69203–69209, 2018. Pervasive Comput. Netw. World. Berlin, Germany: Springer, 2012,
[26] C. López-Franco, L. Villavicencio, N. Arana-Daniel, and A. Y. Alanis, pp. 57–68.
‘‘Image classification using PSO-SVM and an RGB-D sensor,’’ Math. [48] B. Apexa Kamdar and M. Jay Jagani, ‘‘A survey: Classification of huge
Problems Eng., vol. 2014, Jul. 2014, Art. no. 695910. cloud datasets with efficient map-reduce policy,’’ Int. J. Eng. Trends
[27] C. Sukawattanavijit, J. Chen, and H. Zhang, ‘‘GA-SVM algorithm for Technol. (IJETT), vol. 18, no. 2, pp. 103–107, 2014.
improving land-cover classification using SAR and optical remote sens- [49] D. Apiletti, E. Baralis, T. Cerquitelli, P. Garza, P. Michiardi, and
ing data,’’ IEEE Geosci. Remote Sens. Lett., vol. 14, no. 3, pp. 284–288, F. Pulvirenti, ‘‘PaMPa-HD: A parallel MapReduce-based frequent pattern
Mar. 2017. miner for high-dimensional data,’’ in Proc. IEEE Int. Conf. Data Mining
[28] V. Pallavi and V. Vaithiyanathan, ‘‘Combined artificial neural network Workshop (ICDMW), Atlantic City, NJ, USA, Nov. 2015, pp. 839–846.
and genetic algorithm for cloud classification,’’ Int. J. Eng. Res. Technol., [50] I. Strumberger, N. Bacanin, M. Tuba, and E. Tuba, ‘‘Resource scheduling
vol. 5, pp. 787–794, Apr. 2013. in cloud computing based on a hybridized whale optimization algorithm,’’
[29] Data Preprocessing for Machine Learning: Options and Recommen- Appl. Sci., vol. 9, no. 22, p. 4893, Nov. 2019.
dations. Accessed: May 17, 2020. [Online]. Available: https://cloud. [51] D. Pebrianti, A. Nurnajmin, B. Luhur, A. Nor Rul Hasma, Z. Zainah, and
google.com/solu tions/machine-learning/data-preprocessing-for-ml- I. Riyanto, ‘‘Extended bat algorithm (EBA) as an improved searching
with-tf-transform-pt2 optimization algorithm,’’ in Proc. 10th Nat. Tech. Seminar Underwater
[30] S. C. Chu, P. W. Tsai, and J. S. Pan, ‘‘Cat swarm optimization,’’ in PRICAI Syst. Technol. (NUSYS). Singapore: Springer, 2018.
2006: Trends in Artificial Intelligence (Lecture Notes in Computer Sci- [52] D. Chaudhary and B. Kumar, ‘‘Cloudy GSA for load scheduling
ence), vol. 4099, Q. Yang and G. Webb, Eds. Berlin, Germany: Springer, in cloud computing,’’ Appl. Soft Comput., vol. 71, pp. 861–871,
2006. Oct. 2018.
[31] A. M. Ahmed, T. A. Rashid, and S. A. M. Saeed, ‘‘Cat swarm optimiza- [53] S. Torabi and F. Safi-Esfahani, ‘‘A dynamic task scheduling framework
tion algorithm: A survey and performance evaluation,’’ Comput. Intell. based on chicken swarm and improved raven roosting optimization meth-
Neurosci., vol. 2020, Jan. 2020, 4854895. ods in cloud computing,’’ J. Supercomput., vol. 74, no. 6, pp. 2581–2626,
[32] C. E. Klein, L. dos Santos Coelho, Â. M. O. Sant’Anna R. Z. Freire, and Jun. 2018.
V. C. Mariani, ‘‘Improved cat swarm optimization approach applied to [54] A. Al-Hamodi, S. Lu, and Y. Al-Salhi, ‘‘An enhanced frequent pat-
reliability-redundancy problem,’’ in Proc. 22nd Eur. Symp. Artif. Neural tern growth based on MapReduce for mining association rules,’’ Int.
Netw. (ESANN), Bruges, Belgium, Apr. 2014, pp. 1–6. J. Data Mining Knowl. Manage. Process, vol. 6, no. 2, pp. 19–28,
[33] (Jan. 1, 2016). Using a PostgreSQL Database as a Source for AWS DMS. 2016.
Accessed: May 9, 2020. [Online]. Available:https://docs.aws.amazon. [55] F.-H. Tseng, X. Wang, L.-D. Chou, H.-C. Chao, and V. C. M. Leung,
com/dms/latest/userguide/CHAP_Source.PostgreSQL.html ‘‘Dynamic resource prediction and allocation for cloud data center using
[34] S. Vrajesh and M. Bala, ‘‘An improved task allocation strategy in cloud the multiobjective genetic algorithm,’’ IEEE Syst. J., vol. 12, no. 2,
using modified K-means clustering technique,’’ Egyptian Inform. J., pp. 1688–1699, Jun. 2018.
vol. 4, pp. 1–8, 2020. [56] D. Soni, A. Mishra, and H. Gupta, ‘‘An efficient cloud data mining (CDM)
[35] K. Sekaran, M. S. Khan, R. Patan, A. H. Gandomi, P. V. Krishna, and algorithm for frequent pattern mining in cloud computing environment,’’
S. Kallam, ‘‘Improving the response time of M-Learning and cloud com- Lect. Notes Softw. Eng, vol. 4, no. 3, pp. 234–237, 2016.
puting environments using a dominant firefly approach,’’ IEEE Access, [57] M. S. Sudheer and M. Dr Vamsi Krishna 2019, ‘‘Dynamic PSO for task
vol. 7, pp. 30203–30212, 2019. scheduling optimization in cloud computing,’’ Int. J. Recent Technol.
[36] M. Kumar and S. C. Sharma, ‘‘PSO-based novel resource scheduling tech- Eng., vol. 8, no. 2S11, pp. 3559–3589, 2019.
nique to improve QoS parameters in cloud computing,’’ Neural Comput. [58] K. Mangayarkkarasi and M. Chidambaram, ‘‘An intelligent service rec-
Appl., vol. 194, pp. 1–24, Jun. 2019. ommendation model for service usage pattern discovery in secure cloud
[37] S. Rongali and R. Yalavarthi, ‘‘An improved ant colony optimization for computing environment,’’ J. Theor. Appl. Inf. Technol., vol. 95, no. 12,
parameter optimization using support vector machine,’’ Int. J. Eng. Adv. pp. 3500–3512, 2017.
Technol. (IJEAT), vol. 6, no. 3, pp. 198–204, 2017. [59] A. Bouzidi, M. E. Riffi, and M. Barkatou, ‘‘Cat swarm optimization for
[38] A. Pourghaffari and M. Barar, ‘‘Workflow scheduling in cloud comput- solving the open shop scheduling problem,’’ J. Ind. Eng. Int., vol. 15,
ing environment using hybrid CSO-DA,’’ Int. J. Nonlinear Anal. Appl., no. 2, pp. 367–378, Jun. 2019.
vol. 10, no. 2, pp. 177–188, 2019. [60] M. Kumar, SC. Sharma, ‘‘Dynamic load balancing algorithm for balanc-
[39] A. M. Senthil Kumar and M. Venkatesan, ‘‘Multi-objective task schedul- ing the workload among virtual machine in cloud computing,’’ in Proc.
ing using hybrid genetic-ant colony optimization algorithm in cloud 7th Int. Conf. Adv. Comput. Commun. (ICACC), Cochin, India, Aug. 2017,
environment,’’ Wireless Pers. Commun., vol. 107, no. 4, pp. 1835–1848, pp. 322–329.
Aug. 2019. [61] L. Zhou and X. Wang, ‘‘Research of the FP-growth algorithm based
[40] A. Thakur and M. S. Goraya, ‘‘A taxonomic survey on load balancing in on cloud environments,’’ J. Software, vol. 9, no. 3, pp. 676–683,
cloud,’’ J. Netw. Comput. Appl., vol. 98, pp. 43–57, Nov. 2017. 2014.


[62] A. K. Maurya and A. K. Tripathi, ‘‘Deadline-constrained algorithms for [83] J. Meena, M. Kumar, and M. Vardhan, ‘‘Cost effective genetic algo-
scheduling of bag-of-tasks and workflows in cloud computing environ- rithm for work?ow scheduling in cloud under deadline constraint,’’ IEEE
ments,’’ in Proc. 2nd Int. Conf. High Perform. Compilation, Comput. Access, vol. 4, pp. 5065–5082, 2016.
Commun. (HP3C), Hong Kong, Mar. 2018, pp. 6–10. [84] A. Nazia and D. Huifang, ‘‘A hybrid metaheuristic for multi-objective
[63] R. Rautray and R. C. Balabantaray, ‘‘Cat swarm optimization based scientific workflow scheduling in a cloud environment,’’ Appl. Sci., vol. 8,
evolutionary framework for multi document summarization,’’ Phys. A, no. 4, p. 538, 2018.
Stat. Mech. Appl., vol. 477, pp. 174–186, Jul. 2017. [85] W. Zhong, Y. Zhuang, J. Sun, and J. Gu, ‘‘A load prediction model
[64] M. Meyer, J. Beutel, and L. Thiele, ‘‘Unsupervised feature learning for cloud computing using PSO-based weighted wavelet support vec-
for audio analysis,’’ in Proc. 5th Int. Conf. Learn. Represent. (ICLR), tor machine,’’ Int. J. Speech Technol., vol. 48, no. 11, pp. 4072–4083,
Workshop Track, Toulon, France, 2017, pp. 1–4. Nov. 2018.
[65] G. Danlami, A. S. Ismail, A. Zainal, Z. Zakaria, A. Abraham, and [86] M. Ashouraei, S. N. Khezr, R. Benlamri, and N. J. Navimipour, ‘‘A new
N. M. Dankolo, ‘‘Cloud customers service selection scheme based on SLA-aware load balancing method in the cloud using an improved paral-
improved conventional cat swarm optimization,’’ Neural Comput. Appl., lel task scheduling algorithm,’’ in Proc. IEEE 6th Int. Conf. Future Inter-
vol. 6, pp. 1–22, 2020. net Things Cloud (FiCloud), Barcelona, Spain, Aug. 2018, pp. 71–76.
[66] P. Bajare, M. Bhoyate, Y. Bhujbal, E. Monika, and V. Shinde, ‘‘k-nearest [87] N. Sharma and S. Maurya, ‘‘SLA-based agile VM management in cloud
neighbor classification over encrypted cloud data,’’ IOSR J. Comput. Eng. & datacenter,’’ in Proc. Int. Conf. Mach. Learn., Big Data, Cloud Parallel
(IOSR-JCE), pp. 45–48, 2015. Comput. (COMITCon)), Faridabad, India, Feb. 2019, pp. 252–257.
[67] D. Gabi, A. S. Ismail, A. Zainal, Z. Zakaria, and A. Abraham, ‘‘Orthog- [88] A. Kumar and S. Bawa, ‘‘A comparative review of meta-heuristic
onal taguchi-based cat algorithm for solving task scheduling problem in approaches to optimize the SLA violation costs for dynamic execution of
cloud computing,’’ Neural Comput. Appl., vol. 30, no. 6, pp. 1845–1863, cloud services,’’ Soft Comput., vol. 24, no. 6, pp. 3909–3922, Mar. 2020.
Sep. 2018. [89] X. Song, Y. Ma, and D. Teng, ‘‘A load balancing scheme using federate
[68] K. Liu and J. Boehm, ‘‘Classification of big point cloud data using cloud migration based on virtual machines for cloud simulations,’’ Math. Prob-
computing,’’ ISPRS-Int. Arch. Photogramm., Remote Sens. Spatial Inf. lems Eng., vol. 2015, Mar. 2015, Art. no. 506432.
Sci., vol. 40, no. 3, p. 553, 2015. [90] J. Rouzaud-Cornabas, ‘‘A distributed and collaborative dynamic load bal-
[69] L. Zuo, L. Shu, S. Dong, C. Zhu, and T. Hara, ‘‘A multi-objective ancer for virtual machine,’’ in Proc. Eur. Conf. Parallel Process. (ECPP),
optimization scheduling method based on the ant colony algorithm in Ischia, Italy, 2010, pp. 641–648.
cloud computing,’’ IEEE Access, vol. 3, pp. 2687–2699, 2015. [91] T. Jamal and A. Enrique, ‘‘Metaheuristics for energy-efficient data rout-
[70] K.-C. Lin, K.-Y. Zhang, Y.-H. Huang, J. C. Hung, and N. Yen, ‘‘Feature ing in vehicular networks,’’ Int. J. Metaheuristics, vol. 4, no. 1, pp. 27–56,
selection based on an improved cat swarm optimization algorithm for 2015.
big data classification,’’ J. Supercomput., vol. 72, no. 8, pp. 3210–3221, [92] K. Saleem and N. Fisal, ‘‘Enhanced ant colony algorithm for self-
Aug. 2016. optimized data assured routing in wireless sensor networks,’’ in Proc. 18th
IEEE Int. Conf. Netw. (ICON), Dec. 2012, pp. 422–427.
[71] R. Latif, H. Abbas, S. Latif, and A. Masood, ‘‘EVFDT: An enhanced
[93] L. Vu and G. Alaghband, ‘‘A load balancing parallel method for frequent
very fast decision tree algorithm for detecting distributed denial of service
pattern mining on multi-core cluster,’’ in Proc. Symp. High Perform.
attack in cloud-assisted wireless body area network,’’ Mobile Inf. Syst.,
Comput., San Diego, CA, USA, 2015, pp. 49–58.
vol. 2015, pp. 1–13, 2015, 260594.
[94] N. Susila, ‘‘An efficient load balancing approach for energy aware cloud
[72] D. Gabi, A. S. Ismail, and N. M. Dankolo, ‘‘Minimized makespan
environment,’’ Ph.D. dissertation, Dept. Inf. Commun. Eng., Anna Univ.,
based improved cat swarm optimization for efficient task scheduling in
Chennai, India, 2017.
cloud datacenter,’’ in Proc. 3rd High Perform. Comput. Cluster Technol.
[95] R. Khorsand and M. Ramezanpour, ‘‘An energy-efficient task-scheduling
Conf. (HPCCT), New York, NY, USA, Jun. 2019, pp. 16–20.
algorithm based on a multi-criteria decision-making method in cloud
[73] C. Bae, N. Wahid, Y. Y. Chung, and W. C. Yeh, ‘‘Effective audio classifi-
computing,’’ Int. J. Commun. Syst., vol. 33, no. 9, p. e4379, 2020.
cation algorithm swarm-based optimization,’’ Int. J. Innov. Comput., Inf.
[96] R. M. Alguliyev, Y. N. Imamverdiyev, and F. J. Abdullayeva, ‘‘PSO-based
Control, vol. 10, no. 1, pp. 151–167, 2014.
load balancing method in cloud computing,’’ Autom. Control Comput.
[74] B. Panchal and R. K. Kapoor, ‘‘Performance enhancement of cloud Sci., vol. 53, no. 1, pp. 45–55, Jan. 2019.
computing with clustering,’’ Int. J. Eng. Adv. Technol., vol. 14, no. 6, [97] E. Rafieyan, R. Khorsand, and M. Ramezanpour, ‘‘An adaptive schedul-
pp. 37–40, 2014. ing approach based on integrated best-worst and VIKOR for cloud com-
[75] D. Gabi, A. S. Ismail, A. Zainal, Z. Zakaria, and A. Al-Khasawneh, puting,’’ Comput. Ind. Eng., vol. 140, Feb. 2020, Art. no. 106272.
‘‘Hybrid cat swarm optimization and simulated annealing for dynamic [98] I. Mierswa and K. Morik, ‘‘Automatic feature extraction for classifying
task scheduling on cloud computing environment,’’ J. Inf. Commun. audio data,’’ Mach. Learn., vol. 58, nos. 2–3, pp. 127–149, Feb. 2005.
Technol., vol. 17, no. 3, pp. 435–467, Jun. 2018. [99] (2010). UCI Machine Learning Repository. Accessed: May 20, 2020.
[76] D. Danilo, G. Fenu, M. Marras, and D. R. Recupero, ‘‘Bridging learning [Online]. Available: http://archive.ics.uci.edu/ml
analytics and cognitive computing for big data classification in micro- [100] R. N. Calheiros, R. Ranjan, C. A. F. De Rose, and R. Buyya, ‘‘CloudSim:
learning video collections,’’ Comput. Hum. Behav., vol. 92, pp. 468–477, A novel framework for modeling and simulation of cloud computing
2019. infrastructures and services,’’ 2009, arXiv:0903.2525. [Online]. Avail-
[77] C. Sauvanaud, G. Silvestre, M. Kaaniche, and K. Kanoun, ‘‘Data able: https://arxiv.org/abs/0903.2525
stream clustering for online anomaly detection in cloud applications,’’ [101] D. Kai-Bo, C. R. Jagath, and M. N. Nguyen, ‘‘One-versus-one and one-
in Proc. 11th Eur. Dependable Comput. Conf. (EDCC), Sep. 2015, versus-all multiclass SVM-RFE for gene selection in cancer classifica-
pp. 120–131. tion,’’ in Proc. Eur. Conf. Evol. Comput., Mach. Learn. Data Mining
[78] L. Tu and Y. Chen, ‘‘Stream data clustering based on grid density and Bioinf. Berlin, Germany: Springer, 2007, pp. 47–56.
attraction,’’ ACM Trans. Knowl. Discovery Data, vol. 3, no. 3, pp. 1–27, [102] R. Karthika and P. Visalakshi, ‘‘A hybrid ACO based feature selection
Jul. 2009. method for email spam classification,’’ WSEAS Trans. Comput., vol. 14,
[79] Y. Kumar and P. K. Singh, ‘‘Improved cat swarm optimization algo- pp. 171–177, 2015.
rithm for solving global optimization problems and its application to [103] B. Çiğşar, D. Ünal, ‘‘Comparison of data mining classification algo-
clustering,’’ Int. J. Speech Technol., vol. 48, no. 9, pp. 2681–2697, rithms determining the default risk,’’ Sci. Program., vol. 2019, Feb. 2019,
Sep. 2018. Art. no. 8706505.
[80] P. Bisht and K. Singh, ‘‘Big data mining: Analysis of genetic K-means [104] P. Kanu, J. Vala, and P. Jaymit, ‘‘Comparison of various classification
algorithm for big data clustering,’’ Int. J. Adv. Res. Comput. Sci. Softw. algorithms on iris datasets using WEKA,’’ Int. J. Advance Eng. Res.
Eng., vol. 6, no. 7, pp. 223–228, 2016. Develop., vol. 1, pp. 1–7, Feb. 2014.
[81] J. Zgraja and M. Woniak, ‘‘Drifted data stream clustering based on [105] B. Desgraupes, ‘‘Clustering indices,’’ Ouest-Lab Modal’X, Univ. Paris,
ClusTree algorithm,’’ in Proc. Int. Conf. Hybrid Artif. Intell. Syst. Cham, Paris, France, Tech. Rep., 2013, pp. 1–34.
Switzerland: Springer, 2018, pp. 338–349. [106] M. Ghobaei-Arani, A. A. Rahmanian, A. Rahmanian, A. Souri, and
[82] K. Dubey, M. Kumar, and S. C. Sharma, ‘‘Modified HEFT algorithm for A. M. Rahmani, ‘‘A moth-flame optimization algorithm for Web service
task scheduling in cloud environment,’’ Procedia Comput. Sci., vol. 125, composition in cloud computing: Simulation and verification,’’ Softw.,
pp. 725–732, 2018. Pract. Exper., vol. 48, no. 10, pp. 1865–1892, 2018.


MUHAMMAD JUNAID is currently pursuing the Ph.D. degree with Iqra University, Islamabad, Pakistan. His research interests include cloud computing, blended learning, machine learning, swarm intelligence, and information security.

OSMAN KHALID received the master’s degree from the Center for Advanced Studies in Engineering and the Ph.D. degree from North Dakota State University, USA. He is currently an Assistant Professor with COMSATS University Islamabad, Abbottabad. His research interests include recommender systems, network routing protocols, the Internet of Things, and fog computing.

ADNAN SOHAIL received the master’s degree in computer science from Bahria University, Islamabad, Pakistan, and the Ph.D. degree in electrical engineering and information technology from the Institute of Telecommunications, Vienna University of Technology, Vienna, Austria. He was an Assistant Professor with CU and IIUI, Pakistan. He is currently an Assistant Professor with the Computing and Technology Department, Iqra University, Islamabad. His main research interests include performance modeling and analysis of communication networks, cloud computing, optimization techniques, artificial intelligence, and agent-based computing.

IMRAN ALI KHAN received the master’s degree from Gomal University, Dera Ismail Khan, Pakistan, and the Ph.D. degree from the Graduate University Chinese Academy of Sciences, China. He is currently an Associate Professor with the Department of Computer Science, COMSATS University Islamabad, Abbottabad, Pakistan. He has produced over 50 publications in journals of international repute and presented papers at international conferences. His research interests include wired and wireless networks and distributed systems.

RAO NAVEED BIN RAIS received the M.S. and Ph.D. degrees in computer engineering (networks and distributed systems) from the University of Nice Sophia Antipolis, France, in 2007 and 2011, respectively. He has more than 15 years of experience in teaching, research, and industrial development. He is currently an Associate Professor with the Department of Electrical and Computer Engineering, College of Engineering and Information Technology, Ajman University, United Arab Emirates. His research interests include network protocols and architectures, information-centric and software-defined networks, network virtualization, machine learning, and internet naming and addressing issues.

SYED SAJID HUSSAIN received the Master of Science degree in computer science from COMSATS University Islamabad (CUI), Pakistan, in 2007, and the Ph.D. degree from the FernUniversität in Hagen, Germany, in 2013. He is currently an Assistant Professor with CUI, Abbottabad. His research interests include collaborative computing, human-computer interaction, and distributed systems.

ADEEL AHMED (Graduate Student Member, IEEE) received the M.Phil. degree in computer science from Quaid-i-Azam University, Islamabad, Pakistan, in 2011. He was with the software industry for a few years. He is currently a Ph.D. Scholar with the Department of Computer Science, Quaid-i-Azam University. He is also with the Faculty of Information Technology, The University of Haripur, Pakistan. His research interests include social network analysis, recommender systems, machine learning, and swarm intelligence.

NAVEED EJAZ received the B.S. degree in computer sciences from the National University of Computer and Emerging Sciences, Islamabad, Pakistan, the M.S. degree in computer sciences from the National University of Sciences and Technology, Islamabad, and the Ph.D. degree in digital contents engineering from Sejong University, South Korea, in August 2013. He was a Faculty Member with reputed universities in Pakistan and Saudi Arabia. His research interests include video summarization, video tagging, object detection in videos, and applications of deep learning in other image and video domains.
