Benchmarking Serverless Workloads on Kubernetes
Abstract—As a disruptive paradigm in the cloud landscape, Serverless Computing is attracting attention because of its unique value propositions to reduce operating costs and outsource infrastructure management. Nevertheless, enterprise Function-as-a-Service (FaaS) platforms may pose significant risks such as vendor lock-in, lack of security control due to multi-tenancy, complicated pricing models, and legal and regulatory compliance—particularly in mobile computing scenarios. This work proposes a production-grade fault-tolerant serverless architecture based on a highly-available Kubernetes topology using an open-source framework, deployed on OpenStack instances, and benchmarked with a realistic scaled-down Azure workload traces dataset. By measuring success rate, throughput, latency, and auto-scalability, we have managed to assess not only resilience but also sustained performance under a logistic model for three distinct representative workloads. Our test executions show, with 95% confidence, that between 70 and 90 concurrent users can access the system while experiencing acceptable performance. Beyond the breaking point identified (i.e. 91 transactions per second), the Kubernetes cluster has to be scaled up or scaled out to meet the QoS and availability requirements.

Index Terms—Serverless, OpenFaas, High Availability, Workload modelling, Service Level Agreement, SLA, Mobile Computing, Azure, Containers

I. INTRODUCTION

Serverless computing is a recent cloud service model which allows stateless, short-duration, and event-driven functions to be executed on abstracted containers with granular scaling at cloud datacenters. Serverless computing is commonly divided into Backend-as-a-Service (BaaS) and Function-as-a-Service (FaaS).

While BaaS focuses on traditional server-based components such as databases, authentication, storage, hosting, etc., FaaS allows developers to write the business logic in the form of stateless functions, relieving programmers from operational aspects of the underlying infrastructure. This paper will therefore focus on the FaaS side.

Widely popular because of its pay-per-use billing strategy, FaaS enables a "serverless function", typically a code snippet, to be executed on-demand on an operating-system container in response to triggered events. Open-source FaaS frameworks are vendor-agnostic and provide developers with the flexibility of developing applications in multiple programming languages, effectively decoupling the underlying cloud platform from the business logic.

The main motivations behind serverless adoption are cost savings, reduced management overhead, and faster time to market. Gartner, a leading global research firm, forecasts that approximately half of global enterprises will embrace serverless by 2025, up from 20% today [1]. These insights are echoed by Forrester [2] for the COVID-19 pandemic recovery period, i.e. by the end of 2021. Moreover, by using private/dedicated cloud infrastructures, different organisations can arguably fulfill distinct regulatory and compliance requirements and offer enhanced security control.

The key contributions of this research are the following:
• Develop a fault-tolerant and highly available serverless architecture and verify the feasibility of open-source FaaS frameworks on private cloud.
• Benchmark serverless functions by developing a realistic workload model exploiting the workload characterization features of the Azure production dataset.
• Perform data modelling by regression analysis to understand the functional relationship between concurrency and response times of various mobile workload patterns.

To evaluate our approach, we have deployed a master-worker high-availability Kubernetes infrastructure on a private OpenStack cloud. Designed for clusters, Kubernetes supports multiple "pods" across different systems (physical or virtual machines) to allow seamless horizontal scaling for dynamic workloads. Pods have been deployed as interconnected Docker containers integrated with OpenFaas to underpin cluster interaction and monitoring.

We have benchmarked our approach using an open dataset of Azure Cloud traces, and our findings suggest that the auto-scalability of OpenFaas on OpenStack is seamless with the proposed architecture. With the increase in concurrency, response times increased and the success rate decreased, which is consistent with the overall serverless trend. Our work has shown that exploiting workload characterisation and implementing a scaled-down load leads to an appropriate workload model. Based on these results, we have modelled the response-time data set. Our results, i.e. the logistic regression data model, showed that the number of concurrent users that can safely access the application within SLAs, with a 95% confidence interval, is within the range of 70–90 for a given cluster size. The data model can also be used to assess the performance of the serverless framework for any value of concurrency without actually running the benchmarks. Hence, the workload model can be adopted for continuous performance testing of real-world serverless applications to avoid performance regressions.

The rest of the paper is organised as follows. Section II discusses the related work relevant to this paper. Section III summarises the design details of the proposed HA architecture. Section IV describes the experimental setup, and Section V presents the performance test results.
scalability issues with significantly lower latency. Some AWS EC2 instance types are subject to automatic throttling because of their CPU-credits system. Hence, the prior test beds arguably do not represent a production-like setup, and the performance results reported cannot serve as a real baseline.

In contrast, we have modelled the workload for FaaS performance evaluation using the characterisation of real FaaS traces. We also study the performance of the framework under different load conditions for various load patterns. Finally, we also perform data modelling on a response-time data set using statistical regression techniques. The confidence level, statistical significance, and functional relationship between response time and concurrency are established, which will aid in effective business decisions and predictions. Besides, there is no empirical literature identified on serverless with private clouds. This study addresses the gaps in the existing literature by proposing a fault-tolerant serverless architecture on private cloud and benchmarking it with fully open-source technologies such as Kubernetes, OpenFaas, JMeter, OpenStack, Grafana, Kubespray, and Python.
III. PROPOSED ARCHITECTURE
In this work, we have selected a High Availability (HA) topology in order to provide high performance, stability, and fault tolerance. We have chosen a minimum of three master nodes for redundancy based on Raft [17], a consensus algorithm which extends the seminal Castro-Liskov work [18], internally implemented via Etcd. For a distributed cluster with N master nodes, Raft keeps the system operational as long as at most ⌊(N − 1)/2⌋ nodes fail, i.e. while a quorum of ⌊N/2⌋ + 1 master nodes remains available. For a cluster to be highly available with at least one node up and running all the time, a minimum of three master/worker/etcd nodes is required. A brief description of the architectural components is presented below:

A master node is responsible for the maintenance of the cluster. For all administrative activities in the cluster, the master node serves as the entry point. A worker node is a physical machine, device, rack server or a VM which is capable of running Linux containers to provide a run-time environment. Worker nodes maintain pods and are controlled by the master node. Etcd is a distributed key-value store that is used to manage the cluster state. Config details including secrets, subnets and configmaps are also stored in etcd. Each external etcd node in the cluster communicates with the master node via kube-apiserver.
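To make this sizing rule concrete, the following minimal Python sketch (our own illustration of the arithmetic above, not part of the deployed stack) computes the quorum size and the number of tolerable node failures for a given number of master nodes:

# Raft sizing sketch: quorum and tolerated failures for an N-node
# master/etcd group. Illustrative only; our topology uses N = 3.
def raft_sizing(n_masters: int):
    quorum = n_masters // 2 + 1         # floor(N/2) + 1 nodes must agree
    tolerated = (n_masters - 1) // 2    # floor((N-1)/2) nodes may fail
    return quorum, tolerated

for n in (1, 3, 5, 7):
    q, f = raft_sizing(n)
    print(f"N={n}: quorum={q}, tolerated failures={f}")
# N=3 yields quorum=2 and one tolerated failure, matching the
# three-master minimum for high availability.

Note that an even node count adds no extra fault tolerance over the preceding odd count, which is why odd cluster sizes are preferred.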
OpenFaaS is an open-source serverless framework for building cloud functions, available on Github under the terms and conditions of the MIT license. OpenFaaS supports multiple runtimes such as C#, Go, Java, Python, Ruby, NodeJS, PHP, Dockerfile for ARMHF, and Dockerfile. The main architectural components are as follows. The OpenFaas Gateway is used to deploy and invoke functions; users can interact with the gateway through a UI. Prometheus monitors the environment, tracks cloud-native metrics, and reports them to the API gateway; OpenFaas ships with the Prometheus service integration by default. NATS is used for queuing and asynchronous execution of functions. When a function is deployed, it creates multiple Pods depending on the scaling parameters set by the user. Functions can be scaled to zero and back again in OpenFaas by using faas-idler or the REST API. The Function WatchDog converts Dockerfiles into serverless functions. It serves as an entry point for the HTTP requests by interacting with the processes and the caller. It acts as an "init process" supporting health checks, concurrent requests and timeouts. OpenFaas supports two watchdog modes, namely of-watchdog (HTTP) mode and classic mode. HTTP mode is suited for resource-intensive or streaming operations, and the of-watchdog keeps functions alive between invocations.

IV. EXPERIMENTAL SETUP

Our serverless architecture is composed of ten m1.xlarge Nova virtual machines on the NCI OpenStack private cloud (https://cloud.ncirl.ie). Configured with 8 vCPUs, 160 GB of disk and 16 GB of RAM, each virtual machine runs Ubuntu 18.04 LTS. All the worker nodes have Docker v19.03 installed, and Kubernetes v1.19 is used for container management and orchestration, created using an Ansible-based Kubespray provisioner. Calico v3.16 is applied for container/pod networking in the multi-master Kubernetes cluster. JMeter 5.3, used as the HTTP trigger to the functions, is installed on the cluster as a Docker container. Ansible v2.9.6 and Jinja2 v2.11.1 are installed as pre-requisites for running the Kubespray tool through pip installer v20.2.4. The CPU-intensive function is implemented in Python v3.6.9. The OpenFaas serverless framework deployed on the testbed has the following core components installed by default: OpenFaas Gateway v0.18.18, Prometheus v19.03, Queue Worker v0.11.2, Basic-auth plugin v0.20.3, NATS streaming server v0.19, and Alert manager v0.16. Faas-cli v0.12.14 and the Grafana v4.6.3 dashboard are integrated with OpenFaas explicitly for cluster interaction and monitoring respectively.

We have employed an open dataset of Azure workload traces for performance evaluation, freely available on Github as the AzurePublicDataset-2019¹. It contains 14 time series spanning July 15th to 29th, each covering 24 hours; July 20th, 21st, 27th and 28th fall on weekends, and the rest are weekdays. Three files are available: i) Function Invocation Counts; ii) Function Execution Duration; and iii) Application Memory. The salient features of the workload characterisation are [19]:

1) On average, more than 50% of functions have an execution time of less than 1 second, and 96% of functions execute for less than 60 s.
2) 90% of the applications allocate less than 400 MB, and 50% of the applications consume a maximum of 170 MB.
3) 54% of the applications possess one function, and 95% of the applications have a maximum of 10 functions.

¹https://github.com/Azure/AzurePublicDataset/blob/master/AzureFunctionsDataset2019.md
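To illustrate how these per-minute invocation counts translate into the per-second load levels used in our test scenarios (Section IV), the sketch below derives the peak TPS and a scaled-down series. It assumes pandas, a layout of one row per function with id/hash columns followed by one column per minute, and an illustrative window of five samples around the peak; the file name is likewise an assumption:

# Sketch: derive peak TPS and a scaled-down load series from the Azure
# invocation-counts file. File name, column layout, and the five-sample
# peak window are assumptions for illustration.
import pandas as pd

df = pd.read_csv("invocations_per_function_md.anon.d04.csv")
minute_cols = df.columns[4:]          # assumes id/hash columns come first

row = df.iloc[16082][minute_cols].astype(float).to_numpy()
peak_per_minute = row.max()           # e.g. 168943 invocations per minute
peak_tps = peak_per_minute / 60       # e.g. 168943 / 60 = 2815.7 TPS

factor = 30                           # scale-down factor for the testbed
pos = row.argmax()
window = row[max(pos - 2, 0): pos + 3]           # five samples around the peak
series = [round(v / 60 / factor) for v in window]
print(peak_tps, series)               # e.g. 2815.7 and values such as 26, 81, 93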
Figure 1. Proposed fault-tolerant serverless architecture. It uses a minimum of three master nodes for redundancy based on Raft, where master nodes serve
as the entry point.
significantly larger portion of applications.

The parameterisation of the simulation is as follows:
• Choice of function: A quicksort function, which sorts a random list of a thousand numbers in the range [1, 10000], is used to simulate a CPU-intensive workload; the image is checked into the public Docker registry (docker pull hma2308/rand:latest). A minimal sketch of such a handler is given after this list.
• Throughput: To calculate the throughput in transactions per second (TPS), we have considered the July 18th workload invocation counts per function (md.anon.d04), which contain the number of invocations of each function on a per-minute basis. The hash function on row#16082 has a maximum of 168943 transactions per minute. The peak load hour occurs from column ASM to AUT for row#16082 precisely. Hence, peak TPS = 168943/60 = 2815.7 for all the test scenarios in this paper. The workload simulated in this research is scaled down by a factor of 30, 40, 50, 10, 20, etc. to fit the test cluster configuration created on the NCI OpenStack private cloud (https://cloud.ncirl.ie).
• Trigger type: We have chosen HTTP triggers for invoking functions synchronously.
• Workload model: We have fixed the number of concurrent users (thread count) to 1500 and the test execution duration to 25 minutes, with ramp-up and ramp-down times of 10 minutes each. The ramp-up and ramp-down times are defined as per the recommended best practices to determine predicted delays between the start of each concurrent user invoking the function. A Poisson distribution of the ramp-up time is purposely avoided and, therefore, a fixed-interval delay is chosen as per the workload characterisation results of the Azure workloads. The scaled-down TPS values are computed as explained above for varying the concurrency. Time gaps between load variations are set to 1 minute for all the workload patterns.
We have set the minimum function replicas to 1 in order to avoid cold starts, as their presence causes significant delays in performance evaluation. The average throughput value is calculated automatically by the JMeter Throughput Shaping Timer for the corresponding TPS values seeded for the test executions. The memory and CPU values in OpenFaas are set to defaults to avoid throttling of CPU performance based on memory size, so that OpenFaas behaves just like a Kubernetes pod scheduler.
• Test harness: JMeter, the open-source distributed load-testing tool, is used to generate the load for each test scenario considered. The Blazemeter Throughput Shaping Timer, an open-source JMeter plugin, is installed separately to simulate the workloads for a given invocations-per-second (TPS) value. The Ultimate Thread Group provided by the JMeter plugins library is chosen to create realistic load profiles with concurrent users, load duration, ramp-up times, etc. Finally, the test results (.jtl logs) can be viewed via the 'Aggregate Report' feature of the JMeter tool. OpenFaas ships with default Prometheus integration, a dashboard for monitoring and logging metrics of functions. Additionally, we have configured visualisation dashboards using Grafana, an open-source monitoring tool, to turn those metrics into a useful dashboard view.
• Test scenarios: In this research, we consider Spiky/bursty, Growing and Flat workload patterns, which are the best representations of mobile cloud workloads, to benchmark the System Under Test (SUT).
In Test scenario-1 (TS1), we consider an unpredictable peak workload with varying spikes, row#16082 of the Azure dataset. In this scenario, for studying the behaviour of auto-scaling and concurrency, we consider workloads scaled down by a factor of 40 (Concurrency series1), 30 (Concurrency series2) and 10 (Concurrency series3). The calculation for the factor-30 series is as follows: the peak-hour data consists of five samples, and to scale the load down by 30 we divide all five TPS values by 30. The scaled-down TPS values are 784/30 = 26; 2448/30 = 81; 2815.7/30 = 93.66; and so on. The TS1 dataset thus formed, 26, 81, 93, 76, 13, is seeded into the JMeter tool. Hence, the peak concurrency of the TS1 series is 93. Similar logic is extended to the other concurrency series and to the Growing and Flat workload types.
In Test scenario-2, we use a Flat workload with constant requests per second, row#8922 of the Azure dataset. Test scenario-3 is a constantly increasing workload, row#41113 of the Azure dataset. In both cases, the TPS value is calculated and scaled to maintain the peak TPS value so as to create enough load on the system.
Figures 2, 3 and 4 show the .jmx test plans generated by JMeter for the experiments. For testing the auto-scalability scenarios, we deploy the function with minimum replicas set to 1 and maximum replicas set to 5, 10, 50, 70 and 100 using the OpenFaas labels.
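As referenced in the Choice of function item above, a minimal sketch of the CPU-intensive handler follows, assuming the handle(req) entry point of the OpenFaaS Python template; the actual image published as hma2308/rand:latest may differ in detail:

# handler.py -- minimal sketch of the CPU-intensive quicksort function,
# assuming the handle(req) entry point of the OpenFaaS Python template.
import random

def quicksort(xs):
    if len(xs) <= 1:
        return xs
    pivot = xs[len(xs) // 2]
    return (quicksort([x for x in xs if x < pivot])
            + [x for x in xs if x == pivot]
            + quicksort([x for x in xs if x > pivot]))

def handle(req):
    # Sort a random list of 1000 numbers drawn from [1, 10000].
    data = [random.randint(1, 10000) for _ in range(1000)]
    return str(quicksort(data)[:10])  # small slice as the HTTP response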
a) Evaluation metrics: 1) Response time: the time taken for the HTTP request to be resolved; 2) Throughput (TPS): the number of HTTP requests/transactions satisfied per second—in this case, the TPS values are derived from the Azure data set; 3) Success rate: the ratio of the number of successful transactions to the total number of transactions; 4) Auto-scalability: the ability of the FaaS framework to scale the function deployment up or down on demand; 5) Fault tolerance: the capability of the system to ensure zero downtime; and 6) High availability: the ability of the architecture to maintain replicas of nodes to withstand a single-node failure.
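The first three metrics can be computed directly from the .jtl logs mentioned above; a sketch assuming JMeter's default CSV log format (timeStamp and elapsed in milliseconds, success as true/false) follows:

# Sketch: evaluation metrics from a JMeter .jtl result log, assuming the
# default CSV columns (timeStamp in ms, elapsed in ms, success flag).
import pandas as pd

jtl = pd.read_csv("results.jtl")

avg_response_ms = jtl["elapsed"].mean()                      # 1) response time
duration_s = (jtl["timeStamp"].max() - jtl["timeStamp"].min()) / 1000
throughput_tps = len(jtl) / duration_s                       # 2) throughput
success = jtl["success"].astype(str).str.lower().eq("true")
success_rate = success.mean() * 100                          # 3) success rate

print(f"avg={avg_response_ms:.0f} ms, tps={throughput_tps:.1f}, "
      f"success={success_rate:.2f}%")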
V. PERFORMANCE TEST RESULTS

We benchmark the SUT with the various load patterns and measure the average response time. As the throughput is a fixed value, the success rate and response time will vary for the different scenarios.
Figure 5. OpenFaas Grafana dashboard depicting 200-success and 502-failure for Test scenario-3
Figure 6. OpenFaas Grafana dashboard depicting execution duration for Test scenario-3
Figure 7. OpenFaas Grafana dashboard depicting Replica scaling for Test scenario-3
Figure 8. Impact of auto-scaling on response times depicted for the Spiky workload (response time in ms, log scale, vs. function replicas, for 218, 93, and 70 concurrent users).
Figure 9. Impact of auto-scaling on success rate depicted for the Spiky workload (success rate in %, vs. function replicas, for 218, 93, and 70 concurrent users).
Function Replicas | Average Response time (ms) | Throughput | Success Rate
—Concurrency series1—
1   | 99   | 16.6 | 100%
5   | 99   | 16.5 | 100%
—Concurrency series3—
1   | 6757 | 41.2 | 36.41%
5   | 5857 | 43.6 | 48.81%
10  | 5740 | 45.0 | 72.2%
50  | 5649 | 42.6 | 77.81%
70  | 5600 | 40.0 | 78.8%
100 | 5258 | 44.5 | 86.62%
Table II
PERFORMANCE EVALUATION RESULTS OF SPIKY WORKLOADS FOR VARIOUS SCALED-DOWN CONCURRENCY AND FUNCTION REPLICA (TPS) VALUES.

Figure 10. Impact of concurrency on response times depicted for the Spiky workload (response time in ms, log scale, vs. concurrency of 70, 93, and 218 users, for 1, 5, and 100 replicas).
Figure 11. Impact of concurrency on success rate depicted for the Spiky workload (success rate in %, vs. concurrency of 70, 93, and 218 users, for 1, 5, and 100 replicas).

the Kubernetes HPA scheduler, which is again dependent on the available memory of the cluster. From the replica-count graphs in Grafana (see Figure 7), it is evident that OpenFaas demonstrates a seamless auto-scaling capability on OpenStack. Our FaaS-Grafana dashboard also captures the response code, i.e. 200 for success and 502 for a bad-gateway error, as shown in Figure 5. It is interesting to note that the average response time increased slightly with auto-scaling for Workload A. This could be because of performance degradation of the OpenFaas gateway when there are too many function replicas serving the users simultaneously. In the second and third scenarios, with higher concurrent requests per workload, the success rate gradually increased from 67.39% to 100% and from 36.41% to 86.62% respectively with the increase in replica count, as depicted in Figures 8 and 9.

1) Impact of Concurrency: Figures 10 and 11 present the results of the performance evaluation of three concurrent workloads following a spiky load pattern. As the concurrency of the workload increased, the response time increased dramatically and the success rate decreased, observed specifically for replica count = 1 in our experiments. These results are in line with the research findings of Palade et al. [11] and Mohanty et al. [10], who evaluated OpenFaas on edge devices, edge networks and virtual machines using a managed Kubernetes and VMs set up by VirtualBox software.

B. Flat and Growing workload scenarios

Table III shows the response times and success ratios of the Growing and Flat type workloads with varying function replicas and with a TPS (concurrency, or peak invocations-per-second) value of 93, which is scaled down by a factor of 30.

Statistical similarity tests have been run on the different iterations of the same test scenario to check the similarity of the result datasets. We have recorded five iterations of Test scenario-2 results with function replicas ranging from 5 to 100 in intervals of 5. A Friedman test, with a 95% confidence interval, indicates that our test iteration results are statistically similar, χ²(4) = 3.9, p = 0.45.

We conduct similarity tests between the various workload types with the same concurrent users to assess the statistical difference between the serverless performance of the various workload patterns (Spiky, Flat and Growing types). A Kruskal-Wallis H
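Both similarity tests—the Friedman test across iterations and the Kruskal-Wallis H test across workload patterns—map directly onto standard SciPy routines; the sketch below uses synthetic response-time arrays purely for illustration (real inputs are the measured .jtl response times):

# Sketch: the similarity tests used above, on synthetic response-time arrays.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Five iterations of Test scenario-2, measured at the same replica settings.
runs = [rng.normal(105, 10, 20) for _ in range(5)]
chi2, p = stats.friedmanchisquare(*runs)
print(f"Friedman: chi2={chi2:.1f}, p={p:.2f}")   # paper: chi2(4)=3.9, p=0.45

# Same concurrency, different workload patterns.
spiky, flat, growing = (rng.normal(m, 10, 20) for m in (130, 105, 110))
h, p = stats.kruskal(spiky, flat, growing)
print(f"Kruskal-Wallis: H={h:.2f}, p={p:.3f}")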
FLAT WORKLOAD
Function Replicas | Average Response time (ms) | Throughput | Success Rate
1   | 7936 | 53.1 | 54.88%
5   | 105  | 59.9 | 100%
10  | 104  | 60.2 | 100%
50  | 106  | 59.9 | 100%
70  | 106  | 59.9 | 100%
100 | 107  | 60.0 | 100%
GROWING WORKLOAD
Function Replicas | Average Response time (ms) | Throughput | Success Rate
1   | 3632 | 30.5 | 75.53%
5   | 101  | 30.8 | 100%
10  | 102  | 30.7 | 100%
50  | 102  | 30.7 | 100%
Table III
PERFORMANCE EVALUATION RESULTS OF FLAT AND GROWING WORKLOADS FOR VARIOUS FUNCTION REPLICA VALUES (PEAK CONCURRENCY = 93).

Test | Statistic | df | Sig.
Kolmogorov-Smirnov | 0.537 | 21 | 0.000
Shapiro-Wilk | 0.231 | 21 | 0.000
Table IV
NORMALITY TEST RESULTS FOR VARIOUS FUNCTION REPLICAS OF THE SPIKY WORKLOAD RESPONSE-TIME DATA SET.

Figure 13. Response time models of all workloads (Spiky, Growing, and Flat) for various concurrency values and replica size = 100; a peak concurrency of 70–90 users per second (zoomed) can meet the performance SLA of 100 ms.
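The normality statistics in Table IV correspond to standard tests available in SciPy; a sketch on an illustrative sample of 21 response-time observations (real inputs are the measured averages) follows:

# Sketch: normality tests as in Table IV, on an illustrative sample of 21
# response-time observations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
resp = rng.lognormal(mean=6.0, sigma=1.0, size=21)   # skewed, like Table II

sw_stat, sw_p = stats.shapiro(resp)
ks_stat, ks_p = stats.kstest((resp - resp.mean()) / resp.std(), "norm")
print(f"Shapiro-Wilk: W={sw_stat:.3f}, p={sw_p:.4f}")
print(f"Kolmogorov-Smirnov: D={ks_stat:.3f}, p={ks_p:.4f}")
# Small p-values reject normality, motivating the non-parametric tests
# above and the regression modelling that follows.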
the other hand, the short box for the Flat workload shows that the response-time values for the distinct concurrency values are without much variation. It can also be deduced that there is an obvious difference in the response times of these workloads, which is also confirmed by the Kruskal-Wallis H test results.

It is interesting to note that beyond the breaking point, the Flat workload has the highest error rate and a response time of 9500 ms, as observed from the data model presented in Figure 13.

The study of the various mean rank values returned by the Kruskal-Wallis H test on the various workload patterns indicates that the Spiky workload has a significantly higher response time than the Growing or steady-state workload of the same concurrency. Figure 13 plots the data-modelling graph, where the horizontal line represents the performance goal or QoS, and the three curves represent the results from the fitted models. Observing where these curves cross the horizontal line shows the number of users that can safely access the system in each case while still meeting the stated performance goal. The combined plot can be read as follows:

The test executions show, with 95% confidence, that between 70 and 90 concurrent users can access the system while experiencing acceptable performance. Beyond the breaking point identified (i.e. 91 transactions per second—see Figure 13), the Kubernetes cluster has to be scaled up or scaled out to meet the QoS and availability requirements.
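This reading can be reproduced with a sketch of the logistic fit behind Figure 13; the (concurrency, response-time) pairs below are illustrative stand-ins for the measured data, and the SLA crossing is located numerically:

# Sketch: fit a logistic model of response time vs. concurrency and find
# where it crosses the 100 ms SLA, as in Figure 13. Data are illustrative.
import numpy as np
from scipy.optimize import brentq, curve_fit

def logistic(x, L, k, x0):
    # Three-parameter logistic curve: ceiling L, steepness k, midpoint x0.
    return L / (1.0 + np.exp(-k * (x - x0)))

conc = np.array([13.0, 26.0, 70.0, 76.0, 81.0, 93.0, 120.0, 218.0])
resp = np.array([15.0, 20.0, 60.0, 90.0, 300.0, 2000.0, 6000.0, 9500.0])

(L, k, x0), _ = curve_fit(logistic, conc, resp, p0=[10000.0, 0.1, 90.0])

sla_ms = 100.0  # the horizontal performance-goal line in Figure 13
safe_users = brentq(lambda x: logistic(x, L, k, x0) - sla_ms, 1.0, 250.0)
print(f"max concurrency meeting the SLA: ~{safe_users:.0f} users")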
VII. CONCLUSIONS AND FUTURE WORK

As part of this research, we have studied the suitability of open-source serverless offerings for private cloud; specifically, we have evaluated the OpenFaas framework, and the modelled workload produced realistic insights into the underlying system. Our study shows that the relation between concurrency and response time follows a logistic model for all categories of workloads.

This project has leveraged the Python of-watchdog template for benchmarking a CPU-intensive workload. Future research includes the evaluation of the classic watchdog and the different modes of the of-watchdog (forking and HTTP modes) for various runtimes, to determine which has the best performance for CPU-, memory-, network-, and IO-intensive workloads. Implementing the resultant templates for specific scenarios can improve the overall performance.

The limitations of the current research fall into two broad categories: cluster size and test duration. Large-scale tests over much longer time frames (in the order of days) and with more test iterations would provide a more accurate measurement of the system response over time, as well as a more comprehensive understanding of any scalability constraints.

Another interesting avenue for future research is to study other types of serverless workload patterns, particularly those connected to divisible workloads [20] and, in general, structured parallelism [21], to determine the sequences of workload patterns causing the highest and least resource utilisation.

From an application perspective, we intend to consider mobile data networks, specifically using call data records, to design a layered strategy to cope with unexpected loads as part of mobile communications in emergency situations.

REFERENCES

[1] A. Chandrasekaran and C. Lowery, "A CIO's Guide to Serverless Computing," Gartner Research, Industry Report ID: G00465766, Apr. 2020, (Last accessed: 15/Dec/2020). [Online]. Available: https://www.gartner.com/smarterwithgartner/the-cios-guide-to-serverless-computing/
[2] D. Bartoletti et al., "Predictions 2021: Cloud Computing," Forrester, Industry Report, Oct. 2020, (Last accessed: 15/Dec/2020). [Online]. Available: https://www.forrester.com/fn/51A83KxURjmofUAEV7bCKR
[3] D. Bardsley, L. Ryan, and J. Howard, "Serverless performance and optimization strategies," in 2018 IEEE SmartCloud. New York: IEEE, Sep. 2018, pp. 19–26.
[4] G. McGrath and P. R. Brenner, "Serverless computing: Design, implementation, and performance," in 2017 IEEE ICDCSW. Atlanta: IEEE, Jun. 2017, pp. 405–410.
[5] D. Jackson and G. Clynch, "An investigation of the impact of language runtime on the performance and cost of serverless functions," in 2018 IEEE/ACM UCC. Zurich: IEEE, Dec. 2018, pp. 154–160.
[6] K. Figiela et al., "Performance evaluation of heterogeneous cloud functions," Concurrency and Computation: Practice and Experience, vol. 30, no. 23, p. e4792, 2018.
[7] M. Pawlik, K. Figiela, and M. Malawski, "Performance evaluation of parallel cloud functions," in ICPP 2018. Oregon: ACM, Aug. 2018, pp. 1–2.
[8] H. Lee, K. Satyam, and G. Fox, "Evaluation of production serverless computing environments," in 2018 IEEE CLOUD. San Francisco: IEEE, Jul. 2018, pp. 442–450.
[9] D. Pinto, J. P. Dias, and H. S. Ferreira, "Dynamic allocation of serverless functions in IoT environments," in 2018 IEEE EUC. Bucharest: IEEE, Oct. 2018, pp. 1–8.
[10] S. K. Mohanty, G. Premsankar, and M. di Francesco, "An evaluation of open source serverless computing frameworks," in 2018 IEEE CloudCom. Nicosia: IEEE, Dec. 2018, pp. 115–120.
[11] A. Palade, A. Kazmi, and S. Clarke, "An evaluation of open source serverless computing frameworks support at the edge," in 2019 IEEE SERVICES. Milan: IEEE, Jul. 2019, pp. 206–211.
[12] I. Baldini et al., "Serverless computing: Current trends and open problems," in Research Advances in Cloud Computing, Dec. 2017, ch. 1, pp. 1–20.
[13] A. Randal, "The ideal versus the real: Revisiting the history of virtual machines and containers," ACM Computing Surveys, vol. 53, no. 1, pp. 5:1–31, Feb. 2020.
[14] V. Medel et al., "Characterising resource management performance in Kubernetes," Computers & Electrical Engineering, vol. 68, pp. 286–297, 2018.
[15] A. Pereira Ferreira and R. Sinnott, "A performance evaluation of containers running on managed Kubernetes services," in 2019 IEEE CloudCom. Sydney: IEEE, Dec. 2019, pp. 199–208.
[16] H. Martins, F. Araujo, and P. da Cunha, "Benchmarking serverless computing platforms," Journal of Grid Computing, vol. 18, pp. 691–709, 2020.
[17] D. Ongaro and J. K. Ousterhout, "In search of an understandable consensus algorithm," in USENIX ATC '14. Philadelphia: USENIX Association, Jun. 2014, pp. 305–319.
[18] M. Castro and B. Liskov, "Practical Byzantine fault tolerance," in OSDI '99. New Orleans: USENIX Association, Feb. 1999, pp. 173–186.
[19] M. Shahrad et al., "Serverless in the wild: Characterizing and optimizing the serverless workload at a large cloud provider," in USENIX ATC '20. USENIX Association, Jul. 2020, pp. 205–218.
[20] H. González-Vélez and M. Cole, "Adaptive statistical scheduling of divisible workloads in heterogeneous systems," Journal of Scheduling, vol. 13, no. 4, pp. 427–441, 2010.
[21] M. Danelutto et al., "Algorithmic skeletons and parallel design patterns in mainstream parallel programming," International Journal of Parallel Programming, pp. 1–22, 2020, In Press.