0% found this document useful (0 votes)
144 views

COMP1680 Coursework 1

The document is a final report analyzing cloud computing platforms and costs for running high performance computing (HPC) workloads. It evaluates Amazon Web Services (AWS), Microsoft Azure, Google Cloud, and building an on-premise HPC cluster. AWS and Azure are good options for their ease-of-use and pay-as-you-go models. Google Cloud's pricing is less clear but machine options are well organized. An HPC cluster could process jobs faster but would require maintenance costs and hiring staff to manage it. Overall costs of each approach are analyzed to make a recommendation.

Uploaded by

Tev Wallace
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
144 views

COMP1680 Coursework 1

The document is a final report analyzing cloud computing platforms and costs for running high performance computing (HPC) workloads. It evaluates Amazon Web Services (AWS), Microsoft Azure, Google Cloud, and building an on-premise HPC cluster. AWS and Azure are good options for their ease-of-use and pay-as-you-go models. Google Cloud's pricing is less clear but machine options are well organized. An HPC cluster could process jobs faster but would require maintenance costs and hiring staff to manage it. Overall costs of each approach are analyzed to make a recommendation.

Uploaded by

Tev Wallace
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 14

COSK1003 Final Report

Tev Wallace
001179111

University of Greenwich
MSc Data Science
Word Count: 2520
Contents

1 An Introduction to Cloud Computing 3

2 Platform Analysis 5
2.1 Cloud Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 HPC Cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

3 Cost Analysis 9
3.1 AWS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.2 Azure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3 Google Cloud . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.4 Physical HPC cluster . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.4.1 Upfront Costs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
3.4.2 Costs Over Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

4 Recommendation 12

Bibliography 13

2
Chapter One

An Introduction to Cloud Computing

Cloud computing generally refers to the use of computer resources which the user connects
to via the Internet, as opposed to using the physical resources in person. Cloud computing
has grown massively in the past few years, to become a $150 billion dollar industry
(Richter, 2021). It has quickly become overtaken traditional IT infrastructure, allowing
large companies to outsource their needs to the cloud, and allowing all manners of industry
access to High Performance Computing (HPC) (Kepner, 2004). Companies such as Netflix,
Moderna, and Pinterest using AWS (Amazon, 2021b); Metro Bank, Ubisoft, and Rolls
Royce using Azure (Microsoft, 2021b); and Paypal, Twitter, and Airbus using Google
Cloud(Google, 2021b).
Although it has many applications, our company is seeking to use it to run parallel
code using OpenMP and MPI. Cloud computing allows us to use multi-core virtual
machines without upgrading our current hardware. Instead, the hardware is usually found
at data centres run by cloud computing companies, being up to 7.2 million square feet in
size (Switch, 2021). These data centres contain a massive number of computing nodes,
being used by thousands of individuals and companies at once. These virtual machines
can be accessed easily through specific software, or often even through a web browser. As
opposed to paying for the hardware itself, we would be paying for the amount of time we
used the virtual machines for, as well as what type of machines were used.
Part of the reason that HPC on the cloud has become so popular is due to how
cost-effective it can be, especially for smaller companies. Due to economies of scale, it
becomes much cheaper per machine to run an HPC cluster the more machines you have
(Harms and Yamartino, 2010). These companies also benefit from the machines almost
always working at full capacity, meaning that they’re losing very little efficiency from any
CPUs being idle. Due to the breadth of their customers, the overall variability over time

3
of how much computing power is needed is much smaller, meaning that the companies do
not have to plan for as large ’surges’ in use. These factors allows these large data centres
to run their machines at a much lower cost than if a company were to run their own,
much smaller cluster, and therefore at a lower cost to the customer. Although there are
still arguments for smaller, ’community’ clusters (Carlyle et al., 2010), the general trend
has moved to hybrid or full cloud integration (Netto et al., 2018).

4
Chapter Two

Platform Analysis

2.1 Cloud Services


There are several different platforms and companies that offer cloud computing services,
so we have decided to analyse the top 3 companies by market share, Amazon Web Services
(AWS), Microsoft Azure, and Google Cloud, as can be seen in 2.1. All 3 services work
quite similarly, allowing the user to hire virtual machines for a set cost per hour of use.
We could quite easily hire an individual virtual machine for every consultant.
Our consultants will on average use 800 CPU hours per month. If we were to give
each consultant access to an 8 core virtual machine, this work will take 100 hours total,
or an average of 5 hours a day in a 20 day working month. One of the useful aspects of
these virtual machines is that they can be run overnight without any maintenance on our
end, allowing more complex analysis to be run for longer hours. 32GB of RAM will cover
most datasets that our consultants are likely to come across, and 32GB of storage will
also allow these datasets and models to be stored easily.
One issue of hiring these virtual machines, is that if any consultant needs a larger
amount of memory, or more processors, then they will have to go through the head of IT
to get this virtual machine set up, as these machines are being hired out as a company,
as opposed to on a per person basis. This can be an issue when dealing with big data,
or large simulations. This is usually controlled through a resource manager of some
description, for example Azure Resource Manager (Tfitzmac, 2021) or AWS Resource
Access Manager (Kwok, 2021).
On the plus side, by using pay as you go, if there are any months where consultants
use less or more hours on their machines, our costs simply scale dynamically with the
amount of time used. This is instrumental at keeping our costs down, as our consultants

5
do not need access to these machines full time, which would massively increase our costs.
There is very little difference in terms of the hardware used by the three companies,
so the differences tend to be in their approach to being a cloud host.
Azure seems to be made for businesses and consultants with some expertise in what
they’re looking for. Several different types of machines are on offer, often with the
hardware seeming to be the same, when the machines in reality have slightly different
specifications for their specific purposes. This can make pricing difficult to understand at
times, as a user with little experience may accidentally select a machine with the correct
resources, but a higher cost per hour.
AWS on the other hand is much more user-friendly, allowing easy setup of virtual
machines for the non-expert, only having to specify the number of cores and RAM,
with AWS selecting the best value machine for your money. This is ideal for managing
resources whilst keeping costs low, but can lead to some resources not being cut out for
what consultants need.
Google’s pricing system is not very user friendly, only showing the total cost of all
instances, making it hard to determine the per hour cost of each machine. Their is a
much more limited range of machine options, but they are also far better organised than
Azure’s lists, allowing easy selection of the right machine for the job.

2.2 HPC Cluster


Our other option as a company is to build our own HPC cluster for our dedicated use.
Ideally, our HPC cluster should be able to handle all of our consultants working at once,
to a similar capacity as our cloud options. This means we need 8 cores per consultant,
for a total of 400 cores, as well as at least 32GB of RAM per consultant, and 32GB of
storage per consultant. This will theoretically allow all of our consultants to be using the
cluster at the exact same time.
Using a physical cluster will work slightly differently from using virtual machines.
Most HPC clusters use a job queue system, for example the University of Greenwich HPC
(University of Greenwich, 2021). Each consultant would submit their job to the queue,
with which a master node would split these jobs between the available nodes and cores.
This has some benefits and detriments of its own.
One benefit is that it becomes much easier to allocate a different number of cores
and memory for each job, as these can be specified when the job is sent in, and if the
cores are available, then the job can be sent to these cores. This is especially useful when
consultants need a quick turnaround, so can utilise many cores at once.

6
This cluster would also be just for our own use, which means that we would never
have to worry about our workloads being throttled down, as happens on some low priority
virtual machines from the cloud. This will allow us to have all programs run at the
maxmimum speed possible, increasing efficiency.
Unfortunately, as we are also in charge of the physical hardware, we will also be
responsible for maintaining it. This comes in the form of maintenance costs if any
hardware breaks, in which case new parts will need to be ordered in, and most likely we
will have to pay for the repair.
HPC clusters also come with a high power consumption, especially as they are often
on for long periods of time. Our electricity bill will increase accordingly, although a
calculation of cost is included in the analysis below.
Finally, if we were to buy a cluster, we would need to hire a new employee to manage
the machine, known as an HPC architect. An HPC architect would be responsible for the
general upkeep of the machine, as well as the hardware and software knowledge necessary
for the machine to work.

2.3 Comparison
Using a cloud service comes with the ease of not having to worry about maintaining
hardware, and the more fine details of how an HPC cluster works. The only knowledge
needed would be how to log on to a virtual machine, as well as having individuals in IT
with knowledge of cloud resource management. Owning a physical cluster does not have
these benefits, requiring a new employee, as well as the associated costs of electricity,
repair and maintenance.
However, the physical cluster will allow much more flexible scalability of jobs, allowing
consultants to request the resources they need without having to go through a resource
manager. The cluster will be ours to control, and we would not be forced to slow down
our programs for any reason.

7
Figure 2.1: A chart showing the share of revenue from Q1 2021 in the Cloud Market, by
Richter (2021)

8
Chapter Three

Cost Analysis

Below is an in-depth analysis of the costs associated with each platform. Table 3.1
containing the per month, 1 year and 5 year costs can be found at the end of the chapter
for ease of reading.

3.1 AWS
We used the AWS price calculator (Amazon, 2021a), selecting an EC2 machine in London.
Using Linux as an operating system, with 8 vCPUs and 32GiB of RAM selects the
t4g.2xlarge as the type of virtual machine to be used, which has an hourly cost of $0.3008.
With 50 instances at 100 hours a month, this works out to $1504 per month. We also add
on 32GB of SSD storage to each instance, for $185.60 a month. This totals to $1689.60 a
month, or £1266.19 a month (as of 14:09, 01/12/2021).

3.2 Azure
We used the Azure price calculator (Microsoft, 2021a), selecting a virtual machine in UK
South. We choose a Linux operating system, either Ubuntu or CentOS, selecting General
Purpose Machines. We can then select the B8ms series, which have 8 cores, 32GB of
RAM, and have an hourly cost of $0.378. With 50 instances at 100 hours a month, this
comes to $1890 per month. We also add 50 32GB E4 SSD’s each costing $2.64 per month,
so $132 a month. There is also a $0.20 a month charge for storage transactions. This
totals to $2022.20 per month, or £1515.49 a month (as of 14:15, 01/12/2021).

9
3.3 Google Cloud
We used the Google Cloud price calculator (Google, 2021a), selecting our machines to
be in London (europe-west-2). We choose a free OS, a general purpose machine, and
then choose e2-standard-8 machines, each with 8 vCPUs and 32GB of RAM. We choose
an SSD persistent disk with a capacity of 32GB. We select 50 instances for 100 hours a
month each. This totals to £1540.04 per month (as of 14:20, 01/12/2021).

3.4 Physical HPC cluster


3.4.1 Upfront Costs
Buying an HPC is a slightly more complicated process, and requires a few more steps.
Luckily, companies exist that sell nodes of an HPC with all the components already
installed. A 4 node server from Scan.co.uk (2021) comes with 2 AMD EPYC 7252
processors per node, each containing 8 cores running at 3.1GHz. Each node also comes
with 64GB of 2933MHz RAM, and a 250GB SSD. This means each node can handle 2
consultants each, with 8 cores and 32GB of RAM, with ample storage space as well. To
cover all 50 consultants, we need a total of 25 nodes. As these nodes come in batches of
4, we can buy 7 multi-node servers, for a total of 28 nodes. One of these nodes can serve
as a master node, allowing some tasks to run on more cores than planned, or allowing for
additional consultants to use the machine if they are hired. Each of these nodes costs
£12913.36, coming to £90393.52 for the 7 servers.
We also need a server rack to hold the individual servers in. Each of the specified
servers has a size of 2U, totalling to 14U worth of space. A 24U server rack from Orion
(2021) costs £538, bringing our total upfront cost to £90931.52.
Installation itself will cost money, as well as any additional equipment needed. Hope-
fully this can be covered by 5% of the upfront cost, so £4546.58. This brings the purchase
and installation cost to £95478.10.

3.4.2 Costs Over Time


However, a physical HPC cluster will also have additional costs over time, including
electricity, repairs, as well as hiring an HPC architect to maintain the cluster.
Each server comes with a 2200W power supply. The total wattage of all 7 servers will
come to 15400W. Assuming a best case scenario where the HPC only needs to be run
for 8 hours a day every working day, this comes to 160 hours of electricity per month.

10
According to Energy Saving Trust (2021), the average cost per kWh for electricity is 20.06
pence. Our cluster will use 2464kWh per month, for a total of £494.28 a month. This is
a very rough estimate, as the cluster will most likely be left on for longer some days, but
also will not draw the full 2200W at all times.
Maintenance costs for a year will vary and are hard to accurately guess, so we will
use an estimate of 5% of the original cost of the cluster, so £4546.48. This works out to
£378.87 a month.
As this is a relatively small cluster, a single HPC architect should be sufficient to run
the entire system. According to (Glassdoor.co.uk, 2021), the average HPC architect in
the UK makes £58646 per year, or £4887.17 per month.
This brings our total monthly cost to £5760.32 a month

Table 3.1: Costs of each service over time


Service Per Month 1 Year Cost 5 Year Cost
AWS £1266.19 £15194.28 £75971.40
Azure £1515.49 £18185.88 £90929.40
Google Cloud £1540.04 £18480.48 £92402.40
HPC Cluster £5760.32 £164601.94 £441097.30

11
Chapter Four

Recommendation

The clear choice for both finances and ease of use is to pick a cloud service. The sheer cost
of buying the hardware as well as hiring an HPC architect is not worth it for the number
of core hours that our consultants use. An HPC cluster will cost us almost four times as
much money compared to a cloud service in the long run, with little benefit. The only
noticeable benefits that a cluster would give us is making it easier to change resources,
but even this is quite easy with cloud services due to the resource managers available. If
our numbers of consultants increase over the next few years, or our consultants begin
requiring more core hours on average, it may be benificial to once again look into acquiring
an HPC cluster, or possibly looking into a hybrid method of using both a small cluster
and cloud services alongside each other. For now, due to having to maintain the cluster,
hire someone to run it, and powering it, using a cloud service is the superior option for
our company.
Amazon Web Services is the obvious choice cost wise, saving 300 pounds per month.
Amazons interface also makes it far easier to select resources to add, making it far easier
for the resource manager to scale up resources by simply selecting the number of cores
and RAM necessary. Because of this, AWS is the clear choice for our cloud computing
needs, easily allowing our consultants to complete their 800 CPU hours of work on their
8 core machines, with larger core machines available if required.

12
Bibliography

Amazon. Amazon web services pricing calculator, Dec 2021a. URL https://calculator.
aws/#/createCalculator/EC2.

Amazon. Leading cloud innovators, Dec 2021b. URL https://aws.amazon.com/


solutions/case-studies/innovators/.

Adam G. Carlyle, Stephen L. Harrell, and Preston M. Smith. Cost-effective hpc: The
community or the cloud? In 2010 IEEE Second International Conference on Cloud
Computing Technology and Science, pages 169–176, 2010. doi: 10.1109/CloudCom.2010.
115.

Energy Saving Trust. Our data, Nov 2021. URL https://energysavingtrust.org.uk/


about-us/our-data/.

Glassdoor.co.uk. Hpc architect salaries, Dec 2021. URL https://www.glassdoor.


co.uk/Salaries/uk-hpc-architect-salary-SRCH_IL.0,2_IN2_KO3,16.htm?
clickSource=searchBtn.

Google. Google cloud pricing calculator, Dec 2021a. URL https://cloud.google.com/


products/calculator.

Google. Customers | google cloud, Dec 2021b. URL https://cloud.google.com/


customers.

Rolf Harms and Michael Yamartino. The Economics of the Cloud. White pa-
per, Microsoft, Nov 2010. URL https://news.microsoft.com/download/archived/
presskits/cloud/docs/The-Economics-of-the-Cloud.pdf.

Jeremy Kepner. Hpc productivity: An overarching view. The International Journal


of High Performance Computing Applications, 18(4):393–397, 2004. doi: 10.1177/
1094342004048533. URL https://doi.org/10.1177/1094342004048533.

13
Man-Ho Kwok. Aws resource access manager, Dec 2021. URL https://aws.amazon.
com/ram/.

Microsoft. Azure pricing calculator, Dec 2021a. URL https://azure.microsoft.com/


en-gb/pricing/calculator/.

Microsoft. Customer and partner success stories: Microsoft azure, Dec 2021b. URL
https://azure.microsoft.com/en-gb/resources/customer-stories/.

Marco A. S. Netto, Rodrigo N. Calheiros, Eduardo R. Rodrigues, Renato L. F. Cunha,


and Rajkumar Buyya. Hpc cloud for scientific and business applications: Taxonomy,
vision, and research challenges. ACM Comput. Surv., 51(1), jan 2018. ISSN 0360-0300.
doi: 10.1145/3150224. URL https://doi.org/10.1145/3150224.

Orion. 24u premier server rack 600 x 1000, Dec 2021. URL
https://www.rackcabinets.co.uk/collections/24u-racks/products/
24u-premier-server-rack-600-x-1000.

Felix Richter. Infographic: Amazon leads $150-billion cloud mar-


ket, Jul 2021. URL https://www.statista.com/chart/18819/
worldwide-market-share-of-leading-cloud-infrastructure-service-providers/.

Scan.co.uk. 3xs hpc 2u32e8, Dec 2021. URL https://www.scan.co.uk/3xs/


configurator/amd-epyc-7003-hpc-server.

Switch. Citadel, Dec 2021. URL https://www.switch.com/the-citadel/.

Tfitzmac. Azure resource manager overview - azure resource manager, Dec


2021. URL https://docs.microsoft.com/en-us/azure/azure-resource-manager/
management/overview.

University of Greenwich. High performance computing (hpc), Dec 2021. URL https:
//www.gre.ac.uk/it-and-library/computing/hpc.

14

You might also like