WEKA on Azure

Performance
Benchmark
March 2023
White Paper

Executive Summary

The WEKA Data Platform is available on Azure, and customers can now deploy WEKA through the Azure
Marketplace. WEKA is also eligible for the Microsoft Azure Consumption Commitment (MACC) program, so
WEKA on Azure deployments count toward a customer’s MACC. We recently completed an internal benchmark
test of WEKA running on Azure and compared it with publicly available data on Azure NetApp Files (ANF).
At a high level, we found that WEKA outperforms ANF by a factor of 6x (8k random write IOPs) to 30x
(64k sequential read GBps) and is approximately 70% more affordable. The primary drivers of the WEKA
cost savings are the need for far less NVMe flash storage to deliver a comparable or better performance
level than ANF (5TB vs 55TB) and the ability to expand the WEKA namespace to Azure Blob object storage
in a non-disruptive manner, for massive capacity at a better price.


Contents

Introduction
Configurations
Methodology
Results
    WEKA on Azure (5TB NVMe and 50TB Azure Blob) versus Azure NetApp Files (55TB NVMe)
    WEKA on Azure (55TB NVMe) versus Azure NetApp Files (55TB NVMe)
    A Note on WEKA Solution Costs
Conclusion


Introduction:
In the Microsoft Azure ecosystem, there are many file storage options available. The first party
services are Azure Files, Azure NetApp Files, and Avere FXT. In the Azure Marketplace, Qumulo is also
available as a preview. The most prevalent “high performance” file system in the Azure ecosystem is
Azure NetApp Files (ANF): it has been shown to be higher performing than Azure Files, has a broader
range of features, and can be deployed as a first party service.

Recently, WEKA has brought our industry-leading high-performance, highly scalable filesystem to Azure.
With this offering, customers have the ability to accelerate their next-generation workloads in the
cloud. Azure customers can now bring demanding workloads that rely on high-performance compute and
machine learning to the cloud. Customers across a diverse set of industries, including electronic
design (EDA), media and entertainment (visual effect rendering, post-production, color correction, and
studio in the cloud), life sciences (drug discovery, genomics processing), machine learning (natural
language processing, ML inference, autonomous vehicle development), financial services (fraud
detection, back-testing, analytics), oil and gas exploration, and high-performance data analytics, can
now accelerate more of their workloads with the industry’s fastest and most scalable file system in the
cloud.

To understand how WEKA performs on Azure, we tested a minimum configuration of a WEKA on Azure
deployment and compared it against the publicly published results for ANF.


Configurations:
The ANF environment, as described in the published benchmark, was configured as a 50TiB (55TB) capacity
pool at the premium service level. The ANF premium service level defines up to 64MiB/sec per 1TiB of
capacity, resulting in a maximum of 3200MiB/sec of attainable performance. The test had a 48TiB volume
provisioned within that capacity, which reduces the maximum available throughput to 3072MiB/sec. The
connected client was a single D32s_v3 VM. The protocol used to mount the volume on the VM is not
published, but we believe it was NFS, based on the overall performance of the configuration and
information provided in the benchmark.

We tested a WEKA environment consisting of a cluster of 6 L8s_v3 VMs in Azure, each having access to
1.92TB of NVMe storage. The raw capacity of all NVMe devices in the cluster was therefore 11.52TB, with
a total WEKA capacity of 5.18TB (4.71TiB) available to the client and presented as a single filesystem.
Each VM has 8 vCPUs (4 cores, hyperthreaded) and up to 4 NICs with an aggregate VM bandwidth of
12.5Gb/s. Networking used a standard MTU size of 1500. WekaFS was installed on top of the 6 L8s_v3 VMs
to form a WEKA cluster. A single D32s_v3 VM client was used, and the storage was mounted using the WEKA
POSIX client with 2 CPU cores assigned to power the client in DPDK mode. To support a 55TB (50TiB) total
capacity pool, the WEKA namespace can be extended to Azure Blob storage for massive capacity at a low
cost, which we address in the cost comparison section of this document.

As a true apples-to-apples comparison, we also modeled the performance of a 55TB NVMe deployment of
WEKA on Azure to compare against the all-flash ANF benchmark. This configuration consisted of an 18 VM
cluster using L16s_v3 Azure VMs for the WEKA cluster (with no Azure Blob storage). This comparison
allows us to examine the total solution performance and cost for all-flash deployments of both ANF and
WEKA on Azure.
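
The capacity and throughput figures quoted in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are our own, and it assumes that the ANF premium throughput ceiling scales linearly with provisioned TiB, as the text describes.

```python
# Illustrative arithmetic for the configuration figures quoted above.
# All names here are hypothetical; the input numbers come from the text.

ANF_PREMIUM_MIB_PER_SEC_PER_TIB = 64  # premium service level, per provisioned TiB


def anf_max_throughput_mib_s(capacity_tib):
    """Throughput ceiling for a given provisioned capacity (assumed linear)."""
    return capacity_tib * ANF_PREMIUM_MIB_PER_SEC_PER_TIB


def tib_to_tb(tib):
    """Convert binary tebibytes to decimal terabytes."""
    return tib * 2**40 / 1e12


pool_limit = anf_max_throughput_mib_s(50)    # 50TiB capacity pool
volume_limit = anf_max_throughput_mib_s(48)  # 48TiB volume used in the test

weka_raw_tb = 6 * 1.92                 # six L8s_v3 VMs, 1.92TB NVMe each
usable_fraction = 5.18 / weka_raw_tb   # usable capacity as a share of raw

print(pool_limit, volume_limit)                      # MiB/sec ceilings
print(round(weka_raw_tb, 2), round(usable_fraction, 2))
print(round(tib_to_tb(50)), round(tib_to_tb(4.71), 2))
```

The usable share of roughly 45% of raw NVMe capacity is simply the ratio of the two figures given in the text; the paper does not break down where the remainder goes.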

Methodology:
We ran a set of FIO tests as described in the published ANF benchmark. These consisted of a 4-corners
set of tests that evaluated IO patterns for reads and writes, in both random small-block and sequential
larger-block variations. 8k IO size tests are good for measuring the number of IOPs a system can
generate, while 64k IO size tests are good for assessing the maximum throughput a system can deliver.

The following commands were issued using FIO to generate the IO.

• 8k 100% random write

fio --name=8krandomwrites --rw=randwrite --direct=1 --ioengine=libaio --bs=8k --numjobs=4 --iodepth=128 --size=4G --runtime=600 --group_reporting

• 64k 100% sequential write

fio --name=64kseqwrites --rw=write --direct=1 --ioengine=libaio --bs=64k --numjobs=4 --iodepth=128 --size=4G --runtime=600 --group_reporting

• 8k 100% random read

fio --name=8krandomreads --rw=randread --direct=1 --ioengine=libaio --bs=8k --numjobs=4 --iodepth=128 --size=4G --runtime=600 --group_reporting

• 64k 100% sequential read

fio --name=64kseqreads --rw=read --direct=1 --ioengine=libaio --bs=64k --numjobs=4 --iodepth=128 --size=4G --runtime=600 --group_reporting
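
The four commands differ only in job name, IO pattern, and block size, so the 4-corners matrix can be expressed compactly. The following is a minimal sketch that rebuilds the same command lines; the helper name `fio_command` and the `FOUR_CORNERS` table are our own, it only prints the commands rather than running them, and actually executing them would additionally require fio to be installed and the working directory to be on the filesystem under test (the paper does not state a target directory).

```python
# Rebuild the four fio command lines listed above from a small test matrix.
# fio_command and FOUR_CORNERS are hypothetical names used for illustration.

def fio_command(name, rw, block_size):
    """Return the fio argument list for one corner of the test matrix."""
    return [
        "fio", f"--name={name}", f"--rw={rw}", "--direct=1",
        "--ioengine=libaio", f"--bs={block_size}", "--numjobs=4",
        "--iodepth=128", "--size=4G", "--runtime=600", "--group_reporting",
    ]


# (job name, fio --rw pattern, block size)
FOUR_CORNERS = [
    ("8krandomwrites", "randwrite", "8k"),
    ("64kseqwrites", "write", "64k"),
    ("8krandomreads", "randread", "8k"),
    ("64kseqreads", "read", "64k"),
]

commands = [fio_command(*corner) for corner in FOUR_CORNERS]

# With 4 jobs each allowed 128 queued IOs, up to 512 IOs can be in flight.
max_outstanding_ios = 4 * 128

for argv in commands:
    print(" ".join(argv))
```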


Results:
There are a few ways to assess the relative performance and value of WEKA on Azure versus Azure NetApp Files. First,
we’ll compare the relative performance and cost of the WEKA on Azure test configuration (5TB NVMe cluster and 50TB
Azure Blob storage) with publicly available ANF performance and cost information. Second, we’ll compare an all-flash
deployment of WEKA on Azure (using 55TB NVMe and no Azure Blob storage) with the ANF deployment.

WEKA on Azure (5TB NVMe and 50TB Azure Blob) versus Azure NetApp Files (55TB NVMe)
On a per TB basis, WEKA outperforms ANF on every metric (Table 1). From an IOPs perspective, WEKA shines
on read IOPs/TB, outperforming ANF by a factor of 34x. For bandwidth, WEKA is strong on both read (21x
faster) and write (10x faster) MBps/TB. ANF did not share latency data, but we found that the WEKA test
configuration never exceeded 300 microseconds of latency.

Table 1: Performance per TB of provisioned storage [1]

                             Azure NetApp Files    WEKA on Azure               WEKA
                             (55TB flash)          (5TB NVMe + 50TB Blob)      Advantage

8k 100% random read          1,310 IOPs/TB         44,750 IOPs/TB              34x
8k 100% random write         1,380 IOPs/TB         9,660 IOPs/TB               7x
64k 100% sequential read     26.1 MBps/TB          561 MBps/TB                 21x
64k 100% sequential write    26.9 MBps/TB          282 MBps/TB                 10x
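
The "WEKA Advantage" column is the ratio of the WEKA figure to the ANF figure for each test, rounded to a whole number. A short sketch of that calculation, using only the values in Table 1 (the dictionary layout and names are our own):

```python
# Recompute the "WEKA Advantage" column of Table 1 as WEKA / ANF per test.
# TABLE_1 is a hypothetical structure holding the published per-TB values.

TABLE_1 = {
    "8k random read (IOPs/TB)":       (1310, 44750),
    "8k random write (IOPs/TB)":      (1380, 9660),
    "64k sequential read (MBps/TB)":  (26.1, 561),
    "64k sequential write (MBps/TB)": (26.9, 282),
}

advantages = {test: weka / anf for test, (anf, weka) in TABLE_1.items()}

for test, ratio in advantages.items():
    print(f"{test}: {ratio:.1f}x (reported as {round(ratio)}x)")
```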

Recall that the original WEKA on Azure test was conducted on a minimum-configuration WEKA cluster
consisting of 6 L8s_v3 Azure VMs. One way to think about the comparison is that the ANF test ran on a
total flash storage capacity of 55TB, while the WEKA configuration consisted of 5TB of flash storage,
with the single WEKA namespace then extended to an additional 50TB of Azure Blob storage. This is
particularly helpful when considering the TCO for these solutions.

To compare the total solution cost for WEKA on Azure versus ANF, we used publicly available list pricing for both
solutions. Table 2 compares the total solution cost for the WEKA on Azure deployment of 5TB NVMe flash with 50TB
Azure Blob storage versus the listed ANF configuration. The total solution cost for 55TB (50TiB) of Azure NetApp
Files comes to $15,062 per month, or $273 per TB per month, using the Azure NetApp Files pricing calculator.

The total solution cost for the minimum-configuration WEKA deployment used in the benchmark test (6 Azure
L8s_v3 VMs and 50TB Azure Blob storage) came to $4,683 per month, or $85 per TB per month, which is about 70%
less than the ANF configuration. So, the minimum configuration of WEKA on Azure outperforms the available ANF
benchmark by a factor of 7x to 34x on a per TB basis, at about 70% less cost. The primary drivers of the cost
savings are the need for less NVMe flash storage to deliver comparable performance and the ability to expand the
WEKA namespace with Azure Blob object storage for massive capacity at a better price.


Table 2: Comparing Total Solution Costs for ANF versus WEKA on Azure [2]

                   ANF 55TB flash    WEKA 5TB flash + 50TB object

Cost/month         $15,062           $4,683
Cost/TB/month      $273              $85
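
The ANF figures in Table 2 can be reproduced from the premium-tier list price given in the price notes at the end of this paper. The sketch below is an illustration under stated assumptions: it takes the WEKA monthly total from Table 2 as given rather than deriving it, treats the pool as exactly 50TiB and 55TB, and uses variable names of our own choosing.

```python
# Reproduce the ANF cost figures and the savings percentage from Table 2.
# The WEKA monthly total is taken from the table as-is, not derived here.

ANF_PREMIUM_USD_PER_GIB_MONTH = 0.29419  # list price from the price notes
ANF_POOL_GIB = 50 * 1024                 # 50TiB capacity pool
CAPACITY_TB = 55                         # both solutions sized at 55TB

anf_monthly = ANF_PREMIUM_USD_PER_GIB_MONTH * ANF_POOL_GIB
weka_monthly = 4683                      # Table 2 value

anf_per_tb = anf_monthly / CAPACITY_TB
weka_per_tb = weka_monthly / CAPACITY_TB
savings = 1 - weka_monthly / anf_monthly

print(f"ANF:  ${anf_monthly:,.0f}/month, ${anf_per_tb:.0f}/TB/month")
print(f"WEKA: ${weka_monthly:,.0f}/month, ${weka_per_tb:.0f}/TB/month")
print(f"WEKA costs {savings:.0%} less than ANF")
```

The computed savings come out just under 70%, and the per TB values land within a dollar of the table, which appears to truncate rather than round.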

WEKA on Azure (55TB NVMe) versus Azure NetApp Files (55TB NVMe)
Our second comparison focused on similar configurations of 55TB all-flash systems for WEKA and ANF. For the
WEKA environment, this would require an 18 VM cluster using L16s_v3 Azure instances (and no Azure Blob storage).
WEKA again outperforms ANF by a wide margin on every metric. Table 3 below provides an “apples-to-apples”
performance comparison of the two.

Table 3: Total Solution Performance Comparison [3]

                             Azure NetApp Files    WEKA on Azure     WEKA
                             (55TB NVMe)           (55TB NVMe)       Advantage

8k 100% random read          73,000 IOPs           414,000 IOPs      6x
8k 100% random write         69,000 IOPs           437,000 IOPs      6x
64k 100% sequential read     1.3 GB/sec            39.8 GB/sec       30x
64k 100% sequential write    1.3 GB/sec            20 GB/sec         15x
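
To relate the IOPs rows to the bandwidth rows, it can help to convert 8k IOPs into the data rate they imply. The sketch below does that and also recomputes the advantage ratios to one decimal place. It assumes that fio's `8k` block size means 8192 bytes and that GB means 10^9 bytes; the function name is our own.

```python
# Convert the 8k IOPs figures in Table 3 into implied bandwidth, and
# recompute the WEKA/ANF ratios. Assumes 8k = 8192 bytes and GB = 1e9 bytes.

def iops_to_gb_per_sec(iops, block_kib):
    """Data rate implied by a given IOPs figure at a fixed block size."""
    return iops * block_kib * 1024 / 1e9


weka_8k_read_gb_s = iops_to_gb_per_sec(414_000, 8)
anf_8k_read_gb_s = iops_to_gb_per_sec(73_000, 8)

# (ANF value, WEKA value) for each row of Table 3
TABLE_3 = [(73_000, 414_000), (69_000, 437_000), (1.3, 39.8), (1.3, 20)]
ratios = [round(weka / anf, 1) for anf, weka in TABLE_3]

print(f"WEKA 8k random read is about {weka_8k_read_gb_s:.2f} GB/sec of data")
print(f"ANF 8k random read is about {anf_8k_read_gb_s:.2f} GB/sec of data")
print(ratios)
```

At 8k, even 414,000 IOPs moves only a few GB/sec, well below the 39.8 GB/sec sequential read figure, which is why the small-block tests are read as an IOPs measure and the 64k tests as a throughput measure.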

When we look at the solution cost for 55TB all-flash configurations, the ANF configuration and pricing remain the
same. The WEKA solution cost (using 18 L16s_v3 Azure VMs for the WEKA cluster, priced with a 1-yr reserved
savings plan) comes to $15,239 per month, or $277 per TB per month, which is a comparable total solution cost,
even though this configuration significantly outperforms ANF on an absolute basis.

Table 4: Total Solution Costs for all-flash 55TB Configurations [4]

                   ANF         WEKA

Cost/month         $15,062     $15,239
Cost/TB/month      $273        $277


A Note on WEKA Solution Costs


The solution cost for WEKA on Azure has two components: the WEKA software licenses, and the cost of the Azure
infrastructure (compute VMs and Blob storage) to support the WEKA environment. The cost of the WEKA license is
typically 10% to 20% of the total solution cost (about 23% in the all-flash example that follows), so a major
consideration in the total solution cost for a WEKA deployment is the choice of underlying infrastructure. For
example, a 55TB all-flash WEKA environment running on Azure would consist of 18 L16s_v3 Azure VMs. The total
solution cost for this configuration was $15,239/mo, which breaks down as $11,706/mo for the Azure VMs and
$3,533/mo for the WEKA software.
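
The breakdown above can be checked directly. This short sketch uses only the three dollar figures and the VM count from the paragraph; the variable names are our own, and the per-VM figure is simply the infrastructure cost divided evenly across the cluster.

```python
# Check the all-flash cost breakdown: infrastructure plus software licenses.

azure_vms_monthly = 11_706      # 18 L16s_v3 VMs
weka_software_monthly = 3_533   # WEKA licenses
vm_count = 18

total_monthly = azure_vms_monthly + weka_software_monthly
license_share = weka_software_monthly / total_monthly
cost_per_vm = azure_vms_monthly / vm_count

print(f"Total: ${total_monthly:,}/mo")
print(f"WEKA license share of total: {license_share:.0%}")
print(f"Implied infrastructure cost per VM: ${cost_per_vm:,.2f}/mo")
```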

Conclusion:
The WEKA Data Platform, when running in Azure, outperforms Azure NetApp Files on a per provisioned TB
basis across tests for IOPs and throughput, while keeping latency under 300 microseconds. If we scale
up the flash capacity, performance will improve significantly as well. The WEKA solution can deliver
this level of performance at about 70% less cost ($273/TB for ANF versus $85/TB for WEKA).

We compared the publicly available performance data for Azure NetApp Files with the performance of a
minimal 6 VM configuration of WEKA running on Azure. We found that on a per TB basis, WEKA outperforms
ANF by a factor of 7x (8k random writes) to 34x (8k random reads). When we compared the total solution
costs for the tested configurations, we found that on both an absolute and a per TB basis, WEKA is
about 70% more affordable.

The cost comparison between WEKA on Azure and ANF comes down to two variables for the WEKA environment:
the amount of capacity in Azure Blob object storage and the choice of Azure virtual machines to support
the WEKA cluster for a given workload. The cost of the WEKA licensing remains constant on a per TB
basis.

WEKA customers can reduce their total solution cost by combining a modest-size flash tier on Azure VMs
for performance with a large capacity tier in Azure Blob storage. In this example, deploying a 5TB
flash tier and 50TB of Azure Blob in a single WEKA namespace costs a fraction of ANF on both an
absolute basis ($15k/mo for ANF versus $4.7k/mo for WEKA) and a per TB basis ($273/TB/mo for ANF versus
$85/TB/mo for WEKA). Further, WEKA provides the ability to non-disruptively scale the performance and
capacity tiers independently. This gives customers the flexibility to select the Azure VM instance and
pricing terms that best fit their application profile.

The Results Are Clear


If you need the highest performing storage in Azure or need
to scale to exabytes while maintaining good cost controls,
the WEKA Data Platform in Azure is the best choice.


Notes and sources of price data (all prices shown are list prices as of Feb 2023):

• Azure NetApp Files Pricing

− Premium Tier: $0.29419/GiB/mo

• Azure VM pricing

− L8s_v3 VM pricing (3-yr savings plan): $256.94/mo (as of 2/14/2023)

− L16s_v3 VM pricing (3-yr savings plan): $513.87/mo

• Azure Blob storage pricing

− US East 1: $0.26/GB/mo (prices vary by Azure Region)

− For tiering: WEKA recommends the Azure Blob Hot performance profile with ZRS redundancy.

• WEKA licenses

− WEKA NVMe license: $41.70/TB/mo

− WEKA Object license: $2.50/TB/mo

− Data transfer costs: $25-100/month. Because of how WEKA creates objects in Blob, we estimate that with
a change rate of 10% of the data per month going into the Blob tier, operations and data transfer costs
would be between $25 and $100 per month.

[1] Performance metrics provided are based on internal WEKA testing. Performance for a specific customer deployment will vary depending on specific workload and infrastructure conditions in a given customer environment.
[2] Solution costs provided in this paper are for illustrative purposes only and should not be considered a price for a new deployment. Costs are based on published pricing data as of Feb 2023.
[3] Total Solution performance numbers are estimated based on the observed scaling of WEKA on Azure during our internal testing.
[4] Solution costs provided in this paper are for illustrative purposes only and should not be considered a price for a new deployment. Costs are based on published pricing data as of Feb 2023.

weka.io 844.392.0665

© 2019-2022 All rights reserved. WekaIO, WekaFS, WIN, Weka Innovation Network, the Weka brand mark, the Weka logo, and Radically Simple Storage are trademarks of WekaIO, Inc. and its affiliates in the United
States and/or other countries. Other trademarks are the property of their respective companies. References in this publication to WekaIO’s products, programs, or services do not imply that WekaIO intends to make
these available in all countries in which it operates. Product specifications provided are sample specifications and do not constitute a warranty. Information is true as of the date of publication and is subject to change.
Actual specifications for unique part numbers may vary. WKA345-01 03/2023
