TECHNOLOGY DETAIL
Performance and Sizing Guide: Red Hat Gluster Storage on QCT servers

QCT (Quanta Cloud Technology) offers a family of servers for building different types of scale-out storage clusters based on Red Hat Gluster Storage, each optimized to suit different workload and budgetary needs. Red Hat Gluster Storage offers a range of distributed file storage solutions, supporting both standard and dense storage server configurations.

ABSTRACT

As a software-defined scale-out storage solution, Red Hat® Gluster Storage has emerged as a
compelling platform for distributed file services in the enterprise. Those deploying Gluster can
benefit from having simple cluster configurations, optimized for different file service loads. For
example, workloads based on small files can often benefit from different kinds of underlying
hardware than workloads based on large files. To address the need for performance and sizing
guidance, Red Hat and QCT (Quanta Cloud Technology) have performed extensive testing to
characterize optimized configurations for deploying Red Hat Gluster Storage on several QCT
servers.

TABLE OF CONTENTS

1 INTRODUCTION
2 WORKLOAD-OPTIMIZED DISTRIBUTED FILE SYSTEM CLUSTERS
3 REFERENCE ARCHITECTURE ELEMENTS
  Red Hat Gluster Storage
  QCT servers for Gluster
6 TESTED CONFIGURATIONS
  Testing approach
  QuantaGrid D51PH-1ULH configuration
  QuantaPlex T21P-4U configuration
  Software configuration
7 PERFORMANCE SUMMARY
  Jumbo files: Designing for optimal throughput
  Small and medium files: Designing for optimal file operations per second

Note: The recommendations in this guide pertain to Red Hat Gluster Storage release 3.1.2.
Future enhancements may alter performance and corresponding recommendations.
INTRODUCTION
With the rapidly escalating need for distributed file storage, enterprises of all kinds are seeking
to emulate efficiencies achieved by public cloud providers — with their highly successful software-
defined cloud data-center models based on standard servers and open source software. At the
same time, the $35 billion storage market is undergoing a fundamental structural shift, with storage
capacity returning to the server following decades of external network-attached storage (NAS) and
storage area network (SAN) growth¹. Software-defined scale-out storage has emerged as a viable
alternative, where standard servers and independent software unite to provide data access and
highly available services across the enterprise.
The combination of QCT servers and Red Hat Gluster Storage software squarely addresses these
industry trends, and both are already at the heart of many public cloud datacenters². QCT is rein-
venting data-center server technology to boost storage capacity and density, and redesigning
scalable hardware for cloud applications. As the world’s largest enterprise software company with
an open source development model, Red Hat has partnered with many Fortune 100 companies to
provide Gluster storage software in production environments. Together, QCT servers and
Red Hat Gluster Storage provide software-defined storage solutions for both private and public
clouds, helping to accelerate the shift away from costly proprietary external storage solutions.
Proprietary hardware-based storage segregates information, making it hard to find, access, and
manage. Moreover, adding capacity to traditional storage systems often disrupts access to data. If
hardware fails, it can bring the business to a standstill. In contrast, Red Hat Gluster Storage is open,
software-defined file storage that scales out. Organizations can easily and securely manage large,
unstructured, and semi-structured data at a fraction of the cost of traditional, monolithic storage.
Importantly, only Red Hat lets organizations deploy the same storage services on premise; in private,
public, or hybrid clouds; and in Linux® containers.
Running Red Hat Gluster Storage on QCT servers provides open interaction with a community-based
software development model, backed by the 24x7 support of the world’s most experienced open
source software company. Use of standard hardware components helps ensure low costs, while
QCT’s innovative development model enables organizations to iterate more rapidly on a family
of server designs optimized for different types of Gluster workloads. Unlike monolithic scale-up
storage solutions, Red Hat Gluster Storage on QCT servers lets organizations scale out flexibly, with
the ability to scale storage performance and capacity independently, depending on the needs of the
application and the chosen storage server platform.
WORKLOAD-OPTIMIZED DISTRIBUTED FILE SYSTEM CLUSTERS
One of the benefits of scale-out storage solutions is their ability to be tailored to different work-
loads. Red Hat Gluster Storage on QCT servers can be easily optimized and sized to serve specific
workloads through a flexible choice of systems and components. By carefully choosing and configur-
ing underlying server hardware, Red Hat Gluster Storage can be easily configured to serve different
kinds of file storage. Multiple combinations are possible by varying the density of the server (stan-
dard or dense storage servers), the layout of the underlying storage (RAID 6 or just a bunch of disks
(JBOD) mode), the data protection scheme (replication or erasure coding), and the storage architec-
ture (standalone or tiered storage).
• Replicated volumes on RAID 6 bricks are commonly used for performance-optimized configura-
tions, independent of file size.
• Erasure-coded volumes on JBOD bricks are often more cost-effective for large-file archive
situations.
• Standard servers are often more performant and cost-effective for smaller clusters and all small-
file applications, while dense storage servers are often more cost-effective for larger clusters.
• Depending on file size, tiering with either solid state drives (SSDs) or NVM Express (NVMe) SSDs
installed in storage servers can provide significant benefits, especially for read performance.
Table 1 provides generally optimal Red Hat Gluster Storage pool configuration recommendations.
These categories are provided as guidelines for hardware purchase and configuration decisions,
and can be adjusted to satisfy unique workload blends of different operators. As the workload mix
varies from organization to organization, actual hardware configurations chosen will vary. The
Performance Summary section includes more detailed information on Red Hat and QCT testing.
REFERENCE ARCHITECTURE ELEMENTS
The following sections discuss the overall architecture of the Red Hat and QCT reference architec-
ture, as well as key technical aspects of the principal components.
RED HAT GLUSTER STORAGE
• Elasticity. With Red Hat Gluster Storage, storage volumes are abstracted from the hardware and
managed independently. Volumes can grow or shrink by adding or removing systems from the
storage pool. Even as volumes change, data remains available without application interruption.
• Petabyte scalability. Today’s organizations demand scalability from terabytes to multiple pet-
abytes. Red Hat Gluster Storage lets organizations start small and grow to support multi-pet-
abyte repositories as needed. Organizations that need very large amounts of storage can deploy
massive scale-out storage from the outset.
• High performance. Red Hat Gluster Storage provides fast file access by eliminating the typical
centralized metadata server. Files are spread evenly throughout the system, eliminating hot spots,
I/O bottlenecks, and high latency. Organizations can use commodity disk drives and 10+ Gigabit
Ethernet to maximize performance.
• Reliability and high availability. Red Hat Gluster Storage provides automatic replication that
helps ensure high levels of data protection and resiliency. For customers that are conscious of disk
space and want integrated data protection without replication or RAID 6, Gluster also supports
erasure coding, which provides for faster rebuild times. In addition to protecting from
hardware failures, self-healing capabilities restore data to the correct state following recovery.
• Industry-standard compatibility. For any storage system to be useful, it must support a broad
range of file formats. Red Hat Gluster Storage provides native POSIX file system compatibility
as well as support for common protocols including CIFS, NFS, and Hypertext Transfer Protocol
(HTTP). The software is readily supported by off-the-shelf storage management software.
• Unified global namespace. Red Hat Gluster Storage aggregates disk and memory resources into
a single common pool. This flexible approach simplifies management of the storage environment
and eliminates data silos. Global namespaces may be grown and shrunk dynamically, without inter-
ruption to client access.
• Rapid and random access to archive tiers. Unlike archival systems based on tape, Red Hat Gluster
Storage offers automated data movement between hot and cold tiers. Additionally, archived data
can be both accessed and recovered rapidly from the cold disk tier.
[Figure 1 diagram: on-premise servers and storage pooled for scale-out performance, capacity, and availability.]
Figure 1. Red Hat Gluster Storage combines server and storage resources in a centrally managed pool with
independent capacity and performance scalability.
From a technical perspective, Red Hat Gluster Storage provides distinct advantages over other tech-
nologies, including:
• Software-defined storage. Red Hat believes that storage is a software problem that cannot be
solved by locking organizations into a particular storage hardware vendor or a particular hard-
ware configuration. Instead, Red Hat Gluster Storage is designed to work with a wide variety of
industry-standard storage, networking, and compute server solutions.
• Open source. Red Hat believes that the best way to deliver functionality is by embracing the open
source model. As a result, Red Hat users benefit from a worldwide community of thousands of
developers who are constantly testing the product in a wide range of environments and workloads,
providing continuous and unbiased feedback to other users.
• Complete storage operating system stack. Red Hat Gluster Storage delivers more than just a
distributed file system. The complete storage solution adds distributed memory management, I/O
scheduling, software RAID, self-healing, local N-way synchronous replication, and asynchronous
long-distance replication via Red Hat Gluster Geo-Replication.
• User space. Unlike traditional file systems, Red Hat Gluster Storage operates in user space, rather
than kernel space. This innovation makes installing and upgrading Red Hat Gluster Storage signifi-
cantly easier, and greatly simplifies development efforts since specialized kernel experience is not
required.
• Modular, stackable architecture. Red Hat Gluster Storage is designed using a modular and stack-
able architecture approach. Configuring Red Hat Gluster Storage for highly specialized environ-
ments is a simple matter of including or excluding particular modules.
• Data stored in native formats. With Red Hat Gluster Storage, data is stored on disk using native
formats (XFS) with various self-healing processes established for data. As a result, the system is
extremely resilient and files will always stay naturally readable, even without the Red Hat Gluster
Storage software. There is no proprietary or closed format used for storing file data.
• No metadata with the elastic hash algorithm. Unlike other storage systems with a distributed file
system, Red Hat Gluster Storage does not create, store, or use a separate index of metadata on a
central server. Instead, Red Hat Gluster Storage places and locates files algorithmically. The per-
formance, availability, and stability advantages of this approach are significant, and in some cases
produce dramatic improvements.
QCT SERVERS FOR GLUSTER
• QCT QxStor RGT-200 or RGC-200 “standard” servers. Based on the QCT D51PH-1ULH or
D51B-2U servers, these configurations are delivered with 12 hot-swappable disk drives. The
D51PH-1ULH supports an additional four hot-swappable SSDs in an ultra-compact one rack unit
(1U) package without sacrificing space for 12 disk drives. The D51B-2U is designed with complete
features to suit demanding workloads with flexibility. It provides expansion slots for NVMe SSDs
in addition to 12 HDDs. The innovative hot swappable drive design of the D51PH-1ULH and D51B-2U
servers means that no external cable management arm is required — significantly reducing system
deployment and rack assembly time. As a result, IT administrators can service drives with minimal
effort or downtime.
• QCT QxStor RGT-400 or RGC-400 “dense” servers. Based on the QCT T21P-4U dual-node server
capable of delivering up to 620TB of storage in just one system, these servers efficiently serve
the most demanding cloud storage environments. The servers maximize storage density to meet
the demand for growing storage capacity in hyperscale datacenters. Two models are available: a
single storage node can be equipped with 78 hard disk drives (HDDs) to achieve ultra-dense capac-
ity and low cost per gigabyte, or the system can be configured as dual nodes, each with 35 HDDs
to optimize rack density. Along with support for two PCIe Gen3 slots for PCIe-based SSDs, the
server offers flexible and versatile I/O expansion capacity. The servers feature a unique, innova-
tive screw-less hard drive carrier design to let operators rapidly complete system assembly, sig-
nificantly reducing deployment and service time.
[Photos: the QCT QuantaGrid D51PH-1ULH and QCT QuantaPlex T21P-4U storage servers.]
Figure 2. QCT storage servers are ideal for running Red Hat Gluster Storage.
• QCT QuantaGrid D51PH-1ULH or D51B-2U server (standard server). Ideal for smaller-capacity
pools, the compact 1U QuantaGrid D51PH-1ULH server provides 12 hot-swappable disk drives and
support for four additional hot-swappable solid state drives (SSDs). The 2U QuantaGrid D51B-2U
server provides 12 hot-swappable disk drives and NVMe SSDs.
• QCT QuantaPlex T21P-4U server (dense server). The QuantaPlex T21P-4U server is configu-
rable as a single-node (up to 78 HDDs) or dual-node system (up to 35 HDDs per node), maximizing
storage density to meet the demand for growing storage capacity.
• RAID 6. Red Hat Gluster Storage has traditionally used RAID 6 for local back-end storage. RAID 6
aggregates read performance across multiple disks and also protects against the loss of up to two
physical disks within the server.
• JBOD. JBOD back ends are increasingly popular, especially for large-capacity scenarios. JBOD
also avoids dependence on proprietary hardware RAID implementations on individual servers.
CLIENT TYPES
Figure 3 illustrates how bricks are combined into volumes within a Red Hat Gluster Storage pool.
Individual clients can access volumes within the storage pool using these supported protocols:
• NFS (the Network File System). Originally developed by Sun Microsystems, Inc., NFS allows a user
on a client computer to access files over a network, much like local storage is accessed.
• CIFS (the Common Internet File System). Commonly used with Microsoft Windows clients, CIFS is
an enhanced version of the Microsoft Server Message Block (SMB) protocol, and one of the stan-
dard ways that computer users share files across corporate intranets and the Internet.
• Gluster Native Client. The Gluster Native Client utilizes FUSE (filesystem in user space), a loadable
kernel module for UNIX-like operating systems that lets non-privileged users create their own file
systems without editing kernel code.
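To illustrate, the commands below sketch how a client might mount a Gluster volume over each protocol. The volume name (myvol), server hostname (server1), mount points, and the Samba share name are hypothetical; in particular, the CIFS share name depends on how the volume is exported through Samba.

# Gluster Native Client (FUSE)
mount -t glusterfs server1:/myvol /mnt/gluster

# NFSv3, served by the built-in Gluster NFS server
mount -t nfs -o vers=3 server1:/myvol /mnt/nfs

# CIFS, assuming the volume is exported via Samba as gluster-myvol
mount -t cifs -o username=smbuser //server1/gluster-myvol /mnt/cifs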
[Figure 3 diagram: virtual and physical views of a Red Hat Gluster Storage pool. An administrator manages the pool through the Red Hat Gluster Storage CLI; users access it via NFS, CIFS, FUSE, or OpenStack Swift; and the Cloud Volume Manager maps volumes onto bricks backed by server disks.]
Figure 3. Administrators and a range of clients access glusterFS bricks via the Cloud Volume Manager.
Red Hat Gluster Storage supports several volume types:
• Distributed volumes
• Replicated volumes
• Distributed-replicated volumes
• Dispersed volumes
• Distributed-dispersed volumes
Given the flexibility of the Red Hat Gluster Storage platform, these volume types can then be com-
bined to create tiered storage architectures as well. Each of these volume types is described in the
sections that follow.
Distributed volumes
Distributed volumes spread files across the bricks in the volume. As shown in Figure 4, individual
files may be located on any brick in the distributed volume. Importantly, distributed volumes can
suffer significant data loss during a disk or server failure, since directory contents are spread ran-
domly across the bricks in the volume. As such, distributed volumes should only be used where
scalable storage and redundancy are either not important, or are provided by other hardware or
software layers.
[Figure 4 diagram: a distributed volume with brick exp1/brick on server 1 and exp2/brick on server 2, accessed through a single mount point; each file is stored on only one brick.]
Figure 4. Distributed volumes are subject to data loss during disk or server failures.
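As a minimal sketch (hostnames and brick paths are hypothetical), a two-brick distributed volume corresponding to Figure 4 could be created and started as follows:

# Bricks are listed without a replica or disperse count,
# so files are simply distributed across them
gluster volume create distvol server1:/exp1/brick server2:/exp2/brick
gluster volume start distvol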
Replicated volumes
Replicated volumes create copies of files across multiple bricks in the volume, replicating them
between a number of servers to protect against disk and server failures. As such, replicated volumes
are suitable for environments where high availability and high reliability are critical. Both two-way
(Figure 5) and three-way (Figure 6) replicated volumes are supported in Red Hat Gluster Storage as
of this writing. Three-way replicated volumes are supported only on JBOD bricks.
[Figure 5 diagram: a replicated volume with brick exp1/brick on server 1 and exp2/brick on server 2; File 1 and File 2 are copied to both bricks and accessed through a single mount point.]
Figure 5. Replicated volumes provide high availability by creating copies on multiple servers (two-way replication
shown).
[Figure 6 diagram: a three-way replicated volume, with File 1 and File 2 copied to bricks on three servers and accessed through a single mount point.]
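A two-way replicated volume like the one in Figure 5 could be created as follows (hostnames and brick paths again hypothetical):

# replica 2 places a copy of every file on both bricks
gluster volume create repvol replica 2 server1:/exp1/brick server2:/exp2/brick
gluster volume start repvol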
Distributed replicated volumes
Distributed replicated volumes distribute files across replicated sets of bricks, with the bricks of
each replica set placed on different servers. Two-way distributed replicated volumes are shown in
Figure 7. Synchronous three-way distributed replicated volumes (Figure 8) are now fully supported
in Red Hat Gluster Storage (on JBOD bricks only).
[Figure 7 diagram: a distributed translator spans two replicated brick pairs; File 1 is placed on one replica pair and File 2 on the other.]
Figure 7. Distributed replicated volumes interpose a distributed translator across replicated volumes.
[Figure 8 diagram: a synchronous three-way distributed replicated volume, with each file replicated across three bricks within the distributed volume.]
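As a sketch, four bricks created with a replica count of two form a 2x2 distributed replicated volume; consecutive bricks in the argument list become replica pairs (hostnames and paths hypothetical):

# Pairs (server1,server2) and (server3,server4) each hold replicas;
# files are distributed across the two pairs
gluster volume create distrepvol replica 2 \
    server1:/exp1/brick server2:/exp2/brick \
    server3:/exp3/brick server4:/exp4/brick
gluster volume start distrepvol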
Dispersed volumes
Dispersed volumes (Figure 9) require less storage space when compared to replicated volumes. For
this reason, they are often more cost-effective for larger capacity storage clusters. A dispersed
volume sustains the loss of data based on the redundancy level. For example, a dispersed volume
with a redundancy level of “2” is equivalent to a replicated pool of size two, but requires 1.5TB
instead of 2TB to store 1TB of data.
The data protection offered by erasure coding can be represented in simple form by the equation
n = k + m, where “n” is the total number of bricks, and the system would require any “k” bricks
for recovery. In other words, the system can tolerate the failure of any “m” bricks. Red Hat Gluster
Storage currently supports a fixed set of dispersed volume configurations.
[Figure 9 diagram: a dispersed volume with 4+2 encoding stores a 10MB file (File 1) as six 2.5MB fragments, one per brick, accessed through a single mount point.]
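A 4+2 dispersed volume (n=6, k=4, m=2) such as the one in Figure 9 could be created as follows (hostnames and paths hypothetical):

# disperse 6 redundancy 2: any four of the six bricks suffice to rebuild data
gluster volume create dispvol disperse 6 redundancy 2 \
    server1:/exp1/brick server2:/exp2/brick server3:/exp3/brick \
    server4:/exp4/brick server5:/exp5/brick server6:/exp6/brick
gluster volume start dispvol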
[Figure 10 diagram: a distributed translator spans two 4+2 disperse sets of six bricks each (12 bricks total); 10MB files File 1 and File 2 are each stored as six 2.5MB fragments on one of the two sets.]
Figure 10. Distributed dispersed volumes interpose a distributed translator across two dispersed volumes.
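As a sketch, supplying twelve bricks with the same disperse parameters yields the distributed dispersed layout of Figure 10, with two 4+2 disperse sets under a distributed translator (hostnames and paths hypothetical):

# The first six bricks form one 4+2 set, the next six form the second
gluster volume create distdispvol disperse 6 redundancy 2 \
    server1:/exp/brick1 server2:/exp/brick1 server3:/exp/brick1 \
    server4:/exp/brick1 server5:/exp/brick1 server6:/exp/brick1 \
    server1:/exp/brick2 server2:/exp/brick2 server3:/exp/brick2 \
    server4:/exp/brick2 server5:/exp/brick2 server6:/exp/brick2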
Tiering
Tiering monitors the activity level of data and automatically rebalances active and inactive data
to the most appropriate storage tier, improving both storage performance and resource use.
Moving data between hot and cold storage tiers is a computationally expensive task. To address
this, Red Hat Gluster Storage performs automated promotion and demotion of data within a
volume in the background, so as to minimize the impact on foreground I/O activity.
The promotion and demotion of files between tiers is determined both by how frequently data is
accessed and by how full the hot tier is. Data becomes hot or cold based on the rate at which it is
accessed, relative to high and low watermarks set by the administrator. If access to a file increases,
it moves to, or retains its place in, the hot tier. If the file is not accessed for a while, it moves to, or
retains its place in, the cold tier. As a result, data movement can happen in either direction, based
on access frequency.
Figure 11 illustrates how tiering works. In this case, the existing slower distributed replicated volume
would become a cold tier while the new faster distributed replicated tier would serve as a hot tier.
Frequently accessed files are promoted from the cold tier to the hot tier for better performance.
If that data subsequently falls out of use, it is demoted back to the cold tier.
[Figure 11 diagram: a tiered volume behind a single mount point, with files promoted from the cold tier to the hot tier and demoted in the opposite direction.]
Figure 11. Tiering lets a distributed replicated volume serve as a hot tier for a distributed dispersed volume.
The promotion and demotion of files is also moderated by the fullness of the hot tier. Data accumu-
lates on the hot tier until it reaches the low watermark (default 75% of capacity), even if it is not
accessed for a period of time. This prevents files from being demoted unnecessarily when there is
plenty of free space on the hot tier. When the hot tier is fuller than the low watermark but less full
than the high watermark (default 90% of capacity), data is randomly promoted and demoted, with
the likelihood of promotion decreasing as the tier becomes fuller; the opposite holds for demotion.
If the hot tier is fuller than the high watermark, promotions stop and demotions happen more
frequently in order to free additional space.
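As a sketch of the administrative workflow (volume name, hosts, and brick paths hypothetical), a replicated SSD hot tier is attached to an existing volume with the attach-tier command, and the watermarks can be tuned with volume options:

# Attach a two-way replicated hot tier built from SSD bricks
gluster volume attach-tier myvol replica 2 \
    server1:/ssd/brick server2:/ssd/brick

# Adjust promotion/demotion watermarks (percent of hot-tier capacity)
gluster volume set myvol cluster.watermark-low 75
gluster volume set myvol cluster.watermark-hi 90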
QUALIFYING THE NEED FOR A SOFTWARE-DEFINED, DISTRIBUTED FILESYSTEM
Not every storage situation calls for scale-out storage. The following requirements, however,
typically point to a good fit for scale-out storage:
• Dynamic storage provisioning. By dynamically provisioning capacity from a pool of storage, orga-
nizations are typically building a private storage cloud, mimicking popular public cloud services.
• Standard storage servers. Scale-out storage employs storage clusters built from industry-stan-
dard x86 servers rather than proprietary storage appliances, allowing incremental growth of
storage capacity and/or performance without forklift appliance upgrades.
• Unified namespaces. Scale-out storage allows pooling storage across up to 128 storage servers
in one or more unified namespaces.
• High data availability. Scale-out storage provides high-availability of data across what would oth-
erwise be storage silos within the storage cluster.
• Independent multidimensional scalability. Unlike typical NAS and SAN devices that may exhaust
throughput before they run out of capacity, scale-out storage allows organizations to add storage
performance or capacity incrementally by independently adding more storage servers or disks as
required.
3 Note that due to locking incompatibility, CIFS cannot be used with other client access methods.
SELECTING A DATA PROTECTION METHOD
Gluster offers two data protection schemes: replication and erasure coding (dispersed volumes in
Gluster parlance). As a design decision, choosing the data protection method can affect the solu-
tion’s total cost of ownership (TCO) more than any other factor, while also playing a key role in deter-
mining cluster performance. The chosen data protection method strongly affects the amount of raw
storage capacity that must be purchased to yield the desired amount of usable storage capacity and
has particular performance trade-offs, dependent upon the workload.
Fault-domain risk also includes the impact of failures on performance. When a drive fails in a
RAID 6 back-end volume, Gluster is unaware, as hardware RAID technology masks the failed drive
until it can be replaced and the RAID volume rebuilt. During Gluster self-healing, a percentage of
volume throughput capacity will be diverted to healing outdated file copies on the failed node from
the file replicas on the surviving nodes. The percentage of performance degradation is a function of
the number and size of files that changed on the failed node while it was down, and how Gluster is
configured. If a node must be replaced, all file replicas assigned to this node must be copied from the
surviving replica or reconstituted from the disperse set.
• Supported minimum cluster size: Two nodes with a third non-storage node to constitute quorum.
TESTED CONFIGURATIONS
Two separate cluster configurations were constructed and tested by Red Hat and QCT.
TESTING APPROACH
Red Hat and QCT testing exercised file create and read operations using file sizes of 50KB (jpeg),
5MB (mp3), and 4GB (DVD). Two separate benchmarking tools were used to exercise the glusterFS
file system in Red Hat and QCT testing:
• IOzone. IOzone (www.iozone.org) was used to test the sequential read/write performance of the
GlusterFS volumes. IOzone is a file system benchmark tool that generates and measures a variety
of file operations. IOzone’s cluster mode option is particularly well-suited for distributed storage
testing because testers can start many worker threads from various client systems in parallel, tar-
geting the GlusterFS volume. In testing, 16 client systems each ran eight IOzone threads (128 total
threads of execution) for a four-node storage cluster. A representative invocation is sketched after
this list.
• Smallfile. The smallfile benchmark was used to
complement use of the IOzone benchmark for measuring performance of small- and large-file
workloads. The benchmark returns the number of files processed per second and the rate that the
application transferred data in megabytes per second.
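A representative cluster-mode IOzone invocation might look like the following sketch; the client list file and the size and record parameters are illustrative, not the exact values used in this testing:

# clients.ioz lists one line per worker: client hostname,
# target directory on the GlusterFS mount, and path to the iozone binary
iozone -+m clients.ioz -t 128 -i 0 -i 1 -s 4g -r 1m -c -e -w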
QUANTAGRID D51PH-1ULH CONFIGURATION
As shown in Figure 12, six QuantaGrid D51PH-1ULH servers were tested, each configured with:
• SAS controller: QCT SAS Mezz (LSI 3008 for JBOD, LSI 3108 for RAID 6)
• Network controller: QCT Intel 82599ES dual-port 10 GbE SFP+ OCP mezzanine, or QCT Intel X540
dual-port 10 GbE BASE-T OCP mezzanine
• Optional SSD for tiering: 4x Intel SSD Data Center (DC) S3710 200 GB, 2.5-inch SATA 6Gb/s, MLC
[Figure 12 diagram: a QuantaMesh T3048-LY8 switch provides 10 GbE cluster and public networks connecting the D51PH-1ULH servers (12 HDDs and 4 SSDs per node) and 16 client nodes.]
Figure 12. Six standard QuantaGrid D51PH-1ULH servers were tested with back end and public 10 GbE networks.
QUANTAPLEX T21P-4U CONFIGURATION
As shown in Figure 13, two 4U QuantaPlex T21P-4U/Dual Node servers were connected, each with
dual 40 GbE interfaces (one per node) to a shared public network. Sixteen client nodes were likewise
attached to the public network via 10 GbE for load generation. Each QuantaPlex T21P-4U/Dual Node
server was configured with:
• SAS controller: (2x 1) QCT SAS Mezz LSI 3108 SAS controller
• Network controller: (2x 1) QCT Mellanox ConnectX-3 EN 40 GbE SFP+ single-port OCP mezzanine
• Onboard storage: (2x 1) Intel SSD DC S3510 120 GB, 2.5-inch SATA 6 Gb/s, MLC flash
• Hard disk drives: (2x 35) Seagate 3.5-inch SAS 6TB 7.2K RPM
• Optional SSD for tiering: (2x 2) Intel SSD DC P3700 800GB, half-height PCIe 3.0, MLC
[Figure 13 diagram: a QuantaMesh T3048-LY8 switch provides the public network, connecting the T21P-4U chassis (four Gluster nodes, each with 35 HDDs and two NVMe SSDs) via 40 GbE and 16 client nodes via 10 GbE.]
Figure 13. Two QuantaPlex T21P-4U/Dual Node servers (four Gluster nodes) were configured, each node with a
40 GbE interface, with the configuration driven by 16 client nodes.
For the tiered configuration testing, the cold tier was configured to use three RAID 6 bricks on each
of the four T21P-4U servers. The hot tier was configured to use the two NVMe SSDs on each server.
The Gluster volumes for both the hot and cold tiers were configured using two-way replication. The
Gluster volume settings that are best suited for this configuration are listed in Appendix B.
SOFTWARE CONFIGURATION
Server systems were configured with Red Hat Gluster Storage 3.1.2 as the storage server software.
PERFORMANCE SUMMARY
To characterize performance, Red Hat and QCT ran a series of tests across different cluster configu-
rations, varying file sizes (jumbo, medium, and small), server density (standard vs. dense), the ways
that the bricks were defined (JBOD vs. RAID6), and data protection schemes (replication vs. erasure
coding). For medium and small file testing, tiered configurations were also tested, utilizing SSDs at
varying degrees of fullness.
JUMBO FILES: DESIGNING FOR OPTIMAL THROUGHPUT
[Figure 14 chart: jumbo-file (4GB) sequential read and write throughput across configurations, including standard servers with two-way replicated RAID 6 bricks and erasure-coded (4:2) bricks on both RAID 6 and JBOD.]
JBOD versus RAID 6 bricks with erasure-coded volumes
Results show large-file sequential I/O performance to be dramatically better on JBOD bricks than
RAID 6 bricks when using erasure-coded data protection. This is made clear by examining the first
and fifth data sets in Figure 14 (standard servers, EC4:2/RAID6 versus EC4:2/JBOD).
Figure 15. Standard servers with 4:2 erasure coding and JBOD bricks provide superior value in terms of throughput
per server cost.
SMALL AND MEDIUM SIZED FILES: DESIGNING FOR OPTIMAL FILE OPERATIONS PER
SECOND
Figure 16 shows the results of testing using small (50KB) files against a variety of configurations.
Files of this size might equate to a small JPEG file. Testing on small files was done to evaluate the
performance effects of various tiering configurations on reads and writes.⁵
[Figure 16 chart: small-file operations per second for tiered and non-tiered configurations, including standard tiered (1x NVMe per server) and dense non-tiered clusters.]
Figure 16. Small file testing utilized 50KB files against a range of tiered and non-tiered configurations.
Effects of filling the hot tier on performance
Reflected in the third data set of Figure 16, the testing also evaluated the effect of starting with the
hot tier at 70% capacity, just below the default 75% low watermark. As expected, both read and
write performance drop materially with a substantially full hot tier. As the top tier fills, the system
begins to aggressively demote and flush data to disk, more heavily utilizing the cold tier.
[Figure 17 chart: small-file read and create throughput per unit of server cost for standard tiered (4x SSD per server), standard non-tiered, and dense non-tiered configurations.]
Figure 17. For small files, standard tiered servers provide the greatest value in terms of read throughput.
[Figure 18 chart: medium-file read and create throughput for dense tiered (2x NVMe per server), dense non-tiered, and standard non-tiered configurations.]
Figure 18. For medium files, dense tiered servers provide the greatest read throughput.
Figure 19. For medium-sized files, dense tiered servers provide the greatest value.
Performance under self-healing
GlusterFS has a built-in daemon that automates self-healing in cases where a brick is taken
offline and then returned to the pool. This functionality is closely related to volume rebalancing,
which redistributes files as bricks are added to the pool. Self-healing does affect overall
performance, with the impact depending on the size of the pool and the number of files that
changed while a brick was offline.
Figure 20 shows the impact of self-heal operation on a replicated volume. The graph shows the
negative impact on read performance due to self-heal operation when the read data set includes
files that are being healed. Importantly, the relative impact on performance decreases as the size
of the cluster increases because each storage node now constitutes a smaller portion of the entire
glusterFS volume.
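Heal activity can be observed from the command line; as a sketch (volume name hypothetical):

# List entries still pending heal on each brick
gluster volume heal myvol info

# Show crawl statistics for the self-heal daemon
gluster volume heal myvol statistics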
[Figure 20 chart: 1MB-block sequential read throughput (MB/sec per pool), baseline versus during self-heal, for D51PH-1ULH pools of two (1x2), four (2x2), and six (3x2) nodes.]
Figure 20. Self-healing on smaller cluster sizes consumes a larger percentage of overall cluster resources.
Figure 21 compares the impact of the self-heal operation on a replicated volume both when the
client workload read data set is and is not included in the data to be self-healed. The graph shows
that there is no negative impact from the self-heal process when the data set read does not include
any of the files that are being healed. Of course, performance is also determined by the available
network capacity for client communication as some of the network bandwidth will be used by the
self-heal operation. Importantly, testing showed that there was no client workload performance
penalty for a dispersed volume during the read after a failed node recovery, when the client work-
load did not access files being healed.
[Figure 21 chart: sequential read (1MB I/O) throughput (MB/sec per cluster) on a D51PH-1ULH 2x2 pool: baseline, during self-heal while reading files being healed, and during self-heal while reading files not being healed.]
Figure 21. There is no self-heal impact if the read set is not included in the files being healed.
[Figure 22 chart: read throughput (MB/sec per drive) for 50KB, 5MB, and 4GB files, comparing 2x replication on RAID 6 (D51PH-1ULH, 12 disks per node) with 4:2 erasure coding on JBOD (D51PH-1ULH, 12 disks per node).]
Figure 22. Read throughput and crossover across different file sizes.
[Figure 23 chart: write throughput (MB/sec per drive) for 50KB, 5MB, and 4GB files, comparing 2x replication on RAID 6 (D51PH-1ULH, 12 disks per node) with 4:2 erasure coding on JBOD (D51PH-1ULH, 12 disks per node).]
Figure 23. Write throughput and crossover across different file sizes.
Overall, Red Hat and QCT testing has revealed a number of best practices for serving different sizes
of files.
• Small (50KB) files. Tiering provides a pronounced advantage for read-heavy workloads with
hot-file access patterns. Standard servers in a tiered configuration with a single SSD per server
provide both superior performance and value over dense servers. The cold tier should be repli-
cated, not erasure-coded, and adding additional SSDs for the hot tier is generally not necessary or
cost-effective.
• Medium (5MB) files. For medium-sized files, dense tiered servers with two SSDs per server
provide the best read performance and comparable write performance to other configurations.
These servers also provided the best value in terms of performance over cost.
• Jumbo (4GB) files. Unlike with some other software-defined storage solutions, erasure-coded and
replicated volumes provide similar performance for jumbo-sized files. Erasure coding should be
used on JBOD bricks only, as RAID 6 combined with erasure coding provides inferior performance.
From a value perspective, standard servers with erasure coding and JBOD bricks represent a
superior value.
APPENDIX A: GLUSTER-OPTIMIZED BUILDING BLOCKS FROM QCT
To help provide simplified deployment choices for Red Hat Gluster Storage, QCT is offering several
configurations (SKUs) optimized as Gluster servers. These configurations provide different control-
lers to help optimize them for different use cases.
• QxStor RGT-200(SF). Based on the QCT D51PH-1ULH or D51B-2U 12-bay server, this configura-
tion provides 12 HDDs and support for four SATA SSDs or one NVMe SSD. The configuration is
best suited when utilizing the Gluster tiering feature for small-file workloads. The storage server
includes an LSI 3108 RAID controller to support RAID 6 configuration for the HDDs.
• QxStor RGT-200(LF). Based on the QCT D51PH-1ULH or D51B-2U 12-bay server, this configuration
provides 12 HDDs and an LSI 3108 RAID controller to support RAID 6 configuration for the HDDs. The
configuration offers high throughput and reliability with a small failure domain.
• QxStor RGC-200. Based on the QCT D51PH-1ULH or D51B-2U 12-bay server, this configuration pro-
vides 12 HDDs and an LSI 3008 SAS controller to support JBOD configuration for the HDDs. This
configuration is best suited for archival workloads of medium capacity where cost is more impor-
tant than performance.
• QxStor RGT-400(LF). Based on the QCT T21P-4U dual-node server with 35 drive bays for each
node, this configuration offers an LSI 3108 RAID controller to support RAID 6 configuration for the
HDDs. The configuration provides the capability to scale to a half-petabyte of raw capacity in only
four rack units. Organizations can obtain the best throughput and density simultaneously with this
configuration.
• QxStor RGC-400. Based on the QCT T21P-4U dual-node server with 35 drive bays for each node, this
configuration offers an LSI 3008 SAS controller to support JBOD⁶ configuration for the HDDs.
The configuration provides the capability to scale to a half-petabyte of raw capacity in only four
rack units. This configuration is most cost-effective for deployments that are a petabyte in size or
larger.
These Gluster-optimized configurations can be used as building blocks to construct clusters in a wide
range of sizes, focused on serving either small or large files as shown in Table 2.
6 The 24 JBOD drive per server support limitation in Red Hat Gluster Storage 3.1.2 is under study for future releases.
TABLE 2. WORKLOAD OPTIMIZED QCT QXSTOR CONFIGURATIONS.
APPENDIX B: TIERED GLUSTER VOLUME SETTINGS
The Gluster volume settings that follow were used in configuring the T21P-4U servers for the
tiered configuration testing.
First, the replicated Gluster volume was created for the cold tier:
gluster volume create gvol62 replica 2 \
qct135:/gluster/brick1/gvol62 \
qct136:/gluster/brick1/gvol62 \
qct137:/gluster/brick1/gvol62 \
qct138:/gluster/brick1/gvol62 \
qct135:/gluster/brick2/gvol62 \
qct136:/gluster/brick2/gvol62 \
qct137:/gluster/brick2/gvol62 \
qct138:/gluster/brick2/gvol62 \
qct135:/gluster/brick3/gvol62 \
qct136:/gluster/brick3/gvol62 \
qct137:/gluster/brick3/gvol62 \
qct138:/gluster/brick3/gvol62
Next, the hot tier was attached, consisting of two NVMe SSD drives per storage node:
gluster volume attach-tier gvol62 replica 2 \
qct135:/gluster/ssd0/tier4x2 \
qct136:/gluster/ssd0/tier4x2 \
qct137:/gluster/ssd0/tier4x2 \
qct138:/gluster/ssd0/tier4x2 \
qct135:/gluster/ssd1/tier4x2 \
qct136:/gluster/ssd1/tier4x2 \
qct137:/gluster/ssd1/tier4x2 \
qct138:/gluster/ssd1/tier4x2
Finally, the resulting tiered volume and its reconfigured options are shown below:
Type: Tier
Status: Started
Number of Bricks: 20
Transport-type: tcp
Hot Tier :
Number of Bricks: 4 x 2 = 8
Brick1: qct138:/gluster/ssd1/tier4x2
Brick2: qct137:/gluster/ssd1/tier4x2
Brick3: qct136:/gluster/ssd1/tier4x2
Brick4: qct135:/gluster/ssd1/tier4x2
Brick5: qct138:/gluster/ssd0/tier4x2
Brick6: qct137:/gluster/ssd0/tier4x2
Brick7: qct136:/gluster/ssd0/tier4x2
Brick8: qct135:/gluster/ssd0/tier4x2
Cold Tier:
Number of Bricks: 6 x 2 = 12
Brick9: qct135:/gluster/brick1/gvol62
Brick10: qct136:/gluster/brick1/gvol62
Brick11: qct137:/gluster/brick1/gvol62
Brick12: qct138:/gluster/brick1/gvol62
Brick13: qct135:/gluster/brick2/gvol62
Brick14: qct136:/gluster/brick2/gvol62
Brick15: qct137:/gluster/brick2/gvol62
Brick16: qct138:/gluster/brick2/gvol62
Brick17: qct135:/gluster/brick3/gvol62
Brick18: qct136:/gluster/brick3/gvol62
Brick19: qct137:/gluster/brick3/gvol62
Brick20: qct138:/gluster/brick3/gvol62
Options Reconfigured:
performance.quick-read: off
performance.io-cache: off
cluster.lookup-optimize: on
server.event-threads: 4
client.event-threads: 4
cluster.tier-mode: cache
features.ctr-enabled: on
performance.readdir-ahead: on
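For reference, options like these are applied with the gluster volume set command; the sketch below shows the form (the tiering-specific options cluster.tier-mode and features.ctr-enabled are typically set as part of attaching the tier):

# Disable client-side caching translators, as in the tested configuration
gluster volume set gvol62 performance.quick-read off
gluster volume set gvol62 performance.io-cache off

# Tune lookup behavior and event-thread counts
gluster volume set gvol62 cluster.lookup-optimize on
gluster volume set gvol62 server.event-threads 4
gluster volume set gvol62 client.event-threads 4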