Ceph on All-Flash Storage -- Breaking Performance Barriers (Ceph Community)
The document discusses a new software-defined all-flash storage system called InfiniFlash that uses Ceph software to break performance barriers. It provides disaggregated storage, compute, and software for better scaling and lower costs compared to traditional monolithic storage systems. InfiniFlash uses flash drive cards in a 3U JBOD configuration and can deliver up to 2 million IOPS, less than 1ms latency, and 15GB/s throughput while consuming only 400W of power. When used with Ceph software, it provides a highly scalable block and object storage solution for private and public cloud environments.
Ceph Day Melbourne - Ceph on All-Flash Storage - Breaking Performance Barriers (Ceph Community)
The document discusses a presentation about Ceph on all-flash storage using InfiniFlash systems to break performance barriers. It describes how Ceph has been optimized for flash storage and how InfiniFlash systems provide industry-leading performance of over 1 million IOPS and 6-9GB/s of throughput using SanDisk flash technology. The presentation also covers how InfiniFlash can provide scalable performance and capacity for large-scale enterprise workloads.
Using Recently Published Ceph Reference Architectures to Select Your Ceph Con... (Patrick McGarry)
This document discusses using recently published Ceph reference architectures to select a Ceph configuration. It provides an inventory of existing reference architectures from Red Hat and SUSE. It previews highlights from an upcoming Intel and Red Hat Ceph reference architecture paper, including recommended configurations and hardware. It also describes an Intel all-NVMe Ceph benchmark configuration for MySQL workloads. In summary, reference architectures provide guidelines for building optimized Ceph solutions based on specific workloads and use cases.
Ceph Day Shanghai - Recovery, Erasure Coding and Cache Tiering (Ceph Community)
This document discusses recovery, erasure coding, and cache tiering in Ceph. It provides an overview of the RADOS components including OSDs, monitors, and CRUSH, which calculates data placement across the cluster. It describes how peering and recovery work to maintain data consistency. It also outlines how Ceph implements tiered storage with cache and backing pools, using erasure coding for durability and caching techniques to improve performance.
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud (Patrick McGarry)
Cisco Cloud Services provides an OpenStack platform to Cisco SaaS applications using a worldwide deployment of Ceph clusters storing petabytes of data. The initial Ceph cluster design experienced major stability problems as the cluster grew past 50% capacity. Strategies were implemented to improve stability including client IO throttling, backfill and recovery throttling, upgrading Ceph versions, adding NVMe journals, moving the MON levelDB to SSDs, rebalancing the cluster, and proactively detecting slow disks. Lessons learned included the importance of devops practices, sharing knowledge, rigorous testing, and balancing performance, cost and time.
Ceph Day Melbourne - Walk Through a Software Defined Everything PoC (Ceph Community)
This document summarizes a proof of concept for a software defined everything architecture using OpenStack and Midokura software defined networking. The objectives are to abstract compute, network and storage resources and dynamically allocate them as needed. It aims to provide scalability, availability, isolation, metering and automation of infrastructure. The proof of concept uses OpenStack for compute with Ceph for storage, and Midokura for software defined networking to provide logical switching, routing, firewalling and load balancing. It details the architecture, infrastructure, configuration and lessons learned from the implementation.
Ceph Community Talk on High-Performance Solid State Ceph (Ceph Community)
The document summarizes a presentation given by representatives from various companies on optimizing Ceph for high-performance solid state drives. It discusses testing a real workload on a Ceph cluster with 50 SSD nodes that achieved over 280,000 read and write IOPS. Areas for further optimization were identified, such as reducing latency spikes and improving single-threaded performance. Various companies then described their contributions to Ceph performance, such as Intel providing hardware for testing and Samsung discussing SSD interface improvements.
The document discusses Ceph storage performance on all-flash storage systems. It describes how SanDisk optimized Ceph for all-flash environments by tuning the OSD to handle the high performance of flash drives. The optimizations allowed over 200,000 IOPS per OSD using 12 CPU cores. Testing on SanDisk's InfiniFlash storage system showed it achieving over 1.5 million random read IOPS and 200,000 random write IOPS at 64KB block size. Latency was also very low, with 99% of operations under 5ms for reads. The document outlines reference configurations for the InfiniFlash system optimized for small, medium and large workloads.
Accelerating Cassandra Workloads on Ceph with All-Flash PCIe SSDs (Ceph Community)
This document summarizes the performance of an all-NVMe Ceph cluster using Intel P3700 NVMe SSDs. Key results include achieving over 1.35 million 4K random read IOPS and 171K 4K random write IOPS with sub-millisecond latency. Partitioning the NVMe drives into multiple OSDs improved performance and CPU utilization compared to a single OSD per drive. The cluster also demonstrated over 5GB/s of sequential bandwidth.
Ceph Day San Jose - Object Storage for Big Data (Ceph Community)
This document discusses using object storage for big data. It outlines key stakeholders in big data projects and what they want from object storage solutions. It then discusses using the Ceph object store to provide an elastic data lake that can disaggregate compute resources from storage. This allows analytics to be performed directly on the object store without expensive ETL processes. It also describes testing various analytics use cases and workloads with the Ceph object store.
Ceph is evolving its network stack to improve performance. It is moving from AsyncMessenger to using RDMA for better scalability and lower latency. RDMA support is now built into Ceph and provides native RDMA using verbs or RDMA-CM. This allows using InfiniBand or RoCE networks with Ceph. Work continues to fully leverage RDMA for features like zero-copy replication and erasure coding offload.
This document outlines an agenda for a presentation on running MySQL on Ceph storage. It includes a comparison of MySQL on Ceph versus AWS, results from a head-to-head performance lab test between the two platforms, and considerations for hardware architectures and configurations optimized for MySQL workloads on Ceph. The lab tests showed that Ceph could match or exceed AWS on both performance metrics like IOPS/GB and price/performance metrics like storage cost per IOP.
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto... (Ceph Community)
This document discusses Dell's support for CEPH storage solutions and provides an agenda for a CEPH Day event at Dell. Key points include:
- Dell is a certified reseller of Red Hat-Inktank CEPH support, services, and training.
- The agenda covers why Dell supports CEPH, hardware recommendations, best practices shared with CEPH colleagues, and a concept for research data storage that is seeking input.
- Recommended CEPH architectures, components, configurations, and considerations are discussed for planning and implementing a CEPH solution. Dell server hardware options that could be used are also presented.
Ceph Day Melbourne - Scale and performance: Servicing the Fabric and the Work... (Ceph Community)
The document discusses scale and performance challenges in providing storage infrastructure for research computing. It describes Monash University's implementation of the Ceph distributed storage system across multiple clusters to provide a "fabric" for researchers' storage needs in a flexible, scalable way. Key points include:
- Ceph provides software-defined storage that is scalable and can integrate with other systems like OpenStack.
- Multiple Ceph clusters have been implemented at Monash of varying sizes and purposes, including dedicated clusters for research data storage.
- The infrastructure provides different "tiers" of storage with varying performance and cost characteristics to meet different research needs.
- Ongoing work involves expanding capacity and upgrading hardware to improve performance.
This document provides troubleshooting guidance for issues with Ceph. It begins by suggesting identifying the problem domain as either performance, hang, crash, or unexpected behavior. For each problem, it recommends tools and techniques for further investigation such as debugging logs, profiling tools, and source code analysis. Debugging steps include establishing baselines, identifying implicated hosts or subsystems, increasing log verbosity, and tracing transactions through logs. The document emphasizes starting at the user end and working back towards Ceph to isolate issues.
Ceph Day San Jose - All-Flash Ceph on NUMA-Balanced Server (Ceph Community)
The document discusses optimizing Ceph storage performance on QCT servers using NUMA-balanced hardware and tuning. It provides details on QCT hardware configurations for throughput, capacity and IOPS-optimized Ceph storage. It also describes testing done in QCT labs using a 5-node all-NVMe Ceph cluster that showed significant performance gains from software tuning and using multiple OSD partitions per SSD.
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a... (Danielle Womboldt)
This document discusses optimizing performance in large scale CEPH clusters at Alibaba. It describes two use models for writing data in CEPH and improvements made to recovery performance by implementing partial and asynchronous recovery. It also details fixes made to bugs that caused data loss or inconsistency. Additionally, it proposes offloading transaction queueing from PG workers to improve performance by leveraging asynchronous transaction workers and evaluating this approach through bandwidth testing.
Intel - Optimizing Ceph performance by leveraging Intel® Optane™ and 3D NAND... (inwin stack)
Kenny Chang (張任伯) (Storage Solution Architect, Intel)
As solid state drives (SSDs) become more affordable, more and more cloud providers are trying to offer high-performance, highly reliable storage to their customers using SSDs. Ceph is becoming one of the most widely used open source scale-out storage solutions worldwide, and more and more customers want to use SSDs with Ceph to build high-performance storage for their OpenStack clouds.
The disruptive Intel® Optane SSDs, based on 3D XPoint technology, fill the performance gap between DRAM and NAND-based SSDs, while Intel® 3D NAND TLC narrows the cost gap between SSDs and traditional spinning hard drives, making all-flash storage practical. In this session, we will:
1) Discuss an OpenStack Ceph storage reference design for the first Intel Optane (3D XPoint) and P4500 TLC NAND based all-flash Ceph cluster, which delivers multi-million IOPS with extremely low latency and increases storage density at competitive dollar-per-gigabyte cost.
2) Share Ceph BlueStore tunings and optimizations, latency analysis, a TCO model, and IOPS/TB and IOPS/$ figures based on the reference architecture to demonstrate this high-performance, cost-effective solution.
Presentation from the 2016 Austin OpenStack Summit.
The Ceph upstream community is declaring CephFS stable for the first time in the recent Jewel release, but that declaration comes with caveats: while we have filesystem repair tools and a horizontally scalable POSIX filesystem, we have default-disabled exciting features like horizontally-scalable metadata servers and snapshots. This talk will present exactly what features you can expect to see, what's blocking the inclusion of other features, and what you as a user can expect and can contribute by deploying or testing CephFS.
Ceph Day Beijing - Ceph All-Flash Array Design Based on NUMA Architecture (Danielle Womboldt)
This document discusses an all-flash Ceph array design from QCT based on NUMA architecture. It provides an agenda that covers all-flash Ceph and use cases, QCT's all-flash Ceph solution for IOPS, an overview of QCT's lab environment and detailed architecture, and the importance of NUMA. It also includes sections on why all-flash storage is used, different all-flash Ceph use cases, QCT's IOPS-optimized all-flash Ceph solution, benefits of using NVMe storage, QCT's lab test environment, Ceph tuning recommendations, and benefits of using multi-partitioned NVMe SSDs for Ceph OSDs.
Ceph Benchmarking Tool (CBT) is a Python framework for benchmarking Ceph clusters. It has client and monitor personalities for generating load and setting up the cluster. CBT includes benchmarks for RADOS operations, librbd, KRBD on EXT4, KVM with RBD volumes, and COSBench tests against RGW. Test plans are defined in YAML files and results are archived for later analysis using tools like awk, grep, and gnuplot.
Walk Through a Software Defined Everything PoC (Ceph Community)
This document summarizes a proof of concept for a software defined data center using OpenStack and Midokura MidoNet software defined networking. The POC used 4 controllers, 8 Ceph storage nodes, and 16 compute nodes with Midokura providing logical layer 2-4 networking services. Key lessons learned included planning the underlay network configuration, optimizing Zookeeper connections, and improving OpenStack deployment processes which can be complex. Performance testing showed Ceph throughput was higher for reads than writes and SSD journaling improved IOPS. The streamlined workflow provided by the software defined infrastructure could help reduce costs and management complexity for organizations.
Ceph Day Beijing - Optimizing Ceph Performance by Leveraging Intel Optane and... (Danielle Womboldt)
Optimizing Ceph performance by leveraging Intel Optane and 3D NAND TLC SSDs. The document discusses using Intel Optane SSDs as journal/metadata drives and Intel 3D NAND SSDs as data drives in Ceph clusters. It provides examples of configurations and analysis of a 2.8 million IOPS Ceph cluster using this approach. Tuning recommendations are also provided to optimize performance.
Ceph Ecosystem Update - Ceph Day Frankfurt (Feb 2014) (Patrick McGarry)
Ceph is expanding its community efforts in several areas. It has been accepted as a mentoring organization for Google Summer of Code 2014. It is working to increase academic outreach and collaboration with students. The Ceph wiki and contribution guidelines pages have been updated. Ceph Days user conferences will be held in multiple locations worldwide. Efforts continue to form a Ceph Foundation and work with various open source projects on integration. Global Ceph meetups are community-organized for local users. The next Ceph Developer Summit will be a virtual event in March 2014 to coordinate project work. Community metrics are now available on a public website.
Developing a Ceph Appliance for Secure Environments (Ceph Community)
Keeper Technology develops a Ceph appliance called keeperSAFE for secure storage environments. The keeperSAFE appliance provides a preconfigured Linux distribution, automated installation using Ansible, enclosure management tools, a graphical user interface for monitoring and configuration, data collection and analytics, encryption capabilities, and extensive testing. It is designed for environments that require high availability, no single points of failure, easy management, and auditability. The keeperSAFE appliance addresses the challenges of deploying and managing Ceph at scale in restricted, mission critical environments.
SUSE Enterprise Storage 3 (SES3) provides iSCSI access to Ceph storage over TCP/IP, allowing remote clients to consume Ceph storage through the iSCSI protocol. The iSCSI target driver in SES3 exposes RADOS block devices, so any iSCSI initiator can connect to SES3 over the network. SES3 also includes optimizations for iSCSI gateways, such as offloading operations to the object storage devices to reduce locking on gateway nodes.
Build a High-Performance and Highly Durable Block Storage Service Based on Ceph (Rongze Zhu)
This document discusses building a high-performance and highly durable block storage service using Ceph. It describes the architecture, including a minimum deployment of 12 OSD nodes and 3 monitor nodes, and outlines optimizations made to Ceph, QEMU, and the operating system configuration to achieve high performance: 6,000 IOPS and 170 MB/s throughput. It also discusses how the CRUSH map can be optimized to reduce recovery times and the number of copysets, improving durability to 99.99999999%.
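As a rough illustration of why shorter recovery times and fewer copysets drive the quoted ten-nines durability, the back-of-the-envelope model below multiplies the expected number of initial drive failures by the chance that the remaining replicas of one of the affected copysets also fail before recovery completes. The formula and every input value are simplifying assumptions for illustration, not the math or numbers from the presentation.

# Rough copyset-style durability estimate: annual probability that all replicas
# of some copyset are lost before recovery completes. All inputs are assumptions.
AFR = 0.04                 # annualized drive failure rate (4%)
DRIVES = 144               # drives in the cluster
REPLICAS = 3               # replica count
COPYSETS_PER_DRIVE = 20    # distinct copysets each drive participates in
RECOVERY_HOURS = 1.0       # time to re-replicate a failed drive's data

HOURS_PER_YEAR = 365 * 24
# Probability that a specific other drive also fails inside the recovery window.
p_fail_in_window = AFR * (RECOVERY_HOURS / HOURS_PER_YEAR)

# Expected initial failures per year, times the chance that the remaining
# REPLICAS-1 drives of one of the affected copysets fail before recovery.
p_loss_per_failure = COPYSETS_PER_DRIVE * p_fail_in_window ** (REPLICAS - 1)
annual_loss_probability = DRIVES * AFR * p_loss_per_failure

durability = 1 - annual_loss_probability
print(f"annual data-loss probability ~ {annual_loss_probability:.3e}")
print(f"durability ~ {durability:.10f}")
# Shrinking RECOVERY_HOURS or COPYSETS_PER_DRIVE adds nines, which is the
# direction the CRUSH-map optimizations in the summary push toward.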
At Percona Live in April 2016, Red Hat's Kyle Bader reviewed the general architecture of Ceph and then discussed the results of a series of benchmarks done on small to mid-size Ceph clusters, which led to the development of prescriptive guidance around tuning Ceph storage nodes (OSDs).
This document provides an overview and planning guidelines for a first Ceph cluster. It discusses Ceph's object, block, and file storage capabilities and how it integrates with OpenStack. Hardware sizing examples are given for a 1 petabyte storage cluster with 500 VMs requiring 100 IOPS each. Specific lessons learned are also outlined, such as realistic IOPS expectations from HDD and SSD backends, recommended CPU and RAM per OSD, and best practices around networking and deployment.
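The sizing example in that summary (1 PB of storage, 500 VMs needing 100 IOPS each) reduces to straightforward arithmetic; the sketch below shows one way to run it. The per-device IOPS, replication factor, and drives-per-node figures are placeholder assumptions, not numbers from the talk, and the point it illustrates is the same lesson the summary mentions: with HDD backends, IOPS rather than capacity usually sizes the cluster.

# Back-of-the-envelope Ceph cluster sizing for: 1 PB usable capacity,
# 500 VMs at 100 IOPS each. Per-device and per-node figures are assumptions.
USABLE_PB = 1.0
VMS = 500
IOPS_PER_VM = 100
REPLICATION = 3

HDD_TB = 8                 # capacity per OSD device
HDD_RANDOM_IOPS = 100      # realistic random IOPS per HDD-backed OSD
OSDS_PER_NODE = 12

raw_tb = USABLE_PB * 1000 * REPLICATION
osds_for_capacity = raw_tb / HDD_TB

client_iops = VMS * IOPS_PER_VM            # 50,000 front-end IOPS
backend_iops = client_iops * REPLICATION   # worst case: all writes hit REPLICATION OSDs
osds_for_iops = backend_iops / HDD_RANDOM_IOPS

osds_needed = max(osds_for_capacity, osds_for_iops)
nodes_needed = -(-osds_needed // OSDS_PER_NODE)    # ceiling division

print(f"OSDs for capacity: {osds_for_capacity:.0f}, "
      f"OSDs for IOPS: {osds_for_iops:.0f}, nodes: {nodes_needed:.0f}")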
This document summarizes a presentation about Ceph, an open-source distributed storage system. It discusses Ceph's introduction and components, benchmarks Ceph's block and object storage performance on Intel architecture, and describes optimizations like cache tiering and erasure coding. It also outlines Intel's product portfolio in supporting Ceph through optimized CPUs, flash storage, networking, server boards, software libraries, and contributions to the open source Ceph community.
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C... (Odinot Stanislas)
After a short introduction to distributed storage and a description of Ceph, Jian Zhang presents several interesting benchmarks: sequential tests, random tests, and above all a comparison of results before and after optimizations. The configuration parameters touched and the optimizations applied (large page numbers, omap data on a separate disk, ...) yield at least a 2x performance gain.
Yesterday's thinking may hold that NVMe (NVM Express) is still in transition to a production-ready solution. In this session, we will discuss how NVMe has evolved and is now ready for production, covering the history and evolution of NVMe and the Linux stack, and where NVMe stands today as the low-latency, highly reliable key-value store mechanism that will drive the future of cloud expansion. Examples of protocol efficiencies and the types of storage engines being optimized for NVMe will be discussed. Please join us for an exciting session on how in-memory computing and persistence have evolved.
Ceph was implemented at BIT to replace their existing storage solution due to performance issues, lack of scalability, and high maintenance costs. They deployed Ceph to provide redundant block, file, and object storage across three datacenters. The production cluster was built on an IPv6-only network with out-of-band management and extensive monitoring. Challenges during development included getting Ansible to bootstrap the cluster correctly and dealing with namespace issues when running Ceph commands.
This document discusses performance improvements to the Lustre parallel file system in versions 2.5 through large I/O patches, metadata improvements, and metadata scaling with distributed namespace (DNE). It summarizes evaluations showing improved throughput from 4MB RPC, reduced degradation with large numbers of threads using SSDs over NL-SAS, high random read performance from SSD pools, and significant metadata performance gains in Lustre 2.4 from DNE allowing nearly linear scaling. Key requirements for next-generation storage include extreme IOPS, tiered architectures using local flash with parallel file systems, and reducing infrastructure needs while maintaining throughput.
Open Source Software on OpenPOWER systems.
With 100% open source system software (including the firmware), OpenPOWER is the most open server architecture in the market. Based on the IBM POWER8 chip, this new family of servers featuring the latest Nvidia NVLink technology runs all the software solutions presented at OPEN'16 with significant cost advantages. This session explains how Docker, EnterpriseDB and many others benefit from this advanced design, and how 200+ technology companies including Google and RackSpace are collaborating in an open development alliance to build the datacenter of the future.
SQL Server 2016 It Just Runs Faster - SQLBits 2017 Edition (Bob Ward)
SQL Server 2016 includes several performance improvements that help it run faster than previous versions:
1. Automatic Soft NUMA partitions workloads across NUMA nodes when there are more than 8 CPUs per node to avoid bottlenecks.
2. Dynamic memory objects are now partitioned by CPU to avoid contention on global memory objects.
3. Redo operations can now be parallelized across multiple tasks to improve performance during database recovery.
Ceph Day Beijing - Ceph all-flash array design based on NUMA architecture (Ceph Community)
This document discusses an all-flash Ceph array design from QCT based on NUMA architecture. It provides an agenda that covers all-flash Ceph and use cases, QCT's all-flash Ceph solution for IOPS, an overview of QCT's lab environment and detailed architecture, and the importance of NUMA. It also includes sections on why all-flash storage is used, different all-flash Ceph use cases, QCT's IOPS-optimized all-flash Ceph solution, benefits of using NVMe storage, and techniques for configuring and optimizing all-flash Ceph performance.
Red Hat Storage Day New York - New Reference Architectures (Red_Hat_Storage)
The document provides an overview and summary of Red Hat's reference architecture work including MySQL and Hadoop, software-defined NAS, and digital media repositories. It discusses trends toward disaggregating Hadoop compute and storage and various data flow options. It also summarizes performance testing Red Hat conducted comparing AWS EBS and Ceph for MySQL workloads, and analyzing factors like IOPS/GB ratios, core-to-flash ratios, and pricing. Server categories and vendor examples are defined. Comparisons of throughput and costs at scale between software-defined scale-out storage and traditional enterprise NAS solutions are also presented.
Accelerating HBase with NVMe and Bucket Cache (David Grier)
This set of slides describes some initial experiments we designed to discover performance improvements in Hadoop technologies using NVMe technology.
Accelerating EDA workloads on Azure – Best Practice and benchmark on Intel EM... (Meng-Ru (Raymond) Tsai)
Benchmark EDA workloads on Azure Intel Emerald Rapids (EMR) VMs.
The article evaluates the performance of the latest Azure VMs using the 5th Gen Intel® Xeon® Platinum 8537C (Emerald Rapids) processor by comparing them to the previous Ice Lake generation. Using two EDA tools, Cadence Spectre-X and Synopsys VCS, the benchmarks involve real-world scenarios including single-threaded, multi-threaded, and multiple jobs running on one node.
Results show that Spectre-X performs 12 to 18% better on D64ds v6 instances and 22 to 29% better on FX64v2 instances compared to D64ds v5 instances. The D64ds v6 instances were found to be more cost-effective, while FX64mds v2 instances achieved the shortest total runtime. For Synopsys VCS, the benchmarks revealed a speedup of 17 to 43% for Emerald Rapids instances over Ice Lake instances across various parallel simulations. The findings offer EDA customers options on which Azure EMR instances to select based on the cost-efficiency analysis.
What we unlearned and learned by moving from M9000 to SSC - UKOUG 2014 (Philippe Fierens)
The document discusses moving databases from 3 Oracle M9000 servers to a new Oracle SPARC SuperCluster (SSC) system. It describes the key phases of the project including lifting and shifting the databases from the M9000s to application domains on the SSC, making use of the SSC's integrated storage cells, upgrading databases from Oracle 9i and 10g to 11g, and consolidating databases. It also covers issues encountered such as performance problems after the initial migration and regressions encountered during patching cycles. The document provides details on configuring features on the SSC like RAC One Node, Data Guard, and database resource management.
Vijayendra Shamanna from SanDisk presented on optimizing the Ceph distributed storage system for all-flash architectures. Some key points:
1) Ceph is an open-source distributed storage system that provides file, block, and object storage interfaces. It operates by spreading data across multiple commodity servers and disks for high performance and reliability.
2) SanDisk has optimized various aspects of Ceph's software architecture and components like the messenger layer, OSD request processing, and filestore to improve performance on all-flash systems.
3) Testing showed the optimized Ceph configuration delivering over 200,000 IOPS and low latency with random 8K reads on an all-flash setup.
Revisiting CephFS MDS and mClock QoS Scheduler (Yongseok Oh)
This presentation covers CephFS performance scalability and evaluation results. Specifically, it addresses technical issues such as multi-core scalability, cache size, static pinning, recovery, and QoS.
Accelerating HBase with NVMe and Bucket Cache (Nicolas Poggi)
The Non-Volatile Memory Express (NVMe) standard promises an order of magnitude faster storage than regular SSDs, while at the same time being more economical than regular RAM on a TB/$ basis. This talk evaluates the use cases and benefits of NVMe drives for use in Big Data clusters with HBase and Hadoop HDFS.
First, we benchmark the different drives using system level tools (FIO) to get maximum expected values for each different device type and set expectations. Second, we explore the different options and use cases of HBase storage and benchmark the different setups. And finally, we evaluate the speedups obtained by the NVMe technology for the different Big Data use cases from the YCSB benchmark.
In summary, while the NVMe drives show up to 8x speedup in best case scenarios, testing the cost-efficiency of new device technologies is not straightforward in Big Data, where we need to overcome system level caching to measure the maximum benefits.
This document summarizes case studies of using VyOS in Kauli's supply-side platform (SSP). It discusses:
1) How VyOS is used as the default gateway router for all of Kauli's servers, handling up to 400 million ads per day. Peak traffic graphs show the load on the VyOS router.
2) Techniques for scaling the VyOS implementation, such as using OSPF/ECMP routing after adding more VyOS routers.
3) Tuning tips for optimizing VyOS performance, including settings for NUMA, NAPI, buffer sizes, CPU affinity, and conntrack tables.
4) How microburst traffic patterns can cause
Based on the popular blog series, join me in taking a deep dive and a behind the scenes look at how SQL Server 2016 “It Just Runs Faster”, focused on scalability and performance enhancements. This talk will discuss the improvements, not only for awareness, but expose design and internal change details. The beauty behind ‘It Just Runs Faster’ is your ability to just upgrade, in place, and take advantage without lengthy and costly application or infrastructure changes. If you are looking at why SQL Server 2016 makes sense for your business you won’t want to miss this session.
The document summarizes the results of a study that evaluated the performance of different Platform-as-a-Service offerings for running SQL on Hadoop workloads. The study tested Amazon EMR, Google Cloud DataProc, Microsoft Azure HDInsight, and Rackspace Cloud Big Data using the TPC-H benchmark at various data sizes up to 1 terabyte. It found that at 1TB, lower-end systems had poorer performance. In general, HDInsight running on D4 instances and Rackspace Cloud Big Data on dedicated hardware had the best scalability and execution times. The study provides insights into the performance, scalability, and price-performance of running SQL on Hadoop in the cloud.
The state of SQL-on-Hadoop in the Cloud (Nicolas Poggi)
With the increase in Hadoop offerings in the cloud, users face many decisions: which cloud provider, which VMs, cluster sizing, storage type, or even whether to go with fully managed Platform-as-a-Service (PaaS) Hadoop. Since the answer is always "it depends on your data and usage", this talk gives participants an overview of the different PaaS solutions from the leading cloud providers, highlighting the main results of benchmarking their SQL-on-Hadoop (i.e., Hive) services with the ALOJA benchmarking project. It compares their current offerings in terms of readiness, architectural differences, and cost-effectiveness (performance-to-price) against entry-level Hadoop deployments, and briefly presents how to replicate the results and create custom benchmarks from internal apps, so that users can make their own decisions about the right provider for their particular data needs.
1. CEPH PERFORMANCE
Profiling and Reporting
Brent Compton, Director Storage Solution Architectures
Kyle Bader, Sr Storage Architect
Veda Shankar, Sr Storage Architect
2. FAQ FROM THE COMMUNITY
Questions that continually surface:
HOW WELL CAN CEPH PERFORM?
WHICH OF MY WORKLOADS CAN IT HANDLE?
HOW WILL CEPH PERFORM ON MY SERVERS?
3. HOW WELL CAN CEPH PERFORM?
Finding the right server and network config for the job.
(Figure: perceived range of Ceph performance vs. actual, measured range of Ceph performance.)
8. MAPPING CONFIGS TO WORKLOAD IO CATEGORIES
Cluster size tiers: OpenStack Starter (64 TB), S (256 TB+), M (1 PB+), L (2 PB+)
Workload-optimized node types:
• MySQL Perf Node: IOPS optimized
• Digital Media Perf Node: throughput optimized
• Archive Node: cost/capacity optimized
9. DIGITAL MEDIA PERF NODES
Some pertinent measures:
• MBps
• $/MBps
• MBps/provisioned-TB
• Watts/MBps
• MTTR (self-heal from server failure)
(Chart: range of MBps measured with Ceph on different server configs; 4M read and 4M write MBps per drive, HDD sample vs. SSD sample, scale 0-500 MBps.)
19. MYSQL PERF NODES
Some pertinent measures:
• MySQL Sysbench requests/sec
• IOPS (4K, 16K random)
• $/IOP
• IOPS/provisioned-GB
• Watts/IOP
(Chart: range of IOPS measured with Ceph on different server configs; 4K read and 4K write IOPS per drive, HDD sample vs. SSD sample, scale 0-60,000 IOPS.)
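The figures of merit on the two slides above ($/MBps, MBps/provisioned-TB, Watts/MBps, $/IOP, IOPS/provisioned-GB, Watts/IOP) are simple ratios of a node's measured aggregate performance to its price, capacity, and power draw. A minimal sketch of that arithmetic follows; every input value (drive counts, prices, wattage, per-drive results) is an illustrative assumption, not a number from this presentation.

# Back-of-the-envelope figures of merit for a Ceph storage node.
# All inputs are illustrative assumptions, not measured values from this deck.
def node_metrics(drives, tb_per_drive, cost_usd, watts,
                 mbps_per_drive, iops_per_drive):
    provisioned_tb = drives * tb_per_drive
    node_mbps = drives * mbps_per_drive      # aggregate sequential throughput
    node_iops = drives * iops_per_drive      # aggregate random IOPS
    return {
        "$/MBps": cost_usd / node_mbps,
        "MBps/provisioned-TB": node_mbps / provisioned_tb,
        "Watts/MBps": watts / node_mbps,
        "$/IOP": cost_usd / node_iops,
        "IOPS/provisioned-GB": node_iops / (provisioned_tb * 1000),
        "Watts/IOP": watts / node_iops,
    }

# Hypothetical throughput-optimized HDD node vs. IOPS-optimized SSD node.
hdd_node = node_metrics(drives=12, tb_per_drive=8, cost_usd=9000, watts=450,
                        mbps_per_drive=100, iops_per_drive=150)
ssd_node = node_metrics(drives=8, tb_per_drive=2, cost_usd=14000, watts=350,
                        mbps_per_drive=400, iops_per_drive=20000)

for name, metrics in (("HDD node", hdd_node), ("SSD node", ssd_node)):
    print(name, {k: round(v, 4) for k, v in metrics.items()})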
23. METHODOLOGY: BASELINING
Collect baseline measures:
1. Determine the benchmark measures most representative of the business need
2. Determine the cluster access method (block, object, file)
3. Collect baseline measures:
   a. Look up manufacturer drive specifications (IOPS, MBps, latency)
   b. Single-node IO baseline (max IOPS and MBps to all drives concurrently)
   c. Network baseline (consistent bandwidth across the full route mesh)
   d. RADOS baseline (max sequential throughput per drive)
   e. RBD baseline (max IOPS per drive)
   f. Sysbench baseline (max DB requests/sec per drive)
   g. RGW baseline (max object ops/sec per drive)
4. Calculate drive efficiency at each level up the stack (a sketch of this calculation follows the list)
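To make step 4 concrete, the sketch below divides the per-drive result measured at each layer of the stack by the manufacturer's spec-sheet figure to express an efficiency per layer. The layer names mirror the baselines above, but the measured values and the spec figure are placeholders, not measurements from this deck.

# Drive efficiency at each level up the Ceph stack: measured per-drive result
# divided by the raw drive capability from the manufacturer's spec sheet.
# All numbers below are placeholders for illustration.
SPEC_4K_RANDOM_READ_IOPS = 90000     # spec-sheet value for the SSD model in use

measured_iops_per_drive = {
    "single-node IO baseline (raw device)": 85000,
    "RADOS baseline":                       42000,
    "RBD baseline":                         35000,
    "Sysbench baseline":                    28000,
}

print(f"{'layer':38s} {'IOPS/drive':>12s} {'efficiency':>11s}")
for layer, iops in measured_iops_per_drive.items():
    efficiency = iops / SPEC_4K_RANDOM_READ_IOPS
    print(f"{layer:38s} {iops:12,d} {efficiency:10.1%}")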
24. METHODOLOGY: WATERMARKS
Towards deterministic performance:
1. Identify IOPS/GB at 35% and 70% cluster utilization (with the corresponding number of MySQL instances)
2. Identify MBps/TB at 35% and 70% cluster utilization
3. Determine the target IOPS/GB or MBps/TB at the target cluster utilization (see the sketch below)
4. (experimental) Set block device IO throttles to cap consumption by any single client
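Steps 1 and 3 are ratios of measured cluster capability to the capacity actually in use at a given watermark. The sketch below works through the IOPS/GB case under assumed cluster-level inputs (aggregate IOPS, usable capacity, per-database size); step 4, per-client block device throttling, is marked experimental on the slide and is not modeled here.

# Watermark arithmetic: IOPS per stored GB and per MySQL instance at a given
# cluster fill level. All cluster-level inputs are illustrative assumptions.
CLUSTER_RANDOM_IOPS = 1_200_000   # measured aggregate 4K random read IOPS
CLUSTER_USABLE_TB   = 500         # usable (provisioned) capacity
MYSQL_DB_SIZE_GB    = 100         # assumed dataset size per MySQL instance

def watermark(utilization):
    stored_gb = CLUSTER_USABLE_TB * 1000 * utilization
    instances = stored_gb / MYSQL_DB_SIZE_GB
    return {
        "IOPS/GB": CLUSTER_RANDOM_IOPS / stored_gb,
        "MySQL instances": int(instances),
        "IOPS/instance": CLUSTER_RANDOM_IOPS / instances,
    }

for util in (0.35, 0.70):
    w = watermark(util)
    print(f"{util:.0%} full: {w['IOPS/GB']:.2f} IOPS/GB, "
          f"{w['MySQL instances']} instances, "
          f"{w['IOPS/instance']:.0f} IOPS each")
# The deliverable IOPS per stored GB halves as the cluster fills from 35% to
# 70%, which is why the slide fixes targets at explicit utilization watermarks.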