2022 Streaming Summit Netflix
● FreeBSD-current
● NGINX web server
● Video served via asynchronous sendfile(2) and encrypted using kTLS
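As a rough illustration of the sendfile idea only, the sketch below pushes a file into a socket using Python's standard `socket.sendfile()`, which delegates to the operating system's sendfile call where one is available. It is a blocking call, not FreeBSD's asynchronous implementation, it is not NGINX code, and it does no kTLS encryption. The helper name `serve_file` and the demo payload are hypothetical.

```python
import os
import socket
import tempfile


def serve_file(sock: socket.socket, path: str) -> int:
    """Send the whole file at `path` over `sock`; return the bytes sent."""
    with open(path, "rb") as f:
        # socket.sendfile() uses os.sendfile() where the platform supports it,
        # so the file data is not copied through a userspace buffer.
        return sock.sendfile(f)


if __name__ == "__main__":
    payload = os.urandom(4096)  # hypothetical stand-in for a chunk of video
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    server, client = socket.socketpair()  # local stand-in for a TCP connection
    try:
        sent = serve_file(server, tmp.name)
        server.close()  # signal end of stream to the reader
        received = b""
        while chunk := client.recv(65536):
            received += chunk
        print(sent, received == payload)
    finally:
        client.close()
        os.unlink(tmp.name)
```

The point of the sketch is that the application only names the file and the socket; the kernel moves the bytes.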
Asynchronous sendfile
[Figure: data flows from Disks to the Socket Buffer to the Internet]
Asynchronous Sendfile Performance
● Intel Xeon E5-2697v2
○ 12 cores @ 2.7GHz
○ 256GB DDR3-800
○ Chelsio T580 40GbE
● 23Gb/s -> 36Gb/s
Asynchronous sendfile + kTLS
[Figure: data flows from Disks to the Socket Buffer to the Internet]
Netflix 800Gb/s Video Serving Data Flow
● Using sendfile and software kTLS, data is encrypted by the host CPU.
● 800Gb/s == 100GB/s
● 100GB/s is needed to serve 800Gb/s
[Figure: Bulk Data and Metadata paths, with 100GB/s labels on the bulk data path]
NON-UNIFORM
● Each core has unequal access to memory
● Each core has unequal access to I/O devices
[Figure: two sets of Memory, CPU, and Network Card]
Present day NUMA:
● Each group of CPU and Memory (Node 0, Node 1) is called a “NUMA Domain” or “NUMA Node”
[Figure: Node 0 and Node 1, each with its own Memory and CPU]
[Figure: data flow crossing between CPUs on both NUMA nodes]
● 4 NUMA crossings
● 400GB/s of data on the NUMA fabric
○ Fabric saturates and cannot handle the load
○ CPUs stall and saturate early
Dual AMD: Best Case Data Flow
[Figure: data flow stays within each NUMA node]
● 0 NUMA crossings
● 0GB/s of data on the NUMA fabric
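The fabric figures on these slides follow from simple arithmetic, sketched below. The sketch assumes that every NUMA crossing carries the full bulk-data stream, which is how 4 crossings at 100GB/s come to 400GB/s. The function names are hypothetical.

```python
def gbits_to_gbytes(gbits_per_s: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbits_per_s / 8


def numa_fabric_load(served_gbits_per_s: float, crossings: int) -> float:
    """GB/s placed on the NUMA fabric.

    Assumption: each crossing moves the entire served stream across the
    fabric once.
    """
    return gbits_to_gbytes(served_gbits_per_s) * crossings


if __name__ == "__main__":
    print(gbits_to_gbytes(800))       # 100.0 (800Gb/s == 100GB/s)
    print(numa_fabric_load(800, 4))   # 400.0 (4 NUMA crossings)
    print(numa_fabric_load(800, 0))   # 0.0   (0 NUMA crossings)
```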
Impose order on the chaos... somehow:
● Disk centric siloing
○ Try to do everything on the NUMA node where the content is stored
● Network centric siloing
○ Try to do as much as we can on the NUMA node that the LACP partner chose for us
Dual AMD: Worst Case Data Flow
With Network Centric NUMA Siloing
[Figure: data flow between CPUs with network centric NUMA siloing]
[Figure: a TLS record in Host Memory crosses the PCIe Bus to the NIC and leaves on the 100GbE Network as TCP segments of the encrypted TLS record, labeled 0, 1448, 2896, ..., 15928]
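The segment labels in the figure (0, 1448, ..., 15928) are consistent with one TLS record being cut into 1448-byte TCP payloads. The sketch below reproduces those offsets. The record length is an assumption, not a value from the slides: 16 KiB of plaintext plus 29 bytes of TLS 1.2 AES-GCM framing (5-byte header, 8-byte explicit nonce, 16-byte tag). The name `segment_offsets` is hypothetical.

```python
MSS = 1448  # TCP payload bytes per segment, taken from the spacing of the labels

# Assumed record size: 16 KiB plaintext + 5-byte header + 8-byte nonce + 16-byte tag.
ASSUMED_RECORD_LEN = 16384 + 5 + 8 + 16


def segment_offsets(record_len: int, mss: int = MSS) -> list[int]:
    """Byte offset within the TLS record at which each TCP segment starts."""
    return list(range(0, record_len, mss))


if __name__ == "__main__":
    offsets = segment_offsets(ASSUMED_RECORD_LEN)
    print(len(offsets), offsets[-1])  # 12 15928
```

Any record length from 15929 to 17376 bytes yields the same 12 offsets, so the exact framing overhead does not change the picture.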