JEREMY EDER - RED HAT PERFORMANCE ENGINEERING
Enabling GPU-as-a-Service Providers with Red Hat OpenShift
@jeremyeder
Senior Principal Software Engineer, Red Hat
March, 2018
Agenda
● OpenShift Cluster Overview
● Infrastructure Abstraction
● High Performance Features
● GPU Overview
Community Powered Innovation
What does an OpenShift Cluster look like?
[Diagram]
● MASTER (Red Hat Enterprise Linux): API/AUTHENTICATION, DATA STORE, SCHEDULER, HEALTH/SCALING
● NODES (RHEL): six nodes, each running one or more containers (C)
● SERVICE LAYER, ROUTING LAYER, PERSISTENT STORAGE, REGISTRY
● Runs on: PHYSICAL, VIRTUAL, PRIVATE, PUBLIC, HYBRID
Abstract away any infrastructure
[Diagram: SERVICE LAYER and ROUTING LAYER on top of PHYSICAL, VIRTUAL, PRIVATE, PUBLIC, or HYBRID infrastructure]
● Bare Metal
● RHV
● OpenStack
● VMware
● GCE
● Azure
● AWS
● BYO nodes...
One Platform to...
OpenShift is the single platform
to run any application:
● Old or new
● Monolithic/Microservice
[Diagram: Big Data, NFV, FSI, Animation, ISVs, HPC, Machine Learning]
High Performance RFEs by Vertical
Feature                                    FSI    NFV    ISV    BD/ML  ANIM   HPC
NUMA (cpuset.cpus and cpuset.mems)         Yes    Yes    Yes    Maybe  Maybe  Yes
Device Passthrough (NIC/Disk/GPU etc...)   Yes    Yes    Yes    Maybe  Maybe  Yes
sysctl Support (non-namespaced too)        Yes    Yes    Yes    Yes    Yes    Yes
Separation of control- and data-plane      Yes    Yes    Yes    Yes    Yes    Yes
Node “fitness” (extended health info)      Yes    Yes    Maybe  Maybe  Maybe  Yes
Multi-homed pods                           Yes    Yes    Maybe  Yes    Yes    Yes
Kernel Modules (DKMS-ish)                  Yes    Yes    Maybe  Maybe  Yes    Maybe
Hugepages                                  Yes    Yes    Yes    Yes    Maybe  Maybe
Why do this?
Enable containerization of Infrastructure Software
● Software-defined Storage and Networking
● Packet switching and routing tiers
● Multi-workloads (very different) within a single cluster
○ Layered schedulers (HPC/grid)
● Many more...
Enable containerization of Red Hat’s products
● Gluster/Container Native Storage
● Ceph
● OpenStack
● radanalytics
● KubeVirt
Upstream First: Kubernetes Working Groups
● Resource Management Working Group
○ Features Delivered
■ Device Plugins (GPU/Bypass/FPGA)
■ CPU Manager (exclusive cores)
■ Huge Pages Support
○ Extensive Roadmap
● Intel, IBM, Google, NVIDIA, Red Hat, many more...
Upstream First: Kubernetes Working Groups
● Network Plumbing Working Group
○ Formalized Dec 2017
● Goal: implement a pseudo-standard collection of CRDs for multiple networks, owned by sig-network and kept *out of tree*
● Separate control- and data-plane, Overlapping IPs, Fast Data-plane
● IBM, Intel, Red Hat, Huawei, Cisco, Tigera...at least.
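As a rough illustration of where "CRDs for multiple networks" ended up: the working group's later de-facto standard defines a NetworkAttachmentDefinition custom resource plus a pod annotation listing the extra networks. The sketch below builds both as plain Python dicts. The network name `fast-dataplane`, the macvlan settings, the host interface `eth1`, and the image are made-up placeholders, and the specification postdates this talk, so read the field names as the later standard rather than what existed in March 2018.

```python
import json


def network_attachment(name, cni_config):
    """Build a NetworkAttachmentDefinition; spec.config carries CNI JSON as a string."""
    return {
        "apiVersion": "k8s.cni.cncf.io/v1",
        "kind": "NetworkAttachmentDefinition",
        "metadata": {"name": name},
        "spec": {"config": json.dumps(cni_config)},
    }


def multi_homed_pod(name, image, networks):
    """Build a pod that asks for extra networks through the standard annotation."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "annotations": {"k8s.v1.cni.cncf.io/networks": ",".join(networks)},
        },
        "spec": {"containers": [{"name": name, "image": image}]},
    }


# Hypothetical secondary network for a fast data-plane (all values are placeholders).
fast_net = network_attachment("fast-dataplane", {
    "cniVersion": "0.3.1",
    "type": "macvlan",
    "master": "eth1",
    "ipam": {"type": "host-local", "subnet": "192.168.1.0/24"},
})
pod = multi_homed_pod("packet-router", "example.invalid/router:latest", ["fast-dataplane"])
print(json.dumps([fast_net, pod], indent=2))
```

The pod keeps its default cluster network for control traffic and gains a second interface for data traffic, which is one way to read "separate control- and data-plane".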
GPU CLUSTER TOPOLOGY
OpenShift Cluster Topology
● Control Plane: 3 x master and etcd
● Infrastructure: 3 x registry and router
● LB
● Compute Nodes and Storage Tier
Compute Nodes...
● How to enable software to take advantage of “special”
hardware
● Create Node Pools
○ Mark them as “special”
○ Taints/Tolerations
○ ExtendedResourceToleration
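A minimal sketch of how these pieces fit together, assuming the node pool is tainted with the extended resource name itself: ExtendedResourceToleration is an admission plugin that adds a matching toleration to any pod requesting that resource, so only GPU workloads land on GPU nodes. The Python below imitates that behaviour in simplified form on plain dicts; the helper names and the image are hypothetical, and the real plugin runs inside the API server.

```python
GPU_RESOURCE = "nvidia.com/gpu"  # extended resource name used elsewhere in this deck


def node_taint(resource=GPU_RESOURCE):
    """Taint that keeps ordinary pods off a 'special' node pool."""
    return {"key": resource, "effect": "NoSchedule"}


def is_extended_resource(name):
    """Simplified check: domain-prefixed and outside the kubernetes.io namespace."""
    if "/" not in name:
        return False
    domain = name.split("/", 1)[0]
    return not (domain == "kubernetes.io" or domain.endswith(".kubernetes.io"))


def add_extended_resource_tolerations(pod):
    """Add one toleration per extended resource the pod's containers ask for."""
    tolerations = pod["spec"].setdefault("tolerations", [])
    seen = {t.get("key") for t in tolerations}
    for container in pod["spec"]["containers"]:
        resources = container.get("resources", {})
        names = set(resources.get("requests", {})) | set(resources.get("limits", {}))
        for name in sorted(names):
            if is_extended_resource(name) and name not in seen:
                tolerations.append(
                    {"key": name, "operator": "Exists", "effect": "NoSchedule"}
                )
                seen.add(name)
    return pod


example_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "gpu-job"},
    "spec": {"containers": [{
        "name": "gpu-job",
        "image": "example.invalid/cuda-job:latest",  # placeholder image
        "resources": {"limits": {"cpu": "2", GPU_RESOURCE: 1}},
    }]},
}
add_extended_resource_tolerations(example_pod)
```

Pods that do not request the GPU resource get no toleration and are repelled by the taint, which is what marks the pool as "special".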
Compute Nodes...
● How to enable software to take advantage of “special”
hardware
● Tune/Configure the OS
○ Tuned Profiles
○ CPU Isolation
○ sysctls
In OpenShift, there are three “types” of sysctls
Safe
● Enabled by default
● kernel.shm_rmid_forced
● net.ipv4.ip_local_port_range
● net.ipv4.tcp_syncookies
Unsafe
● Experimental Kubelet Flag
● kernel.sem*
● kernel.shm*
● kernel.msg*
● fs.mqueue.*
● net.*
Node-level
● Can’t set from a pod
● Potentially affects other pods
● Many interesting sysctls
● Use TuneD
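The three categories can be sketched as a small classifier that follows the lists on this slide literally. It also builds the `securityContext.sysctls` stanza that current Kubernetes releases accept on a pod; clusters from the era of this talk used alpha annotations instead, so the field layout is an assumption about a newer cluster. The function names are hypothetical.

```python
from fnmatch import fnmatchcase

# Lists copied from the slide.
SAFE = {
    "kernel.shm_rmid_forced",
    "net.ipv4.ip_local_port_range",
    "net.ipv4.tcp_syncookies",
}
UNSAFE_PATTERNS = ["kernel.sem*", "kernel.shm*", "kernel.msg*", "fs.mqueue.*", "net.*"]


def classify_sysctl(name):
    """Return 'safe', 'unsafe', or 'node-level' for a sysctl name."""
    if name in SAFE:
        return "safe"
    if any(fnmatchcase(name, pattern) for pattern in UNSAFE_PATTERNS):
        return "unsafe"
    return "node-level"


def pod_security_context(sysctls):
    """Build a pod securityContext, refusing sysctls that a pod cannot set."""
    entries = []
    for name, value in sysctls.items():
        if classify_sysctl(name) == "node-level":
            raise ValueError(f"{name} is node-level; set it on the host, e.g. with TuneD")
        entries.append({"name": name, "value": str(value)})
    return {"sysctls": entries}


print(classify_sysctl("kernel.shmmax"))
print(pod_security_context({"net.ipv4.tcp_syncookies": 1}))
```

Unsafe sysctls additionally have to be allowed on the node through the kubelet flag the slide mentions, so a pod that requests one will not start on a node where it is not allowed.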
Compute Nodes...
● How to enable software to take advantage of “special”
hardware
● Optimize your workload
○ Dedicate CPU cores
○ Consume hugepages
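A hedged sketch of what "dedicate CPU cores" and "consume hugepages" can look like in a pod spec, assuming the kubelet runs the CPU Manager with its static policy and the node has pre-allocated 2 MiB huge pages. Exclusive cores go to containers that request a whole number of CPUs with requests equal to limits, and huge pages are requested through the `hugepages-2Mi` resource. The pod name, image, and sizes are placeholders.

```python
def pinned_hugepage_pod(name, image, cores, hugepages_2mi, memory):
    """Build a pod with whole-CPU requests == limits and a hugepages-backed volume."""
    if not isinstance(cores, int) or cores < 1:
        raise ValueError("exclusive cores require a whole number of CPUs")
    resources = {"cpu": str(cores), "memory": memory, "hugepages-2Mi": hugepages_2mi}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                # Identical requests and limits keep the container eligible for pinning.
                "resources": {"requests": dict(resources), "limits": dict(resources)},
                "volumeMounts": [{"name": "hugepages", "mountPath": "/dev/hugepages"}],
            }],
            "volumes": [{"name": "hugepages", "emptyDir": {"medium": "HugePages"}}],
        },
    }


tuned_pod = pinned_hugepage_pod(
    "latency-sensitive", "example.invalid/workload:latest",
    cores=4, hugepages_2mi="1Gi", memory="2Gi",
)
```

A fractional CPU request such as 1.5 would leave the container in the shared pool, which is why the helper rejects it.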
Compute Nodes...
● How to enable software to take advantage of “special”
hardware
● Enable the Hardware
○ Install drivers
○ Deploy Device Plugin
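To make "Deploy Device Plugin" concrete, here is a sketch of the DaemonSet shape such a plugin typically takes, written as a Python dict. It assumes drivers are already installed on the host and that the plugin registers with the kubelet through a socket under `/var/lib/kubelet/device-plugins`. The object name, namespace, node label, and image are placeholders rather than the values used in the talk, and `apps/v1` assumes a cluster newer than the Kubernetes 1.8 setup described later in the deck.

```python
DEVICE_PLUGIN_DIR = "/var/lib/kubelet/device-plugins"  # kubelet's plugin socket directory


def device_plugin_daemonset(image, node_selector):
    """Build a DaemonSet that runs one device-plugin pod on every selected node."""
    labels = {"name": "gpu-device-plugin"}  # hypothetical label
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "gpu-device-plugin", "namespace": "kube-system"},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Only run where the special hardware is present.
                    "nodeSelector": dict(node_selector),
                    "containers": [{
                        "name": "device-plugin",
                        "image": image,
                        "volumeMounts": [
                            {"name": "device-plugin", "mountPath": DEVICE_PLUGIN_DIR}
                        ],
                    }],
                    "volumes": [
                        {"name": "device-plugin", "hostPath": {"path": DEVICE_PLUGIN_DIR}}
                    ],
                },
            },
        },
    }


daemonset = device_plugin_daemonset(
    "example.invalid/gpu-device-plugin:latest",  # placeholder image
    {"accelerator": "gpu"},                      # placeholder node label
)
```

Once the plugin has registered, the node advertises the device as an extended resource that pods can request by name.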
Compute Nodes...
● How to enable software to take advantage of “special”
hardware
● Consume the Device
○ KubeFlow Template deployment
Kubernetes Deployment for STAC-A2
● All-in-One Kubernetes Installation (hack/local-up-cluster.sh)
● Node labeled
● Containers:
○ RHEL7+CUDA9
○ RHEL7+CUDA9+DEVICE-PLUGIN
○ RHEL7+CUDA9+STAC-A2
● CUDA 9
● 8 x NVIDIA Tesla V100 (Volta) GPUs
● HPE Apollo 6500 w/XL270d Gen9
● Red Hat Enterprise Linux 7.4
● Kubernetes 1.8 (setup info)
● nvidia-smi --applications-clocks=877,1380
● https://rhelblog.redhat.com/2017/11/21/red-hat-and-partners-deliver-new-performance-records-on-prominent-risk-analytics-benchmark/
● https://news.developer.nvidia.com/a-new-stac-a2-record/
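The clock setting in the list above can be scripted as part of node preparation. This is a small sketch, assuming `nvidia-smi` is on the PATH and that the pair is given as memory clock then graphics clock in MHz, which is the order `nvidia-smi` documents for this option. The function names are hypothetical, and changing clocks normally needs administrative privileges.

```python
import shutil
import subprocess


def application_clocks_cmd(mem_mhz, graphics_mhz):
    """Build the nvidia-smi argument list for fixed application clocks."""
    return ["nvidia-smi", f"--applications-clocks={mem_mhz},{graphics_mhz}"]


def set_application_clocks(mem_mhz=877, graphics_mhz=1380):
    """Run the command if nvidia-smi is installed; return None when it is not."""
    cmd = application_clocks_cmd(mem_mhz, graphics_mhz)
    if shutil.which(cmd[0]) is None:
        return None  # no NVIDIA tools on this host, nothing to do
    return subprocess.run(cmd, check=False, capture_output=True, text=True)


print(" ".join(application_clocks_cmd(877, 1380)))
```

Pinning application clocks keeps GPU frequency constant across runs, which matters when benchmark results need to be repeatable.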
Kubernetes Deployment for STAC-A2
[Diagram: kubectl create, Kube Scheduler, Kubelet, Device Plugin (daemonset), 8 x Volta GPU]
Benchmark (pod)
resources:
  limits:
    nvidia.com/gpu: 8
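For context, the `resources` fragment on this slide sits inside a container entry of a pod spec. The sketch below builds a complete manifest around it and prints JSON, which `kubectl create -f -` accepts on standard input. Only the `nvidia.com/gpu: 8` limit comes from the slide; the pod name, image, and node label are placeholders.

```python
import json


def gpu_benchmark_pod(name, image, gpus=8, node_selector=None):
    """Build a pod that asks the scheduler for whole GPUs via an extended resource."""
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": name,
                "image": image,
                # Extended resources are requested in whole units; a limit alone is enough.
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
        },
    }
    if node_selector:
        pod["spec"]["nodeSelector"] = dict(node_selector)
    return pod


manifest = gpu_benchmark_pod(
    "benchmark",
    "example.invalid/rhel7-cuda9-benchmark:latest",  # placeholder image
    node_selector={"accelerator": "volta"},          # placeholder node label
)
print(json.dumps(manifest, indent=2))
```

The scheduler only places the pod on a node whose device plugin has advertised at least eight free GPUs, and the kubelet then hands those devices to the container.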
Recent GPU-related work on OpenShift
● Early KubeFlow involvement
● radanalytics templates for ML-workflow on OpenShift
● Machine-Learning OpenShift Commons
● Demo Repositories
○ https://github.com/zvonkok/nvidia-k8s
○ https://github.com/redhat-performance/openshift-psap
THANK YOU
plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHatNews
Commoditizing GPU-as-a-Service Providers with Red Hat OpenShift
Tuesday, Mar 27, 1:00 PM - 1:25 PM, Room 210E
Red Hat OpenShift Container Platform, with Kubernetes at its core, can play an
important role in building flexible hybrid cloud infrastructure. By abstracting
infrastructure away from developers, workloads become portable across any
cloud. With NVIDIA Volta GPUs now available in every public cloud [1], as well as
from every computer maker, an abstraction library like OpenShift becomes even
more valuable. Through demonstrations, this session will introduce you to
declarative models for consuming GPUs via OpenShift, as well as the two-level
scheduling decisions that provide fast placement and stability.
Daniel Lehner
 
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdfThe Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
The Evolution of Meme Coins A New Era for Digital Currency ppt.pdf
Abi john
 
Ad

NVIDIA GTC 2018: Enabling GPU-as-a-Service Providers with Red Hat OpenShift

  • 1. JEREMY EDER - RED HAT PERFORMANCE ENGINEERING
       Enabling GPU-as-a-Service Providers with Red Hat OpenShift
       @jeremyeder
       Senior Principal Software Engineer, Red Hat
       March, 2018
  • 2. Agenda
       ● OpenShift Cluster Overview
       ● Infrastructure Abstraction
       ● High Performance Features
       ● GPU Overview
  • 3. Community Powered Innovation
  • 4. What does an OpenShift Cluster look like?
       [Diagram: six RHEL nodes running containers (C) beneath the SERVICE LAYER and ROUTING LAYER, alongside PERSISTENT STORAGE and a REGISTRY; a RED HAT ENTERPRISE LINUX MASTER provides API/AUTHENTICATION, DATA STORE, SCHEDULER, and HEALTH/SCALING; underlying infrastructure: PHYSICAL, VIRTUAL, PRIVATE, PUBLIC, HYBRID]
  • 5. Abstract away any infrastructure
       [Diagram: SERVICE LAYER and ROUTING LAYER over PHYSICAL, VIRTUAL, PRIVATE, PUBLIC, HYBRID]
       ● Bare Metal
       ● RHV
       ● OpenStack
       ● VMware
       ● GCE
       ● Azure
       ● AWS
       ● BYO nodes...
  • 6. One Platform to...
       OpenShift is the single platform to run any application:
       ● Old or new
       ● Monolithic/Microservice
       [Diagram labels: Big Data, NFV, FSI, Animation, ISVs, HPC, Machine Learning]
  • 7. High Performance RFEs by Vertical
       Feature                                   FSI    NFV    ISV    BD/ML  ANIM   HPC
       NUMA (cpuset.cpus and cpuset.mems)        Yes    Yes    Yes    Maybe  Maybe  Yes
       Device Passthrough (NIC/Disk/GPU etc...)  Yes    Yes    Yes    Maybe  Maybe  Yes
       sysctl Support (non-namespaced too)       Yes    Yes    Yes    Yes    Yes    Yes
       Separation of control- and data-plane     Yes    Yes    Yes    Yes    Yes    Yes
       Node “fitness” (extended health info)     Yes    Yes    Maybe  Maybe  Maybe  Yes
       Multi-homed pods                          Yes    Yes    Maybe  Yes    Yes    Yes
       Kernel Modules (DKMS-ish)                 Yes    Yes    Maybe  Maybe  Yes    Maybe
       Hugepages                                 Yes    Yes    Yes    Yes    Maybe  Maybe
  • 8. Why do this?
       Enable containerization of Infrastructure Software
       ● Software-defined Storage and Networking
       ● Packet switching and routing tiers
       ● Multi-workloads (very different) within a single cluster
         ○ Layered schedulers (HPC/grid)
       ● Many more...
  • 9. Enable containerization of Red Hat’s products
       ● Gluster/Container Native Storage
       ● Ceph
       ● OpenStack
       ● rad analytics
       ● KubeVirt
  • 10. Upstream First: Kubernetes Working Groups
        ● Resource Management Working Group
          ○ Features Delivered
            ■ Device Plugins (GPU/Bypass/FPGA)
            ■ CPU Manager (exclusive cores)
            ■ Huge Pages Support
          ○ Extensive Roadmap
        ● Intel, IBM, Google, NVIDIA, Red Hat, many more...
  • 11. Upstream First: Kubernetes Working Groups
        ● Network Plumbing Working Group
          ○ Formalized Dec 2017
        ● Goal is to implement an out of tree, pseudo-standard collection of CRDs for multiple networks, owned by sig-network, *out of tree*
        ● Separate control- and data-plane, Overlapping IPs, Fast Data-plane
        ● IBM, Intel, Red Hat, Huawei, Cisco, Tigera...at least.
  • 12. GPU CLUSTER TOPOLOGY
  • 13. OpenShift Cluster Topology
        [Diagram: Control Plane (three “master and etcd” hosts); Infrastructure (three “registry and router” hosts and an LB); Compute Nodes and Storage Tier]
  • 14. OpenShift Cluster Topology
        Compute Nodes...
        ● How to enable software to take advantage of “special” hardware
        ● Create Node Pools
          ○ Mark them as “special”
          ○ Taints/Tolerations
          ○ ExtendedResourceToleration
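The node-pool idea on this slide can be sketched in a few lines of Python. This is a simplified model for illustration, not the scheduler's or the admission plugin's actual code: the taint key `nvidia.com/gpu`, the helper names, and the rule "any domain-prefixed resource name is an extended resource" are assumptions made here. It shows a tainted GPU pool rejecting ordinary pods, and a toleration being added automatically for pods that request the extended resource, which is the behaviour ExtendedResourceToleration is meant to provide.

```python
# Hypothetical taint an administrator might place on every node in a GPU pool.
GPU_TAINT = {"key": "nvidia.com/gpu", "value": "", "effect": "NoSchedule"}


def tolerates(toleration, taint):
    """Simplified form of the Kubernetes toleration-versus-taint match."""
    # A toleration with no effect matches taints of every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    if not toleration.get("key"):
        # An empty key is only meaningful with Exists, and then matches all taints.
        return operator == "Exists"
    if toleration["key"] != taint["key"]:
        return False
    if operator == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def can_schedule(pod, node_taints):
    """True if every NoSchedule taint is tolerated by some pod toleration."""
    tolerations = pod["spec"].get("tolerations", [])
    return all(
        any(tolerates(t, taint) for t in tolerations)
        for taint in node_taints
        if taint["effect"] == "NoSchedule"
    )


def add_extended_resource_tolerations(pod):
    """Mimic the ExtendedResourceToleration idea: tolerate taints keyed by each
    extended resource the pod requests (assumption: any name containing '/')."""
    tolerations = pod["spec"].setdefault("tolerations", [])
    for container in pod["spec"]["containers"]:
        limits = container.get("resources", {}).get("limits", {})
        for name in limits:
            if "/" in name:
                new = {"key": name, "operator": "Exists", "effect": "NoSchedule"}
                if new not in tolerations:
                    tolerations.append(new)
    return pod


def make_pod(gpus=0):
    """Build a minimal pod dict; with gpus=0 it only asks for CPU."""
    limits = {"nvidia.com/gpu": gpus} if gpus else {"cpu": "1"}
    return {"spec": {"containers": [{"name": "app", "resources": {"limits": limits}}]}}
```

In this model, `can_schedule(make_pod(), [GPU_TAINT])` is false, so general workloads stay off the “special” pool, while a GPU-requesting pod passed through `add_extended_resource_tolerations` is admitted.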
  • 15. OpenShift Cluster Topology
        Compute Nodes...
        ● How to enable software to take advantage of “special” hardware
        ● Tune/Configure the OS
          ○ Tuned Profiles
          ○ CPU Isolation
          ○ sysctls
  • 16. OpenShift Cluster Topology
        In OpenShift, there are three “types” of sysctls
        Safe
        ● Enabled by default
        ● kernel.shm_rmid_forced
        ● net.ipv4.ip_local_port_range
        ● net.ipv4.tcp_syncookies
        Unsafe
        ● Experimental Kubelet Flag
        ● kernel.sem*
        ● kernel.shm*
        ● kernel.msg*
        ● fs.mqueue.*
        ● net.*
        Node-level
        ● Can’t set from a pod
        ● Potentially affects other pods
        ● Many interesting sysctls
        ● Use TuneD
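As a rough illustration of the three categories, the following Python sketch classifies a sysctl name using only the names and patterns listed on this slide. The function name and the precedence (safe list first, then unsafe patterns, everything else node-level) are assumptions made for the example; the authoritative lists depend on the Kubernetes/OpenShift version in use.

```python
from fnmatch import fnmatchcase

# Names and patterns copied from the slide; real clusters may differ by version.
SAFE_SYSCTLS = {
    "kernel.shm_rmid_forced",
    "net.ipv4.ip_local_port_range",
    "net.ipv4.tcp_syncookies",
}
UNSAFE_PATTERNS = ["kernel.sem*", "kernel.shm*", "kernel.msg*", "fs.mqueue.*", "net.*"]


def classify_sysctl(name):
    """Return 'safe', 'unsafe', or 'node-level' for a sysctl name."""
    # The safe list is checked first because its entries also match unsafe patterns.
    if name in SAFE_SYSCTLS:
        return "safe"
    # Namespaced but not on the safe list: needs the experimental kubelet flag.
    if any(fnmatchcase(name, pattern) for pattern in UNSAFE_PATTERNS):
        return "unsafe"
    # Everything else cannot be set from a pod; tune it on the node (e.g. with TuneD).
    return "node-level"


if __name__ == "__main__":
    for example in ("net.ipv4.tcp_syncookies", "kernel.shmmax", "vm.swappiness"):
        print(example, "->", classify_sysctl(example))
```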
  • 17. OpenShift Cluster Topology
        Compute Nodes...
        ● How to enable software to take advantage of “special” hardware
        ● Optimize your workload
          ○ Dedicate CPU cores
          ○ Consume hugepages
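A minimal sketch of what “dedicate CPU cores” and “consume hugepages” can look like in a pod spec, expressed as Python dictionaries. It assumes the kubelet runs the CPU Manager with its static policy, under which a container in a Guaranteed QoS pod with a whole-number CPU request is eligible for exclusive cores, and that 2Mi hugepages are pre-allocated on the node. The image name, sizes, and helper names are hypothetical, and the Guaranteed check is simplified to a string comparison of requests and limits.

```python
def cpu_millis(quantity):
    """Convert a Kubernetes CPU quantity such as "4", "0.5" or "500m" to millicores."""
    text = str(quantity)
    if text.endswith("m"):
        return int(text[:-1])
    return round(float(text) * 1000)


def is_guaranteed(container):
    """Simplified Guaranteed-QoS check: cpu and memory limits set, requests equal."""
    resources = container.get("resources", {})
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    # Omitted requests default to the limits, so fall back to the limit value.
    return all(
        name in limits and requests.get(name, limits[name]) == limits[name]
        for name in ("cpu", "memory")
    )


def eligible_for_exclusive_cores(container):
    """Guaranteed QoS plus a whole-number CPU count of at least one core."""
    if not is_guaranteed(container):
        return False
    millis = cpu_millis(container["resources"]["limits"]["cpu"])
    return millis >= 1000 and millis % 1000 == 0


def make_pod(cpu="4", memory="8Gi", hugepages_2mi="1Gi"):
    """Build a pod that asks for dedicated cores and 2Mi hugepages."""
    amounts = {"cpu": cpu, "memory": memory, "hugepages-2Mi": hugepages_2mi}
    container = {
        "name": "workload",
        "image": "example.com/workload:latest",  # hypothetical image
        "resources": {"requests": dict(amounts), "limits": dict(amounts)},
        "volumeMounts": [{"name": "hugepage", "mountPath": "/dev/hugepages"}],
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "tuned-workload"},
        "spec": {
            "containers": [container],
            # Hugepages are exposed to the container through an emptyDir volume.
            "volumes": [{"name": "hugepage", "emptyDir": {"medium": "HugePages"}}],
        },
    }
```

A request of `"2500m"` would still run, but in this model it would share cores with other pods instead of receiving exclusive ones.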
  • 18. OpenShift Cluster Topology
        Compute Nodes...
        ● How to enable software to take advantage of “special” hardware
        ● Enable the Hardware
          ○ Install drivers
          ○ Deploy Device Plugin
  • 19. OpenShift Cluster Topology
        Compute Nodes...
        ● How to enable software to take advantage of “special” hardware
        ● Consume the Device
          ○ KubeFlow Template deployment
  • 20. Kubernetes Deployment for STAC-A2
        ● All-in-One Kubernetes Installation
        ● (hack/local-up-cluster.sh)
        ● Node labeled
        ● Containers:
          ○ RHEL7+CUDA9
          ○ RHEL7+CUDA9+DEVICE-PLUGIN
          ○ RHEL7+CUDA9+STAC-A2
        ● CUDA 9
        ● 8 x NVIDIA Tesla V100 (Volta) GPUs
        ● HPE Apollo 6500 w/XL270d Gen9
        ● Red Hat Enterprise Linux 7.4
        ● Kubernetes 1.8 (setup info)
        ● nvidia-smi --applications-clocks=877,1380
        ● https://rhelblog.redhat.com/2017/11/21/red-hat-and-partners-deliver-new-performance-records-on-prominent-risk-analytics-benchmark/
        ● https://news.developer.nvidia.com/a-new-stac-a2-record/
  • 21. Kubernetes Deployment for STAC-A2
        [Diagram: “kubectl create” submits the Benchmark (pod), which declares resources: limits: nvidia.com/gpu: 8; components shown are the Kube Scheduler, the Kubelet, the Device Plugin (daemonset), and eight Volta GPUs]
  • 22. Kubernetes Deployment for STAC-A2
        [Diagram, next step of the same figure: Benchmark (pod) with resources: limits: nvidia.com/gpu: 8, “kubectl create”, Kube Scheduler, Kubelet, Device Plugin (daemonset), and eight Volta GPUs]
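The pod in the diagram requests all eight GPUs through the `nvidia.com/gpu` extended resource that the device plugin advertises. A minimal sketch of such a manifest, built in Python and printed as JSON, is shown here; the pod name, container name, and image are hypothetical placeholders, not the ones used for the STAC-A2 run.

```python
import json


def build_benchmark_pod(gpus=8):
    """Return a pod manifest that asks the scheduler for `gpus` NVIDIA GPUs."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "benchmark"},  # hypothetical name
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "benchmark",
                    # Hypothetical image; the slides list RHEL7+CUDA9-based containers.
                    "image": "example.com/rhel7-cuda9-benchmark:latest",
                    # Extended resources are requested under limits, as on the slide.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML manifests.
    print(json.dumps(build_benchmark_pod(), indent=2))
```

Saved as a script (for example `gpu_pod.py`, a name chosen here), the output could be piped to `kubectl create -f -`. The scheduler would then place the pod only on a node whose kubelet reports at least eight allocatable `nvidia.com/gpu` devices.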
  • 23. Recent GPU-related work on OpenShift
        ● Early KubeFlow involvement
        ● radanalytics templates for ML-workflow on OpenShift
        ● Machine-Learning OpenShift Commons
        ● Demo Repositories
          ○ https://github.com/zvonkok/nvidia-k8s
          ○ https://github.com/redhat-performance/openshift-psap
  • 24. THANK YOU
        plus.google.com/+RedHat
        linkedin.com/company/red-hat
        youtube.com/user/RedHatVideos
        facebook.com/redhatinc
        twitter.com/RedHatNews
  • 25. Commoditizing GPU-as-a-Service Providers with Red Hat OpenShift
        Tuesday, Mar 27, 1:00 PM - 1:25 PM, Room 210E
        Red Hat OpenShift Container Platform, with Kubernetes at its core, can play an important role in building flexible hybrid cloud infrastructure. By abstracting infrastructure away from developers, workloads become portable across any cloud. With NVIDIA Volta GPUs now available in every public cloud [1], as well as from every computer maker, an abstraction layer like OpenShift becomes even more valuable. Through demonstrations, this session will introduce you to declarative models for consuming GPUs via OpenShift, as well as the two-level scheduling decisions that provide fast placement and stability.