Ray: A Distributed Execution Framework for
Emerging AI Applications
Presenters: Philipp Moritz, Robert Nishihara
Spark Summit West
June 6, 2017
Ray: A Cluster Computing Engine for Reinforcement Learning Applications with Philipp Moritz and Robert Nishihara
Why build a new system?
Supervised Learning
(Diagram: a data point is fed to a model, which predicts the label "CAT".)
Emerging AI Applications
Supervised Learning → Reinforcement Learning
● One prediction → Sequences of actions
● Static environments → Dynamic environments
● Immediate feedback → Delayed rewards
RL Application Pattern
Process inputs from different sensors in parallel and in real time
Execute large numbers of simulations, e.g., up to 100s of millions
Rollout outcomes are used to update the policy (e.g., via SGD)
Policies are often implemented by DNNs
Most RL algorithms are developed in Python
(Diagram: many parallel rollouts feed repeated policy updates; the policy maps observations to actions.)
RL Application Requirements
Need to handle dynamic task graphs, where tasks have
• Heterogeneous durations
• Heterogeneous computations
Schedule millions of tasks/sec
Make it easy to parallelize ML algorithms written in Python
Ray API - remote functions
import numpy as np

def zeros(shape):
    return np.zeros(shape)

def dot(a, b):
    return np.dot(a, b)
Ray API - remote functions
import numpy as np
import ray

@ray.remote
def zeros(shape):
    return np.zeros(shape)

@ray.remote
def dot(a, b):
    return np.dot(a, b)

id1 = zeros.remote([5, 5])
id2 = zeros.remote([5, 5])
id3 = dot.remote(id1, id2)
ray.get(id3)

● id1, id2, and id3 are Object IDs.
(Task graph: zeros → id1, zeros → id2; dot(id1, id2) → id3.)
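Without a cluster, the zeros/zeros/dot task graph can be imitated loosely with Python's standard library. This is only an analogy: Ray's Object IDs live in a distributed object store and can be passed to `.remote()` calls unresolved, while these futures are process-local and must be resolved explicitly; `np.dot` is replaced by a plain-Python stand-in.

```python
from concurrent.futures import ThreadPoolExecutor

def zeros(shape):
    rows, cols = shape
    return [[0.0] * cols for _ in range(rows)]

def dot(a, b):
    # Plain-Python matrix multiply standing in for np.dot.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

with ThreadPoolExecutor() as pool:
    f1 = pool.submit(zeros, (5, 5))
    f2 = pool.submit(zeros, (5, 5))
    # Ray would accept the IDs f1, f2 directly; here we resolve them first.
    f3 = pool.submit(dot, f1.result(), f2.result())
    result = f3.result()  # a 5x5 matrix of zeros
```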
Ray API - actors
class Counter(object):
    def __init__(self):
        self.value = 0
    def inc(self):
        self.value += 1
        return self.value

c = Counter()
c.inc()  # This returns 1
c.inc()  # This returns 2
c.inc()  # This returns 3
Ray API - actors
import ray

@ray.remote(num_gpus=1)
class Counter(object):
    def __init__(self):
        self.value = 0
    def inc(self):
        self.value += 1
        return self.value

c = Counter.remote()
id1 = c.inc.remote()
id2 = c.inc.remote()
id3 = c.inc.remote()
ray.get([id1, id2, id3])  # This returns [1, 2, 3]

● State is shared between actor methods.
● Actor methods return Object IDs.
● Can specify GPU requirements.
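The actor idea can be sketched in plain Python as a stateful object whose methods run one at a time on a dedicated thread — a toy illustration of why actor method calls see shared state in order, not Ray's implementation. The `ToyActor` wrapper below is invented for this sketch; its futures play the role of Object IDs.

```python
import queue
import threading
from concurrent.futures import Future

class ToyActor:
    """Runs submitted method calls sequentially on one worker thread."""
    def __init__(self, state_factory):
        self._state = state_factory()
        self._inbox = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self):
        while True:
            method_name, args, fut = self._inbox.get()
            fut.set_result(getattr(self._state, method_name)(*args))

    def call(self, method_name, *args):
        fut = Future()  # plays the role of a Ray Object ID
        self._inbox.put((method_name, args, fut))
        return fut

class Counter:
    def __init__(self):
        self.value = 0
    def inc(self):
        self.value += 1
        return self.value

c = ToyActor(Counter)
ids = [c.call("inc") for _ in range(3)]
print([f.result() for f in ids])  # [1, 2, 3]
```

Because all calls go through one queue and one thread, the three `inc` calls mutate the shared counter in submission order, mirroring the `[1, 2, 3]` result on the slide.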
Ray architecture
● Each node runs worker processes; one node also runs the driver.
● Each node has a shared-memory Object Store and a Local Scheduler.
● Tasks that cannot be scheduled locally are forwarded to a replicated Global Scheduler.
● System control state lives in a replicated Global Control Store.
● Debugging tools, profiling tools, and a web UI are built on top of the Global Control Store.
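The local/global split can be caricatured in a few lines — a hypothetical `schedule` helper illustrating the bottom-up idea (place tasks locally, spill to a global scheduler when the node is saturated), not Ray's actual scheduling policy:

```python
def schedule(task, local_queues, node, capacity, global_queue):
    """Place task on its node's local queue, spilling to the global
    scheduler's queue when the local queue is full."""
    if len(local_queues[node]) < capacity:
        local_queues[node].append(task)
        return "local"
    global_queue.append(task)
    return "global"

local_queues = {0: [], 1: []}
global_queue = []
placements = [schedule(f"task{i}", local_queues, 0, 2, global_queue)
              for i in range(3)]
print(placements)  # ['local', 'local', 'global']
```

Keeping the common case local is what lets most tasks avoid a round trip through the global scheduler.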
Ray performance
● One million tasks per second
● Latency of local task execution: ~300 µs
● Latency of remote task execution: ~1 ms
Ray fault tolerance
Lost objects are recovered by reconstruction: the tasks that produced them are re-executed.
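The reconstruction idea can be illustrated with a toy lineage table: record how each object was produced, and replay that recipe if the object is lost. This is a conceptual sketch only; the `remote`/`get` helpers and the dictionaries here are invented for illustration, not Ray's API.

```python
lineage = {}  # object id -> (function, args) that produced it
objects = {}  # object id -> value (the "object store")

def remote(fn, *args, _next_id=[0]):
    oid = _next_id[0]
    _next_id[0] += 1
    lineage[oid] = (fn, args)  # remember how to recreate the object
    objects[oid] = fn(*args)
    return oid

def get(oid):
    if oid not in objects:        # object lost (e.g., a node failed)...
        fn, args = lineage[oid]   # ...so replay its lineage
        objects[oid] = fn(*args)
    return objects[oid]

oid = remote(lambda a, b: a + b, 2, 3)
del objects[oid]  # simulate losing the object
print(get(oid))   # prints 5
```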
Evolution Strategies
(Diagram: the Policy sends actions to a Simulator, which returns observations and rewards.)
Try lots of different policies and see which one works best!
Pseudocode
class Worker(object):
    def do_simulation(self, policy, seed):
        # perform simulation and return reward

workers = [Worker() for i in range(20)]
policy = initial_policy()
for i in range(200):
    seeds = generate_seeds(i)
    rewards = [workers[j].do_simulation(policy, seeds[j])
               for j in range(20)]
    policy = compute_update(policy, rewards, seeds)
Pseudocode
@ray.remote
class Worker(object):
    def do_simulation(self, policy, seed):
        # perform simulation and return reward

workers = [Worker.remote() for i in range(20)]
policy = initial_policy()
for i in range(200):
    seeds = generate_seeds(i)
    rewards = [workers[j].do_simulation.remote(policy, seeds[j])
               for j in range(20)]
    policy = compute_update(policy, ray.get(rewards), seeds)
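To make the pseudocode concrete without a Ray cluster, here is a serial toy version. Everything specific is made up for illustration: the quadratic "simulation" reward, the seed scheme, and the finite-difference step standing in for `compute_update`. Because each worker reuses the same seed for both evaluations, the noise cancels (a common-random-numbers trick) and the policy converges to the reward's peak.

```python
import random

class Worker(object):
    def do_simulation(self, policy, seed):
        # Dummy "simulation": noisy reward peaked at policy == 3.0.
        rng = random.Random(seed)
        return -(policy - 3.0) ** 2 + rng.gauss(0, 0.01)

workers = [Worker() for i in range(20)]
policy = 0.0  # stands in for initial_policy()
for i in range(200):
    seeds = [i * 20 + j for j in range(20)]  # stands in for generate_seeds(i)
    # Evaluate each perturbation pair with the same seed so noise cancels.
    rewards = [workers[j].do_simulation(policy + 0.1, seeds[j]) -
               workers[j].do_simulation(policy - 0.1, seeds[j])
               for j in range(20)]
    # Crude finite-difference gradient step in place of compute_update.
    policy += 0.05 * sum(rewards) / (2 * 0.1 * len(rewards))
print(round(policy, 1))  # converges to 3.0
```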
Evolution strategies on Ray

Simulator steps per second:

           10 nodes   20 nodes   30 nodes   40 nodes   50 nodes
Reference  97K        215K       202K       N/A        N/A
Ray        152K       285K       323K       476K       571K

The Ray implementation takes half as much code and was implemented in a couple of hours.
Policy Gradients
Ray + Apache Spark
● Complementary
○ Spark handles data processing, “classic” ML algorithms
○ Ray handles emerging AI algorithms, e.g., reinforcement learning (RL)
● Interoperability through object store based on Apache Arrow
○ Common data layout
○ Supports multiple languages
Ray is a system for AI Applications
● Ray is open source! https://github.com/ray-project/ray
● We have a v0.1 release!
pip install ray
● We’d love your feedback
Philipp, Ion, Alexey, Stephanie, Johann, Richard, William, Mehrdad, Mike, Robert