 1  # slm_lab/spec/benchmark/a3c/a3c_nstep_pong.json
 2
 3  {
 4    "a3c_nstep_pong": {
 5      "agent": [{
6 "name": "A3C",
7 "algorithm": {
8 "name": "ActorCritic",
9 "action_pdtype": "default",
10 "action_policy": "default",
11 "explore_var_spec": null,
12 "gamma": 0.99,
13 "lam": null,
14 "num_step_returns": 5,
15 "entropy_coef_spec": {
16 "name": "no_decay",
17 "start_val": 0.01,
18 "end_val": 0.01,
19 "start_step": 0,
20 "end_step": 0
21 },
22 "val_loss_coef": 0.5,
23 "training_frequency": 5
24 },
25 "memory": {
26 "name": "OnPolicyBatchReplay",
27 },
28 "net": {
29 "type": "ConvNet",
30 "shared": true,
31 "conv_hid_layers": [
32 [32, 8, 4, 0, 1],
33 [64, 4, 2, 0, 1],
34 [32, 3, 1, 0, 1]
35 ],
36 "fc_hid_layers": [512],
37 "hid_layers_activation": "relu",
38 "init_fn": "orthogonal_",
39 "normalize": true,
40 "batch_norm": false,
41 "clip_grad_val": 0.5,
42 "use_same_optim": false,
43 "loss_spec": {
WOW! eBook
www.wowebook.org
44 "name": "MSELoss"
45 },
46 "actor_optim_spec": {
47 "name": "GlobalAdam",
48 "lr": 1e-4
49 },
50 "critic_optim_spec": {
51 "name": "GlobalAdam",
52 "lr": 1e-4
53 },
54 "lr_scheduler_spec": null,
55 "gpu": false
56 }
57 }],
58 "env": [{
59 "name": "PongNoFrameskip-v4",
60 "frame_op": "concat",
61 "frame_op_len": 4,
62 "reward_scale": "sign",
63 "num_envs": 8,
64 "max_t": null,
65 "max_frame": 1e7
66 }],
67 "body": {
68 "product": "outer",
69 "num": 1
70 },
71 "meta": {
72 "distributed": "synced",
73 "log_frequency": 10000,
74 "eval_frequency": 10000,
75 "max_session": 16,
76 "max_trial": 1,
77 }
78 }
79 }
In Code 8.2 we set the meta spec "distributed": "synced" (line:72) and specify the number of workers by setting max_session to 16 (line:75). The optimizer is changed to GlobalAdam (line:47), a variant that is more suitable for Hogwild! We also set the number of environments num_envs to 8 (line:63). Note that if the number of environments is greater than 1, the algorithm becomes a hybrid of synchronous (vector environments) and asynchronous (Hogwild!) methods, and there will be num_envs × max_session workers. Conceptually, this can be thought of as a hierarchy of Hogwild! workers, each of which spawns a number of synchronous workers.
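To make this hierarchy concrete, here is a minimal sketch, assuming PyTorch and torch.multiprocessing, of 16 Hogwild!-style worker processes that share one global network while each steps its own batch of 8 environments synchronously. The toy environment, network, and loss are placeholders for illustration; this is not SLM Lab's implementation.

    # Minimal sketch of the sync/async hybrid: MAX_SESSION Hogwild! processes,
    # each stepping NUM_ENVS toy environments in lockstep. Illustrative only;
    # the environment, network, and loss are placeholders, not SLM Lab code.
    import torch
    import torch.multiprocessing as mp
    import torch.nn as nn

    NUM_ENVS = 8      # synchronous envs per worker ("num_envs" in the spec)
    MAX_SESSION = 16  # asynchronous Hogwild! workers ("max_session" in the spec)
    # Total parallel environments: NUM_ENVS * MAX_SESSION = 128

    class ToyVectorEnv:
        """Steps a batch of toy environments together (the synchronous part)."""
        def __init__(self, num_envs, obs_dim=4):
            self.num_envs, self.obs_dim = num_envs, obs_dim

        def step(self):
            # All envs advance in lockstep and return a batch of observations
            return torch.randn(self.num_envs, self.obs_dim)

    def worker(global_net):
        """One Hogwild! worker: collects a batch from its vector env and
        updates the shared parameters lock-free (the asynchronous part)."""
        envs = ToyVectorEnv(NUM_ENVS)
        optim = torch.optim.Adam(global_net.parameters(), lr=1e-4)
        for _ in range(100):
            obs = envs.step()
            loss = global_net(obs).pow(2).mean()  # placeholder loss
            optim.zero_grad()
            loss.backward()
            optim.step()  # writes directly into the shared global parameters

    if __name__ == '__main__':
        global_net = nn.Linear(4, 2)
        global_net.share_memory()  # put parameters in shared memory for Hogwild!
        workers = [mp.Process(target=worker, args=(global_net,))
                   for _ in range(MAX_SESSION)]
        for p in workers:
            p.start()
        for p in workers:
            p.join()

In this sketch each worker keeps its own Adam state; an optimizer suited to Hogwild! such as GlobalAdam additionally places the optimizer statistics in shared memory so that all workers update them jointly.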
To train this A3C agent with n-step returns using SLM Lab, run
the commands shown in Code 8.3 in a terminal.
Figure 8.1: The trial graphs of A3C (n-step returns) with 16 workers. Since sessions take on the role of workers, the horizontal axis measures the number of frames experienced by an individual worker. Therefore, the total number of frames experienced collectively is equal to the sum of the individual frames, which adds up to 10 million frames in total.
8.4 SUMMARY
In this chapter, we discussed two widely applicable
parallelization methods — synchronous and asynchronous.
We showed that they can be implemented using vector environments and the Hogwild! algorithm, respectively.
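As a reminder of what the synchronous method boils down to, below is a minimal sketch of a vector environment, assuming the classic Gym step() API that returns (observation, reward, done, info); it is a simplified illustration rather than SLM Lab's or Gym's actual implementation.

    # Minimal synchronous vector environment (illustrative sketch, not SLM Lab code).
    import numpy as np

    class SyncVectorEnv:
        """Wraps a list of Gym-style environments and steps them in lockstep,
        so the agent receives batched observations, rewards, and done flags."""
        def __init__(self, env_fns):
            # env_fns: zero-argument callables, each creating one environment
            self.envs = [fn() for fn in env_fns]

        def reset(self):
            return np.stack([env.reset() for env in self.envs])

        def step(self, actions):
            obs_batch, reward_batch, done_batch = [], [], []
            for env, action in zip(self.envs, actions):
                obs, reward, done, _info = env.step(action)
                if done:
                    obs = env.reset()  # auto-reset so the batch stays in lockstep
                obs_batch.append(obs)
                reward_batch.append(reward)
                done_batch.append(done)
            return np.stack(obs_batch), np.array(reward_batch), np.array(done_batch)

A subprocess-based variant runs each environment in its own process to overlap the step() calls, but every batched step still waits for the slowest environment, which is the synchronization barrier discussed below.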
The two benefits of parallelization are faster training and more
diverse data. The second benefit plays a crucial role in
stabilizing and improving the training of policy gradient
algorithms. In fact, it often makes the difference between
success and failure.
When determining which of the parallelization methods to
apply, it helps to consider the following factors: ease of implementation, compute cost, and scale.
Synchronous methods (e.g. vector environments) are often
straightforward and easier to implement than asynchronous
methods, particularly if only data gathering is parallelized. Data generation is usually the cheaper part of the workload, so they require fewer resources for the same number of frames and scale well up to a moderate number of workers, e.g. fewer than 100. However, the
synchronization barrier becomes a bottleneck when applied at a
larger scale. In this case, asynchronous methods will likely be
significantly faster.
It is not always necessary to parallelize. As a general rule, try to
understand if a problem is simple enough to be solved without
parallelization before investing time and resources to
implement it. Additionally, the need to parallelize depends on
the algorithm used. Off-policy algorithms such as DQN can
often achieve very strong performance without parallelization
since the experience replay already provides diverse training
data. Even if training takes a very long time, agents can still