Praise for Foundations of Deep Reinforcement Learning
“This book provides an accessible introduction to deep reinforcement learning covering
the mathematical concepts behind popular algorithms as well as their practical
implementation. I think the book will be a valuable resource for anyone looking to apply
deep reinforcement learning in practice.”
—Volodymyr Mnih, lead developer of DQN
“An excellent book to quickly develop expertise in the theory, language, and practical
implementation of deep reinforcement learning algorithms. A limpid exposition which
uses familiar notation; all the most recent techniques explained with concise, readable
code, and not a page wasted in irrelevant detours: it is the perfect way to develop a solid
foundation on the topic.”
—Vincent Vanhoucke, principal scientist, Google
“As someone who spends their days trying to make deep reinforcement learning methods
more useful for the general public, I can say that Laura and Keng’s book is a welcome
addition to the literature. It provides both a readable introduction to the fundamental
concepts in reinforcement learning as well as intuitive explanations and code for many of
the major algorithms in the field. I imagine this will become an invaluable resource for
individuals interested in learning about deep reinforcement learning for years to come.”
—Arthur Juliani, senior machine learning engineer, Unity Technologies
“Until now, the only way to get to grips with deep reinforcement learning was to slowly
accumulate knowledge from dozens of different sources. Finally, we have a book bringing
everything together in one place.”
—Matthew Rahtz, ML researcher, ETH Zürich
Foundations of Deep Reinforcement Learning
The Pearson Addison-Wesley
Data & Analytics Series
The series aims to tie all three of these areas together to help the reader build
end-to-end systems for fighting spam; making recommendations; building
personalization; detecting trends, patterns, or problems; and gaining insight
from the data exhaust of systems and user interactions.
Laura Graesser
Wah Loon Keng
The authors and publisher have taken care in the preparation of this book, but make no expressed or
implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed
for incidental or consequential damages in connection with or arising out of the use of the information or
programs contained herein.
For information about buying this title in bulk quantities, or for special sales opportunities (which may
include electronic versions; custom cover designs; and content particular to your business, training goals,
marketing focus, or branding interests), please contact our corporate sales department
at [email protected] or (800) 382-3419.
For questions about sales outside the U.S., please contact [email protected].
All rights reserved. This publication is protected by copyright, and permission must be obtained from the
publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or
by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding
permissions, request forms and the appropriate contacts within the Pearson Education Global Rights &
Permissions Department, please visit www.pearson.com/permissions.
ISBN-13: 978-0-13-517238-4
ISBN-10: 0-13-517238-1
For those people who make me feel that anything is possible
—Laura
Contents

Foreword
Preface
Acknowledgments

2 REINFORCE
2.1 Policy
2.2 The Objective Function
2.3 The Policy Gradient
2.3.1 Policy Gradient Derivation
2.4 Monte Carlo Sampling
2.5 REINFORCE Algorithm
2.5.1 Improving REINFORCE
2.6 Implementing REINFORCE
2.6.1 A Minimal REINFORCE Implementation
2.6.2 Constructing Policies with PyTorch
2.6.3 Sampling Actions
2.6.4 Calculating Policy Loss
2.6.5 REINFORCE Training Loop
2.6.6 On-Policy Replay Memory
2.7 Training a REINFORCE Agent
2.8 Experimental Results
2.8.1 Experiment: The Effect of Discount Factor γ
2.8.2 Experiment: The Effect of Baseline
2.9 Summary
2.10 Further Reading
2.11 History

3 SARSA
3.1 The Q- and V-Functions
3.2 Temporal Difference Learning
3.2.1 Intuition for Temporal Difference Learning
13 Hardware
13.1 Computer
13.2 Data Types
13.3 Optimizing Data Types in RL
13.4 Choosing Hardware
13.5 Summary

14 States
14.1 Examples of States
14.2 State Completeness
14.3 State Complexity
14.4 State Information Loss
14.4.1 Image Grayscaling
14.4.2 Discretization
14.4.3 Hash Conflict
14.4.4 Metainformation Loss
14.5 Preprocessing
14.5.1 Standardization
14.5.2 Image Preprocessing
14.5.3 Temporal Preprocessing
14.6 Summary

15 Actions
15.1 Examples of Actions
15.2 Action Completeness
15.3 Action Complexity
15.4 Summary
15.5 Further Reading: Action Design in Everyday Things

16 Rewards
16.1 The Role of Rewards
16.2 Reward Design Guidelines
16.3 Summary
Epilogue
References
Index
Foreword
In April of 2019, OpenAI’s Five bots played in a Dota 2 competition match against OG, the
2018 human world champions. Dota 2 is a complex, multiplayer battle arena game where
players can choose different characters. Winning a game requires strategy, teamwork, and
quick decisions. Building an artificial intelligence to compete in this game, with so
many variables and a seemingly infinite search space for optimization, seems like an
insurmountable challenge. Yet OpenAI’s bots won handily and, soon after, went on to win
over 99% of their matches against public players. The innovation underlying this
achievement was deep reinforcement learning.
Although this development is recent, reinforcement learning and deep learning have
both been around for decades. However, a significant amount of new research combined
with the increasing power of GPUs has pushed the state of the art forward. This book
gives the reader an introduction to deep reinforcement learning and distills the work done
over the last six years into a cohesive whole.
While training a computer to beat a video game may not be the most practical thing to
do, it’s only a starting point. Reinforcement learning is an area of machine learning that is
useful for solving sequential decision-making problems—that is, problems that are solved
over time. This applies to almost any endeavor—be it playing a video game, walking down
the street, or driving a car.
Laura Graesser and Wah Loon Keng have put together an approachable introduction to
a complicated topic that is at the forefront of what is new in machine learning. Not only
have they brought to bear their research into many papers on the topic; they have also created an
open source library, SLM Lab, to help others get up and running quickly with deep
reinforcement learning. SLM Lab is written in Python on top of PyTorch, but readers only
need familiarity with Python. Readers intending to use TensorFlow or some other library
as their deep learning framework of choice will still get value from this book as it
introduces the concepts and problem formulations for deep reinforcement learning
solutions.
This book brings together the most recent research in deep reinforcement learning
along with examples and code that the readers can work with. Their library also works
with OpenAI’s Gym, Roboschool, and the Unity ML-Agents toolkit, which makes this
book a perfect jumping-off point for readers looking to work with those systems.
Preface

We first discovered deep reinforcement learning (deep RL) when DeepMind achieved
breakthrough performance in the Atari arcade games. Using only images and no prior
knowledge, artificial agents reached human-level performance for the first time.
The idea of an artificial agent learning by itself, through trial and error, without
supervision, sparked something in our imaginations. It was a new and exciting approach to
machine learning, and it was quite different from the more familiar field of supervised
learning.
We decided to work together to learn about this topic. We read books and papers,
followed online courses, studied code, and tried to implement the core algorithms. We
realized that not only is deep RL conceptually challenging, but that implementation
requires as much effort as a large software engineering project.
As we progressed, we learned more about the landscape of deep RL—how algorithms
relate to each other and what their different characteristics are. Forming a mental model of
this was hard because deep RL is a new area of research and the theoretical knowledge had
not yet been distilled into a book. We had to learn directly from research papers and online
lectures.
Another challenge was the large gap between theory and implementation. Often, a deep
RL algorithm has many components and tunable hyperparameters that make it sensitive
and fragile. For it to succeed, all the components need to work together correctly and with
appropriate hyperparameter values. The implementation details required to get this right
are not immediately clear from the theory, but are just as important. A resource that
integrated theory and implementation would have been invaluable when we were learning.
We felt that the journey from theory to implementation could have been simpler than
we found it, and we wanted to contribute to making deep RL easier to learn. This book is
our attempt to do that. It takes an end-to-end approach to introducing deep RL—starting
with intuition, then explaining the theory and algorithms, and finishing with
implementations and practical tips. This is also why the book comes with a companion
software library, SLM Lab, which contains implementations of all the algorithms discussed
in it. In short, this is the book we wished existed when we were starting to learn about this
topic.
Deep RL belongs to the larger field of reinforcement learning. At the core of
reinforcement learning is function approximation; in deep RL, functions are learned using
deep neural networks. Reinforcement learning, supervised learning, and unsupervised
learning make up the three core machine learning techniques, and each technique differs
in how problems are formulated and how algorithms learn from data.
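
To make the phrase “functions are learned using deep neural networks” concrete, the minimal
sketch below (not taken from the book or from SLM Lab) shows a small PyTorch network
standing in for a Q-function, mapping a state to one estimated value per action; the
CartPole-sized dimensions and layer sizes are illustrative assumptions.

import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """A tiny function approximator: a state goes in, one Q-value per action comes out."""
    def __init__(self, state_dim=4, num_actions=2, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_actions),
        )

    def forward(self, state):
        return self.net(state)

q_net = QNetwork()
state = torch.rand(1, 4)          # a dummy CartPole-like state
q_values = q_net(state)           # shape (1, 2): one estimate per action
action = q_values.argmax(dim=-1)  # act greedily with respect to the current estimates
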
In this book we focus exclusively on deep RL because the challenges we experienced
are specific to this subfield of reinforcement learning. This bounds the scope of the book
in two ways. First, it excludes all other techniques that can be used to learn functions
in reinforcement learning. Second, it emphasizes developments between 2013 and
2019 even though reinforcement learning has existed since the 1950s. Many of the
recent developments build from older research, so we felt it was important to trace the
development of the main ideas. However, we do not intend to give a comprehensive
history of the field.
This book is aimed at undergraduate computer science students and software engineers.
It is intended to be an introduction to deep RL and no prior knowledge of the subject is
required. However, we do assume that readers have a basic familiarity with machine
learning and deep learning as well as an intermediate level of Python programming. Some
experience with PyTorch is also useful but not necessary.
The book is organized as follows. Chapter 1 introduces the different aspects of a deep
reinforcement learning problem and gives an overview of deep reinforcement learning
algorithms.
Part I is concerned with policy-based and value-based algorithms. Chapter 2 introduces
the first Policy Gradient method known as REINFORCE. Chapter 3 introduces the first
value-based method known as SARSA. Chapter 4 discusses the Deep Q-Networks
(DQN) algorithm and Chapter 5 focuses on techniques for improving it—target
networks, the Double DQN algorithm, and Prioritized Experience Replay.
Part II focuses on algorithms which combine policy-based and value-based methods.
Chapter 6 introduces the Actor-Critic algorithm which extends REINFORCE.
Chapter 7 introduces Proximal Policy Optimization (PPO) which can extend
Actor-Critic. Chapter 8 discusses synchronous and asynchronous parallelization techniques
that are applicable to any of the algorithms in this book. Finally, all the algorithms are
summarized in Chapter 9.
Each algorithm chapter is structured in the same way. First, we introduce the main
concepts and work through the relevant mathematical formulations. Then we describe
the algorithm and discuss an implementation in Python. Finally, we provide a configured
algorithm with tuned hyperparameters which can be run in SLM Lab, and illustrate the
main characteristics of the algorithm with graphs.
Part III focuses on the practical details of implementing deep RL algorithms.
Chapter 10 covers engineering and debugging practices and includes an almanac of
hyperparameters and results. Chapter 11 provides a usage reference for the companion
library, SLM Lab. Chapter 12 looks at neural network design and Chapter 13 discusses
hardware.
The final part of the book, Part IV, is about environment design. It consists of Chapters 14,
15, 16, and 17, which treat the design of states, actions, rewards, and transition functions,
respectively.
The book is intended to be read linearly from Chapter 1 to Chapter 10. These chapters
introduce all of the algorithms in the book and provide practical tips for getting them to
work. The next three chapters, 11 to 13, focus on more specialized topics and can be read
in any order. For readers who do not wish to go into as much depth, Chapters 1, 2, 3, 4, 6,
and 10 are a coherent subset of the book that focuses on a few of the algorithms. Finally,
Part IV contains a standalone set of chapters intended for readers with a particular interest
in understanding environments in more depth or building their own.
SLM Lab [67], this book’s companion software library, is a modular deep RL
framework built using PyTorch [114]. SLM stands for Strange Loop Machine, in homage
to Hofstadter’s iconic book Gödel, Escher, Bach: An Eternal Golden Braid [53]. The specific
examples from SLM Lab that we include use PyTorch’s syntax and features for training
neural networks. However, the underlying principles for implementing deep RL
algorithms are applicable to other deep learning frameworks such as TensorFlow [1].
The design of SLM Lab is intended to help new students learn deep RL by organizing
its components into conceptually clear pieces. These components also align with how deep
RL is discussed in the academic literature to make it easier to translate from theory to code.
Another important aspect of learning deep RL is experimentation. To facilitate this,
SLM Lab also provides an experimentation framework to help new students design and
test their own hypotheses.
The SLM Lab library is released as an open source project on GitHub. We encourage
readers to install it (on a Linux or macOS machine) and run the first demo by following
the instructions on the repository website, https://github.com/kengz/SLM-Lab. A
dedicated git branch, “book”, has been created with a version of the code compatible with
this book. A short set of installation instructions copied from the repository website is
shown in Code 0.1.
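
As a rough guide, the steps look something like the sketch below, based on the repository
README at the time of writing; the branch name comes from the paragraph above, but the
setup script and demo spec name are assumptions that may have changed, so treat the
repository website as the authoritative reference.

git clone https://github.com/kengz/SLM-Lab.git
cd SLM-Lab
git checkout book     # the branch with code compatible with this book
./bin/setup           # install dependencies and create the Conda environment

# run the first demo: a DQN agent on CartPole
conda activate lab
python run_lab.py slm_lab/spec/demo.json dqn_cartpole dev
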
We recommend you set this up first so you can train agents with algorithms as they are
introduced in this book. Beyond installation and running the demo, it is not necessary to
be familiar with SLM Lab before reading the algorithm chapters (Parts I and II)—we give
all the commands to train agents where needed. We also discuss SLM Lab more extensively
in Chapter 11 after shifting focus from algorithms to more practical aspects of deep
reinforcement learning.
Register your copy of Foundations of Deep Reinforcement Learning on the InformIT site for
convenient access to updates and/or corrections as they become available. To start the
registration process, go to informit.com/register and log in or create an account. Enter the
product ISBN (9780135172384) and click Submit. Look on the Registered Products tab
for an Access Bonus Content link next to this product, and follow that link to access any
available bonus materials. If you would like to be notified of exclusive offers on new editions
and updates, please check the box to receive email from us.
Acknowledgments
There are many people who have helped us finish this project. We thank Milan Cvitkovic,
Alex Leeds, Navdeep Jaitly, Jon Krohn, Katya Vasilaky, and Katelyn Gleason for supporting
and encouraging us. We are grateful to OpenAI, PyTorch, Ilya Kostrikov, and Jaromir
Janisch for providing high-quality open source implementations of different components
of deep RL algorithms. We also thank Arthur Juliani for early discussions on environment
design. These resources and discussions were invaluable as we were building SLM Lab.
A number of people provided thoughtful and insightful feedback on earlier drafts of this
book. We would like to thank Alexandre Sablayrolles, Anant Gupta, Brandon Strickland,
Chong Li, Jon Krohn, Jordi Frank, Karthik Jayasurya, Matthew Rahtz, Pidong Wang,
Raymond Chua, Regina R. Monaco, Rico Jonschkowski, Sophie Tabac, and Utku Evci
for the time and effort you put into this. The book is better as a result.
We are very grateful to the Pearson production team—Alina Kirsanova, Chris Zahn,
Dmitry Kirsanov, and Julie Nahil. Thanks to your thoughtfulness, care, and attention to
detail, the text has been greatly improved.
Finally, this book would not exist without our editor Debra Williams Cauley. Thank
you for your patience and encouragement, and for helping us to see that writing a book
was possible.