The Art of Reinforcement Learning: Fundamentals, Mathematics, and Implementations with Python
1st Edition
Michael Hu
© Michael Hu 2023
Apress Standard
The publisher, the authors, and the editors are safe to assume that the
advice and information in this book are believed to be true and accurate
at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the
material contained herein or for any errors or omissions that may have
been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
Source Code
You can download the source code used in this book from github.com/apress/art-of-reinforcement-learning.
Michael Hu
Any source code or other supplementary material referenced by the
author in this book is available to readers on GitHub (https://github.com/Apress). For more detailed information, please visit https://www.apress.com/gp/services/source-code.
Contents
Part I Foundation
1 Introduction
1.1 AI Breakthrough in Games
1.2 What Is Reinforcement Learning
1.3 Agent-Environment in Reinforcement Learning
1.4 Examples of Reinforcement Learning
1.5 Common Terms in Reinforcement Learning
1.6 Why Study Reinforcement Learning
1.7 The Challenges in Reinforcement Learning
1.8 Summary
References
2 Markov Decision Processes
2.1 Overview of MDP
2.2 Model Reinforcement Learning Problem Using MDP
2.3 Markov Process or Markov Chain
2.4 Markov Reward Process
2.5 Markov Decision Process
2.6 Alternative Bellman Equations for Value Functions
2.7 Optimal Policy and Optimal Value Functions
2.8 Summary
References
3 Dynamic Programming
3.1 Use DP to Solve MRP Problem
3.2 Policy Evaluation
3.3 Policy Improvement
3.4 Policy Iteration
3.5 General Policy Iteration
3.6 Value Iteration
3.7 Summary
References
4 Monte Carlo Methods
4.1 Monte Carlo Policy Evaluation
4.2 Incremental Update
4.3 Exploration vs. Exploitation
4.4 Monte Carlo Control (Policy Improvement)
4.5 Summary
References
5 Temporal Difference Learning
5.1 Temporal Difference Learning
5.2 Temporal Difference Policy Evaluation
5.3 Simplified 𝜖-Greedy Policy for Exploration
5.4 TD Control—SARSA
5.5 On-Policy vs. Off-Policy
5.6 Q-Learning
5.7 Double Q-Learning
5.8 N-Step Bootstrapping
5.9 Summary
References
Part II Value Function Approximation
6 Linear Value Function Approximation
6.1 The Challenge of Large-Scale MDPs
6.2 Value Function Approximation
6.3 Stochastic Gradient Descent
6.4 Linear Value Function Approximation
6.5 Summary
References
7 Nonlinear Value Function Approximation
7.1 Neural Networks
7.2 Training Neural Networks
7.3 Policy Evaluation with Neural Networks
7.4 Naive Deep Q-Learning
7.5 Deep Q-Learning with Experience Replay and Target
Network
7.6 DQN for Atari Games
7.7 Summary
References
8 Improvements to DQN
8.1 DQN with Double Q-Learning
8.2 Prioritized Experience Replay
8.3 Advantage Function and Dueling Network Architecture
8.4 Summary
References
Part III Policy Approximation
9 Policy Gradient Methods
9.1 Policy-Based Methods
9.2 Policy Gradient
9.3 REINFORCE
9.4 REINFORCE with Baseline
9.5 Actor-Critic
9.6 Using Entropy to Encourage Exploration
9.7 Summary
References
10 Problems with Continuous Action Space
10.1 The Challenges of Problems with Continuous Action Space
10.2 MuJoCo Environments
10.3 Policy Gradient for Problems with Continuous Action
Space
10.4 Summary
References
11 Advanced Policy Gradient Methods
11.1 Problems with the Standard Policy Gradient Methods
11.2 Policy Performance Bounds
11.3 Proximal Policy Optimization
11.4 Summary
References
Part IV Advanced Topics
12 Distributed Reinforcement Learning
12.1 Why Use Distributed Reinforcement Learning
12.2 General Distributed Reinforcement Learning Architecture
12.3 Data Parallelism for Distributed Reinforcement Learning
12.4 Summary
References
13 Curiosity-Driven Exploration
13.1 Hard-to-Explore Problems vs. Sparse Reward Problems
13.2 Curiosity-Driven Exploration
13.3 Random Network Distillation
13.4 Summary
References
14 Planning with a Model: AlphaZero
14.1 Why We Need to Plan in Reinforcement Learning
14.2 Monte Carlo Tree Search
14.3 AlphaZero
14.4 Training AlphaZero on a 9 × 9 Go Board
14.5 Training AlphaZero on a 13 × 13 Gomoku Board
14.6 Summary
References
Index
About the Author
Michael Hu
is an exceptional software engineer with a wealth of
expertise spanning over a decade, specializing in the
design and implementation of enterprise-level
applications. His current focus revolves around leveraging
the power of machine learning (ML) and artificial
intelligence (AI) to revolutionize operational systems
within enterprises. A true coding enthusiast, Michael finds
solace in the realms of mathematics and continuously
explores cutting-edge technologies, particularly machine learning and
deep learning. His unwavering passion lies in the realm of deep
reinforcement learning, where he constantly seeks to push the
boundaries of knowledge. Demonstrating his commitment to the field,
he has built numerous open source projects on GitHub that
closely emulate state-of-the-art reinforcement learning algorithms
pioneered by DeepMind, including notable examples like AlphaZero,
MuZero, and Agent57. Through these projects, Michael demonstrates
his commitment to advancing the field and sharing his knowledge with
fellow enthusiasts. He currently resides in the city of Shanghai, China.
About the Technical Reviewer
Shovon Sengupta
has over 14 years of expertise and a deep
understanding of advanced predictive analytics, machine
learning, deep learning, and reinforcement learning. He
has established a place for himself by creating innovative
financial solutions that have won numerous awards. He is
currently working for one of the leading multinational
financial services corporations in the United States as the
Principal Data Scientist at the AI Center of Excellence. His job entails
leading innovative initiatives that rely on artificial intelligence to
address challenging business problems. He has a US patent (United
States Patent: Sengupta et al.: Automated Predictive Call Routing Using
Reinforcement Learning [US 10,356,244 B1]) to his credit. He is also a
Ph.D. scholar at BITS Pilani. He has reviewed quite a few popular titles
from leading publishers like Packt and Apress and has also authored a
few courses for Packt and CodeRed (EC-Council) in the realm of
machine learning. Apart from that, he has presented at various
international conferences on machine learning, time series forecasting,
and building trustworthy AI. His primary research is concentrated on
deep reinforcement learning, deep learning, natural language
processing (NLP), knowledge graph, causality analysis, and time series
analysis. For more details about Shovon’s work, please check out his
LinkedIn page: www.linkedin.com/in/shovon-sengupta-272aa917.
Part I
Foundation
© The Author(s), under exclusive license to APress Media, LLC, part of Springer
Nature 2023
M. Hu, The Art of Reinforcement Learning
https://doi.org/10.1007/978-1-4842-9606-6_1
1. Introduction
Michael Hu1
(1) Shanghai, Shanghai, China
Fig. 1.1 A DQN agent learning to play Atari’s Breakout. The goal of the game is to
use a paddle to bounce a ball up and break through a wall of bricks. The agent only
takes in the raw pixels from the screen, and it has to figure out what’s the right action
to take in order to maximize the score. Idea adapted from Mnih et al. [1]. Game
owned by Atari Interactive, Inc.
Go
Go is an ancient Chinese strategy board game played by two players,
who take turns placing stones on a 19 × 19 board with the goal
of surrounding more territory than the opponent. Each player has a set
of black or white stones, and the game begins with an empty board.
Players alternate placing stones on the board, with the black player
going first.
Fig. 1.2 Yoda Norimoto (black) vs. Kiyonari Tetsuya (white), Go game from the 66th
NHK Cup, 2018. White won by 0.5 points. Game record from CWI [4]
The stones are placed on the intersections of the lines on the board,
rather than in the squares. Once a stone is placed on the board, it
cannot be moved, but it can be captured by the opponent if it is
completely surrounded by their stones. Stones that are surrounded and
captured are removed from the board.
The game continues until both players pass, at which point the
territory on the board is counted. A player’s territory is the set of empty
intersections that are completely surrounded by their stones, plus any
captured stones. The player with the larger territory wins the game. In
the case of the final board position shown in Fig. 1.2, White won by
0.5 points.
Although the rules of the game are relatively simple, the game is
extremely complex. For instance, the number of legal board positions in
Go is enormously large compared to Chess. According to research by
Tromp and Farnebäck [3], the number of legal board positions in Go is approximately 2.1 × 10^170, which is vastly greater than the number of atoms in the universe.
This complexity presents a significant challenge for artificial
intelligence (AI) agents that attempt to play Go. In March 2016, an AI
agent called AlphaGo developed by Silver et al. [5] from DeepMind
made history by beating the legendary Korean player Lee Sedol with a
score of 4-1 in Go. Lee Sedol is a winner of 18 world titles and is
considered one of the greatest Go players of the past decade. AlphaGo’s
victory was remarkable because it used a combination of deep neural
networks and tree search algorithms, as well as the technique of
reinforcement learning.
AlphaGo was trained using a combination of supervised learning
from human expert games and reinforcement learning from games of
self-play. This training enabled the agent to develop creative and
innovative moves that surprised both Lee Sedol and the Go community.
The success of AlphaGo has sparked renewed interest in the field of
reinforcement learning and has demonstrated the potential for AI to
solve complex problems that were once thought to be the exclusive
domain of human intelligence. One year later, Silver et al. [6] from
DeepMind introduced a new and more powerful agent, AlphaGo Zero.
AlphaGo Zero was trained using pure self-play, without any human
expert moves in its training, achieving a higher level of play than the
previous AlphaGo agent. They also made other improvements like
simplifying the training process.
To evaluate the performance of the new agent, they set it to play
games against the exact same AlphaGo agent that beat the world
champion Lee Sedol in 2016, and this time the new AlphaGo Zero beat AlphaGo with a score of 100-0.
In the following year, Schrittwieser et al. [7] from DeepMind
generalized the AlphaGo Zero agent to play not only Go but also other
board games like Chess and Shogi (Japanese chess), and they called this
generalized agent AlphaZero. AlphaZero is a more general
reinforcement learning algorithm that can be applied to a variety of
board games, not just Go, Chess, and Shogi.
Reinforcement learning is a type of machine learning in which an
agent learns to make decisions based on the feedback it receives from
its environment. Both DQN and AlphaGo (and its successor) agents use
this technique, and their achievements are very impressive. Although
these agents are designed to play games, this does not mean that
reinforcement learning is only capable of playing games. In fact, there
are many more challenging problems in the real world, such as
navigating a robot, driving an autonomous car, and automating web
advertising. Games are relatively easy to simulate and implement
compared to these other real-world problems, but reinforcement
learning has the potential to be applied to a wide range of complex
challenges beyond game playing.
Environment
The environment is the world in which the agent operates. It can be a
physical system, such as a robot navigating a maze, or a virtual
environment, such as a game or a simulation. The environment
provides the agent with two pieces of information: the state of the
environment and a reward signal. The state describes the relevant
information about the environment that the agent needs to make a
decision, such as the position of the robot or the cards in a poker game.
The reward signal is a scalar value that indicates how well the agent is
doing in its task. The agent’s objective is to maximize its cumulative
reward over time.
The environment has its own set of rules, which determine how the
state and reward signal change based on the agent’s actions. These
rules are often called the dynamics of the environment. In many cases,
the agent does not have access to the underlying dynamics of the
environment and must learn them through trial and error. This is
similar to how we humans interact with the physical world every day: we normally have a pretty good sense of what’s going on around us, but it’s difficult to fully understand the dynamics of the universe.
Game environments are a popular choice for reinforcement learning
because they provide a clear objective and well-defined rules. For
example, a reinforcement learning agent could learn to play the game of
Pong by observing the screen and receiving a reward signal based on
whether it wins or loses the game.
In a robotic environment, the agent is a robot that must learn to
navigate a physical space or perform a task. For example, a
reinforcement learning agent could learn to navigate a maze by using
sensors to detect its surroundings and receiving a reward signal based
on how quickly it reaches the end of the maze.
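To make the state-and-reward loop concrete, the sketch below shows a minimal, hypothetical environment interface in Python. The class name, the method signatures, and the tiny grid-world dynamics are illustrative assumptions made here, not code from this book.

```python
class GridWorldEnv:
    """A minimal, hypothetical environment: walk right from cell 0 to cell 4."""

    def __init__(self, num_cells=5):
        self.num_cells = num_cells
        self.position = 0

    def reset(self):
        """Start a new episode and return the initial state."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (0 = left, 1 = right) and return (state, reward, done).

        The dynamics and reward are made up for illustration: the agent
        receives +1 only when it reaches the goal cell, and 0 otherwise.
        """
        if action == 1:
            self.position = min(self.position + 1, self.num_cells - 1)
        else:
            self.position = max(self.position - 1, 0)
        done = self.position == self.num_cells - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done
```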
State
In reinforcement learning, an environment state, or simply state, is the data provided by the environment that represents its current situation. The state can be discrete or continuous. For
instance, when driving a stick shift car, the speed of the car is a
continuous variable, while the current gear is a discrete variable.
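As a small illustration of mixed state variables, the snippet below expresses the stick-shift example as a simple Python structure; the class and field names are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class CarState:
    speed_kmh: float  # continuous state variable, e.g., 42.7
    gear: int         # discrete state variable, e.g., 3


state = CarState(speed_kmh=42.7, gear=3)
```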
Ideally, the environment state should contain all relevant
information that’s necessary for the agent to make decisions. For
example, in a single-player video game like Breakout, the pixels of
frames of the game contain all the information necessary for the agent
to make a decision. Similarly, in an autonomous driving scenario, the
sensor data from the car’s cameras, lidar, and other sensors provide
relevant information about the surrounding environment.
However, in practice, the available information may depend on the
task and domain. In a two-player board game like Go, for instance,
although we have perfect information about the board position, we
don’t have perfect knowledge about the opponent player, such as what
they are thinking in their head or what their next move will be. This
makes the state representation more challenging in such scenarios.
Furthermore, the environment state might also include noisy data.
For example, a reinforcement learning agent driving an autonomous car
might use multiple cameras at different angles to capture images of the
surrounding area. Suppose the car is driving near a park on a windy
day. In that case, the onboard cameras could also capture images of
some trees in the park that are swaying in the wind. Since the
movement of these trees should not affect the agent’s ability to drive,
because the trees are inside the park and not on the road or near the
road, we can consider these movements of the trees as noise to the self-
driving agent. However, it can be challenging to ignore them from the
captured images. To tackle this problem, researchers might use various
techniques such as filtering and smoothing to eliminate the noisy data
and obtain a cleaner representation of the environment state.
Reward
In reinforcement learning, the reward signal is a numerical value that
the environment provides to the agent after the agent takes some
action. The reward can be any numerical value, positive, negative, or
zero. However, in practice, the reward function often varies from task to
task, and we need to carefully design a reward function that is specific
to our reinforcement learning problem.
Designing an appropriate reward function is crucial for the success
of the agent. The reward function should be designed to encourage the
agent to take actions that will ultimately lead to achieving our desired
goal. For example, in the game of Go, the reward is 0 at every step
before the game is over, and +1 or -1 if the agent wins or loses the
game, respectively. This design incentivizes the agent to win the game,
without explicitly telling it how to win.
Similarly, in the game of Breakout, the reward can be a positive
number if the agent destroys some bricks, a negative number if the agent fails to catch the ball, and zero otherwise. This design
incentivizes the agent to destroy as many bricks as possible while
avoiding losing the ball, without explicitly telling it how to achieve a
high score.
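The sketch below expresses these two reward designs as plain Python functions. The function names and arguments are hypothetical; they only illustrate the sign conventions described above.

```python
def go_reward(game_over: bool, agent_won: bool) -> float:
    """Go-style reward: 0 until the game ends, then +1 for a win, -1 for a loss."""
    if not game_over:
        return 0.0
    return 1.0 if agent_won else -1.0


def breakout_reward(bricks_destroyed: int, lost_ball: bool) -> float:
    """Breakout-style reward: positive for destroying bricks, negative for losing the ball."""
    if lost_ball:
        return -1.0
    return float(bricks_destroyed)  # e.g., +1 per brick destroyed at this step
```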
The reward function plays a crucial role in the reinforcement
learning process. The goal of the agent is to maximize the accumulated
rewards over time. By optimizing the reward function, we can guide the
agent to learn a policy that will achieve our desired goal. Without the
reward signal, the agent would not know what the goal is and would
not be able to learn effectively.
In summary, the reward signal is a key component of reinforcement
learning that incentivizes the agent to take actions that ultimately lead
to achieving the desired goal. By carefully designing the reward
function, we can guide the agent to learn an optimal policy.
Agent
In reinforcement learning, an agent is an entity that interacts with an
environment by making decisions based on the received state and
reward signal from the environment. The agent’s goal is to maximize its
cumulative reward in the long run. The agent must learn to make the
best decisions by trial and error, which involves exploring different
actions and observing the resulting rewards.
In addition to the external interactions with the environment, the
agent may also have an internal state that represents its knowledge about the
world. This internal state can include things like memory of past
experiences and learned strategies.
It’s important to distinguish the agent’s internal state from the
environment state. The environment state represents the current state
of the world that the agent is trying to influence through its actions. The
agent, however, has no direct control over the environment state. It can
only affect the environment state by taking actions and observing the
resulting changes in the environment. For example, if the agent is
playing a game, the environment state might include the current
positions of game pieces, while the agent’s internal state might include
the memory of past moves and the strategies it has learned.
In this book, we will typically use the term “state” to refer to the
environment state. However, it’s important to keep in mind the
distinction between the agent’s internal state and the environment
state. By understanding the role of the agent and its interactions with
the environment, we can better understand the principles behind
reinforcement learning algorithms. It is worth noting that the terms
“agent” and “algorithm” are frequently used interchangeably in this
book, particularly in later chapters.
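A minimal sketch of the agent-environment interaction loop is shown below, reusing the hypothetical GridWorldEnv sketched in the Environment section. The RandomAgent and its memory of past transitions are illustrative assumptions, not an algorithm from this book.

```python
import random


class RandomAgent:
    """A hypothetical agent that picks actions at random and keeps an internal memory."""

    def __init__(self, num_actions):
        self.num_actions = num_actions
        self.memory = []  # internal state: past (state, action, reward) transitions

    def act(self, state):
        return random.randrange(self.num_actions)

    def remember(self, state, action, reward):
        self.memory.append((state, action, reward))


# Assumes the hypothetical GridWorldEnv defined in the earlier sketch.
env = GridWorldEnv()
agent = RandomAgent(num_actions=2)

state = env.reset()
done = False
total_reward = 0.0
while not done:
    action = agent.act(state)
    next_state, reward, done = env.step(action)
    agent.remember(state, action, reward)
    total_reward += reward
    state = next_state

print("Episode return:", total_reward)
```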
Action
In reinforcement learning, the agent interacts with an environment by
selecting actions that affect the state of the environment. Actions are
chosen from a predefined set of possibilities, which are specific to each
problem. For example, in the game of Breakout, the agent can choose to
move the paddle to the left or right or take no action. It cannot perform
actions like jumping or rolling over. In contrast, in the game of Pong, the
agent can choose to move the paddle up or down but not left or right.
The chosen action affects the future state of the environment. The
agent’s current action may have long-term consequences, meaning that
it will affect the environment’s states and rewards for many future time
steps, not just the next immediate stage of the process.
Actions can be either discrete or continuous. In problems with
discrete actions, the set of possible actions is finite and well defined.
Examples of such problems include Atari and Go board games. In
contrast, problems with continuous actions have an infinite set of
possible actions, often within a continuous range of values. An example
of a problem with continuous actions is robotic control, where the
degree of angle movement of a robot arm is often a continuous action.
Reinforcement learning problems with discrete actions are
generally easier to solve than those with continuous actions. Therefore,
this book will focus on solving reinforcement learning problems with
discrete actions. However, many of the concepts and techniques
discussed in this book can be applied to problems with continuous
actions as well.
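To illustrate the difference, the snippet below defines one discrete and one continuous action space using the Gymnasium library (an assumption made here; the book may use a different toolkit). Discrete(3) could stand for Breakout's left, right, and no-op actions, while the Box space could stand for a continuous joint angle of a robot arm.

```python
import numpy as np
from gymnasium import spaces

# Discrete action space: 3 possible actions (e.g., move left, move right, no-op).
discrete_actions = spaces.Discrete(3)

# Continuous action space: one joint angle between -1.0 and 1.0 (illustrative range).
continuous_actions = spaces.Box(low=-1.0, high=1.0, shape=(1,), dtype=np.float32)

print(discrete_actions.sample())    # e.g., 2
print(continuous_actions.sample())  # e.g., [0.137]
```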
Policy
A policy is a key concept in reinforcement learning that defines the
behavior of an agent. In particular, it maps each possible state in the
environment to the probabilities of choosing different actions. By
specifying how the agent should behave, a policy guides the agent to
interact with its environment and maximize its cumulative reward. We
will delve into the details of policies and how they interact with the
MDP framework in Chap. 2.
For example, suppose an agent is navigating a grid-world
environment. A simple policy might dictate that the agent should
always move to the right until it reaches the goal location. Alternatively,
a more sophisticated policy could specify that the agent should choose
its actions based on its current position and the probabilities of moving
to different neighboring states.
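A policy can be as simple as a mapping from states to action probabilities. The sketch below shows the two grid-world policies described above; the state and action encodings are assumptions made for illustration.

```python
import random

ACTIONS = ["left", "right"]


def always_right_policy(state):
    """Deterministic policy: probability 1.0 of moving right in every state."""
    return {"left": 0.0, "right": 1.0}


def stochastic_policy(state):
    """Stochastic policy: action probabilities depend on the current position."""
    # Illustrative rule: the closer to the goal, the more likely to keep moving right.
    p_right = min(0.5 + 0.1 * state, 1.0)
    return {"left": 1.0 - p_right, "right": p_right}


def sample_action(policy, state):
    probs = policy(state)
    return random.choices(ACTIONS, weights=[probs[a] for a in ACTIONS])[0]


print(sample_action(stochastic_policy, state=2))
```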
Model
In reinforcement learning, a model refers to a mathematical description
of the dynamics function and reward function of the environment. The
dynamics function describes how the environment evolves from one
state to another, while the reward function specifies the reward that the
agent receives for taking certain actions in certain states.
In many cases, the agent does not have access to a perfect model of
the environment. This makes learning a good policy challenging, since
the agent must learn from experience how to interact with the
environment to maximize its reward. However, there are some cases
where a perfect model is available. For example, if the agent is playing a
game with fixed rules and known outcomes, the agent can use this
knowledge to select its actions strategically. We will explore this
scenario in detail in Chap. 2.
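For a small, fully known problem, a model can be represented as a simple table mapping each (state, action) pair to the next state and the reward. The sketch below is a hypothetical tabular model for the tiny grid world used in the earlier sketches, not a construction taken from the book.

```python
# A hypothetical tabular model: (state, action) -> (next_state, reward).
# Actions: 0 = move left, 1 = move right; state 4 is the goal.
model = {
    (0, 1): (1, 0.0),
    (1, 1): (2, 0.0),
    (2, 1): (3, 0.0),
    (3, 1): (4, 1.0),  # reaching the goal yields reward +1
    (1, 0): (0, 0.0),
    (2, 0): (1, 0.0),
    (3, 0): (2, 0.0),
}

next_state, reward = model[(3, 1)]  # the agent can "plan" by querying the model
```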
In reinforcement learning, the agent-environment boundary can be ambiguous. For example, although a house-cleaning robot may appear to be a single agent, the boundary of the agent is typically defined by what it directly controls, while the remaining components make up the environment. In this case, the robot’s wheels and other hardware are considered part of the environment, since they aren’t directly controlled by the agent. We can think of the robot as a complex system composed of several parts, such as hardware, software, and the reinforcement learning agent, which can control the robot’s movement by signaling the software interface, which then communicates with microchips to manage the wheel movement.
Autonomous Driving
Reinforcement learning can be used to train autonomous vehicles to
navigate complex and unpredictable environments. The goal for the
agent is to safely and efficiently drive the vehicle to a desired location
while adhering to traffic rules and regulations. The reward signal could
be a positive number for successful arrival at the destination within a
specified time frame and a negative number for any accidents or
violations of traffic rules. The environment state could contain
information about the vehicle’s location, velocity, and orientation, as
well as sensory data such as camera feeds and radar readings.
Additionally, the state could include the current traffic conditions and
weather, which would help the agent to make better decisions while
driving.
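As a rough illustration of how such a state and reward might be represented, the sketch below defines a hypothetical observation structure and a reward function with the sign conventions described above. All names, fields, and types are assumptions made here for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DrivingState:
    """A hypothetical state representation for the autonomous driving example."""
    location: List[float]       # e.g., GPS coordinates
    velocity: float             # meters per second
    orientation: float          # heading in radians
    camera_frames: List[bytes]  # raw sensory data (placeholder type)
    traffic_level: int          # e.g., 0 = light, 2 = heavy
    weather: str                # e.g., "rain"


def driving_reward(arrived_on_time: bool, had_accident: bool, violated_rules: bool) -> float:
    """Illustrative reward: positive for timely arrival, negative for accidents or violations."""
    if had_accident or violated_rules:
        return -1.0
    return 1.0 if arrived_on_time else 0.0
```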