Pain A Precision Signal for Reinforcement Learning and Control

This review proposes a reinforcement learning model of pain, arguing that pain functions as a motivational control signal to guide behavior and minimize harm. It highlights the complexity of pain processing in the brain, which is influenced by various cognitive and emotional factors, and addresses the challenges in understanding pain's precise role. The model suggests that pain is not merely a sensory experience but a sophisticated system that integrates learning and decision-making to optimize responses to potential harm.


Neuron

Review

Pain: A Precision Signal for Reinforcement Learning and Control
Ben Seymour1,2,*
1Center for Information and Neural Networks, National Institute of Information and Communications Technology, 1-4 Yamadaoka, Suita, Osaka 565-0871, Japan
2Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, UK

*Correspondence: [email protected]
https://doi.org/10.1016/j.neuron.2019.01.055

Since noxious stimulation usually leads to the perception of pain, pain has traditionally been considered sensory nociception. But its variability and sensitivity to a broad array of cognitive and motivational factors have meant it is commonly viewed as inherently imprecise and intangibly subjective. However, the core function of pain is motivational: to direct both short- and long-term behavior away from harm. Here, we show that a reinforcement learning model of pain offers a mechanistic understanding of how the brain supports this, illustrating the underlying computational architecture of the pain system. Importantly, it explains why pain is tuned by multiple factors and necessarily supported by a distributed network of brain regions, recasting pain as a precise and objectifiable control signal.

Despite the advent of brain imaging, a clear picture of how pain is processed in the brain has been much harder to unravel than anticipated, being beset by three problems. First, pain is associated with robust responses in multiple and diverse brain regions, most of which are not specific to pain (at least on a macroscopic scale), and so it has been hard to "pin down" the pain system to any specific brain region. Second, pain is an inherently private percept, but an individual's self-reports of pain can vary widely from moment to moment, and it has remained unclear whether this fluctuation represents irreducible noise and subjectivity or a precise tuning of pain based on hidden factors. Third, pain is exquisitely sensitive to a broad range of emotional, environmental, and cognitive factors, a phenomenon called endogenous modulation. Although this has led to an appreciation that pain is more than a simple readout of nociceptive input, it has not led to any satisfactory unified explanation as to what pain really is. This has left the view that pain is simply a highly variable and malleable representation of assumed actual or potential tissue damage.

In this review, we propose a model of pain that centralizes its role as a learning and control signal and argue that this can solve these problems. We begin with a perspective of how theories of pain have evolved over recent decades, and how insights have emerged that have moved thinking beyond purely sensory accounts of pain. We then argue that current accounts still don't fully capture how pain controls behavior to minimize harm, which is its primary function. Importantly, although this is often achieved by immediate nocifensive responses, a substantial part of this comes from learning, allowing an animal to mitigate or avoid predictable harm long into the future. The foundations of a learning account of pain are rooted in psychological models of animal learning, and we describe how these can be developed in computational terms to provide a mechanistic model of the architecture of the pain system. Critically, we argue that this requires pain to be shaped by a set of factors to optimize its role as a learning and control signal and review evidence that suggests that a great many examples of endogenous control can be explained by this process. Finally, we briefly describe how the model offers potential insights into how pain might become chronic under certain conditions.

Background

There is a long history of theories and constructs that have attempted to capture the complex phenomenology of pain, but a number of models have played a particularly important role in evolving current concepts of pain. Against a historically dominant view that pain could be understood as a sensory system like any other, Melzack and Casey (1968) highlighted what they called the "man-in-the-brain" problem that this seemed to create: the idea that the main function of the pain system was to inform some conscious module of the nature of a particular nociceptive stimulus. Instead, they proposed the tripartite model, in which sensory-discriminative, affective-motivational, and cognitive-evaluative components are processed as part-independent, part-interacting pathways (Melzack and Casey, 1968). In this model, rather than the control of protective behavior being merely downstream to sensory processing, they argued it was an intrinsic and fundamental part of pain experience, not least because pain was clearly sensitive to so many motivational and cognitive factors. A key substrate for modulation, namely descending pathways acting on dorsal horn neurons, had already been proposed in gate control theory (Melzack and Wall, 1965). And in the brain, the model implied that different dimensions would involve multiple different cerebral loci, a premonition of the distributed pattern of cortical and subcortical pain responses that was later revealed by functional neuroimaging (Jones et al., 1991; Treede et al., 1999).

In light of the neurophysiological characterization of many pain-specific receptors and ascending pathways, Craig subsequently proposed the homeostatic model, in which sensory and motivational dimensions are inherently integrated as a single system, involving pain-specific lateral thalamocortical projections to

Neuron 101, March 20, 2019 © 2019 Elsevier Inc. 1029



the insula cortex (Craig, 2002, 2003). Craig placed pain alongside other "interoceptive" sensations such as temperature, itch, and pleasant touch, as systems supporting bodily perception with intrinsic motivational value, with this value related to the core homeostatic drive to maintain the integrity of the body. This model proposed a hierarchical sensory processing stream from posterior to anterior regions of insula cortex, with this hierarchy explicitly tied to physiological and behavioral homeostasis. However, it was still largely unclear exactly how homeostatic behaviors were actually implemented, and why pain was modulated by so many factors.

The idea that, as a motivational system, pain was directly modulated as a decision by the system itself was most clearly articulated in Fields' motivation-decision model (Fields, 2006, 2018). The model proposed that pain was inhibited when overshadowed by more important reward- or escape-orientated goals, with descending control mediated by opioid pathways via the periaqueductal gray (PAG) and rostral ventromedial medulla (RVM) (Basbaum and Fields, 1984). The model also highlighted the fundamental role of pain in learning and the control of avoidance and escape, and this explicitly motivational perspective viewed pain as controlling not just immediate responses to limit tissue damage, but also long-term harm minimization through learned escape and avoidance (Johansen et al., 2001; Navratilova et al., 2012). This re-conceived pain as an inherently predictive system, not simply passively recording nociceptive inputs, in which the generation of pain predictions and expectations is central to the function of central pain circuits.

A central role for expectation and prediction also underlay the idea that pain involves a statistical (e.g., Bayesian) inference resulting from integration of prior expectancy and incoming nociceptive input (Brown et al., 2008; Seymour and Dolan, 2013; Morton et al., 2010; Anchisi and Zanon, 2015; Tabor et al., 2017; Ongaro and Kaptchuk, 2019). More formally, Büchel proposed the predictive coding model (Büchel et al., 2014), involving a hierarchical processing stream from spinal cord to PAG, thalamus, and posterior-to-anterior insula (Geuter et al., 2017; Grahl et al., 2018; Ozawa et al., 2017). In this context, expectations can be acquired through multiple means: instructed information, learning (i.e., conditioning), and observation (Wiech, 2016; Tabor et al., 2017). The model suggested that descending control might be implementing top-down predictions and their uncertainties, to be integrated with ascending nociceptive information and prediction errors. In so doing, this provided an explanation of a set of instances of endogenous control, especially expectancy-based biases and placebo and nocebo responses.

However, inferential theories of pain processing leave open an account of how the motivational function of pain is directed. At an abstract level, concepts such as Friston's Free Energy Framework propose that sensation, motivation, and action are intrinsically related by their drive to understand the causes of unexpected stimuli (Friston, 2010), and the notion of "active inference" describes how actions can be conceived to ultimately reflect minimization of future unexpected sensory instances of pain (Tabor and Burr, 2019). But understanding how the brain actually achieves this is much more complex (Pezzulo et al., 2015), and none of the existing models fully captures how the pain system successfully balances stimulus identification, information seeking, harm minimization, and, perhaps most critically, speed. Below, we outline a computational architecture of the pain system that may achieve this, based on a framework called reinforcement learning (RL). In so doing, this casts light on the fundamental question of what the conscious perception of pain really reflects: an optimal inference of a real or presumed nociceptive stimulus, or an optimal control signal to minimize current and future nociceptive stimuli?

The RL Model

Underlying the evolution of these concepts is the central idea that pain must be understood in the context of behavioral control to minimize current and future harm. Fundamental to this concept is learning: over and above the fact that pain elicits immediate defensive responses (withdrawal, orientation, etc.), it must also guide learning to optimize future responses. To illustrate this, consider a child touching a hot stove: although the immediate response limits the severity of any burn, the main benefit is through the sum of future instances when they don't touch stoves because they have a pain system. Therefore, what has driven the evolution of the architecture of the pain system is its role as a learning signal to prospectively reduce harm. But understanding how the brain achieves this exposes a fundamental problem that any experience-based control system must solve: the credit assignment problem (Bellman, 2013).

The Credit Assignment Problem

Harm minimization is both a clearly definable and objectively measurable function and is based on the ability to learn from trial-and-error interaction with the world. This allows actions that terminate (escape) or completely avoid pain and has been well studied in human and animal learning using Pavlovian (classical) and instrumental (operant) conditioning (Mackintosh, 1983). Most of these paradigms consider simple one-step escape or avoidance, in which pain is predicted by a single preceding cue, which subsequently elicits an appropriate response. However, real-world learning often involves much longer sequences of events, which makes the problem of prediction and avoidance more difficult. For instance, if a series of 5 actions leads to pain at the end, how do you know which of the 5 actions was the mistake (Figure 1A)? This problem is referred to as the credit assignment problem, which is a well-known problem in engineering and control theory (Bellman, 2013).

The credit assignment problem can be solved using a class of learning rule from the field of RL (Sutton and Barto, 1998). RL is effectively an extension of psychological learning models, such as the Rescorla-Wagner model (Rescorla and Wagner, 1972). The Rescorla-Wagner model is usually applied to one-step learning and uses a prediction error term (the difference between what was expected and what was observed) to update future predictions. But this doesn't work well if the outcome is far into the future. RL models, on the other hand, don't need to wait for the outcome and simply use the next available prediction (formalized as the value) as a proxy for the outcome. That is, they store a value term for each action and state and compute the difference between values at each successive time step, taking into account any reward and punishment experienced on the way (Sutton and Barto, 1998). Effectively, this allows pain predictions to


Figure 1. The Credit Assignment Problem and RL Framework


(A) RL provides an algorithmic framework for learning how to make optimal predictions and actions based on trial-and-error interaction with the world, in which
salient outcomes (reward and punishment) can be sensed. In particular, it aims to solve the problem of correctly allocating predictive value to preceding states, in
terms of the outcomes they eventually predict, when the transition through states of the world is either passive (not under the agent’s control, as in Pavlovian
learning) or active (determined by the agent’s actions, as in instrumental learning).
(B) This is achieved by using a prediction error term to update state or action values, which effectively transfer value predictions back in time to the earliest
predictor. Here, the prediction error is equal to the difference between the current prediction, and the sum of the subsequent prediction plus any outcome
experienced at this next state.
(C) The agent-environment interface illustrates the basic architecture comprising the agent (which learns values, computes prediction errors, and selects actions),
the internal environment (which represents states and outcomes), and the external environment (which contains the sensed objects).
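
The update described in (B) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's implementation: it assumes a passive (Pavlovian) chain of five states with pain delivered only after the last one, and the function name `td_chain`, the learning rate `alpha`, the discount factor `gamma`, and the coding of pain as an outcome of -1 are all choices made for the example.

```python
def td_chain(n_states=5, pain=-1.0, alpha=0.5, gamma=1.0, episodes=200):
    """Temporal-difference value learning on a chain of states that ends in pain.

    All names and parameter values here are illustrative assumptions.
    """
    # One value per state, plus a terminal state whose value stays at 0.
    values = [0.0] * (n_states + 1)
    for _ in range(episodes):
        for s in range(n_states):
            # Pain is experienced only on leaving the final state of the chain.
            outcome = pain if s == n_states - 1 else 0.0
            # Prediction error: (outcome + next prediction) - current prediction.
            delta = outcome + gamma * values[s + 1] - values[s]
            values[s] += alpha * delta
    return values[:n_states]


if __name__ == "__main__":
    for n in (1, 2, 5, 200):
        print(n, [round(v, 3) for v in td_chain(episodes=n)])
```

After a single pass, only the state immediately before the pain carries a negative value. With repeated passes the prediction moves back one state at a time, until the first state in the chain also predicts the outcome.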

be learned by looking at differences in predictions through time, which passes the prediction back to the earliest reliable predictor (Figure 1B). This error-based mechanism can be applied to passive predictions (learning state values) or active predictions (learning action values).

Figure 1C shows the basic architecture of RL control. By sensing the external environment, the organism's brain generates an internal representation of the current state (e.g., from visual information) and any salient outcomes (e.g., from nociception). This information is passed to an "agent" that decides what responses or actions to emit based on the current stored state and action values and then updates these values based on the next state (Sutton and Barto, 1981). In this architecture, "pain" is the internal reinforcement signal used for learning and is distinct from the nociceptive sensing process (in the same way that reward is distinct from the sensory properties of a reinforcer; Singh et al., 2009).

Since its initial demonstration (Seymour et al., 2004), substantial evidence has accumulated that pain controls behavior using an RL-based strategy, and that this involves a hierarchy of control processes. These are built on a basic system of innate responses that exists across species, and, working together, coordination of these control processes provides a highly effective way of minimizing future harm (Figure 2). Below, we outline the key aspects of each, and how they fit together to control pain behavior.

Innate Responses

Nociceptive stimulation produces a broad and diverse set of motor, autonomic, and behavioral defensive responses that are stimulus specific, situation specific, and species specific (Bolles, 1971; Fanselow and Lester, 1988). Innate responses are precise, sophisticated, and rapid, driving defensive activity within a few hundred milliseconds. They are also remarkably strong, overwhelming other ongoing behavioral activity (which, as we explain later, has critical implications for the organization of endogenous control mechanisms). These features are reflected in the corresponding neural substrates within a highly complex network of spinal and brainstem connections, including dorsal horn circuits governing motor responses, brainstem autonomic nuclei, and hypothalamic-PAG circuits driving basic behavioral programs (Fanselow, 1994; Craig et al., 1998).

Pavlovian Learning

Pavlovian learning allows innate responses to be activated in advance of a harmful stimulus, offering the chance to prepare for, reduce, or even completely avoid it (Mackintosh, 1983; Bolles, 1972). Any sensory cue that reliably precedes pain can act as a predictor (a "conditioned stimulus"), and it is known that acquisition of the response (the "conditioned response") depends on prediction errors. Importantly, evidence suggests that the brain learns higher-order pain prediction errors, allowing the prediction to be transferred back in time to the earliest reliable predictor, in accordance with an RL solution to the credit assignment problem (Seymour et al., 2004). Pavlovian pain responses can be divided into two categories: pain-specific responses, which tend to be well-timed motor responses thought to be mediated by cerebellar learning (e.g., leg flexion or eye blink to foot or eyelid shock, respectively), and non-specific responses common to many aversive stimuli (such as withdrawal


Figure 2. The RL Model of Pain


This schematic details the computational architecture of the pain system. Basic brain representations of pain receive ascending spinal nociceptive input and are
used to generate the internal reinforcement signal that is used for control. These feed into separate control systems that control behavior: (1) an innate pain
response system; (2) state-based, Pavlovian learning; (3) associative action-outcome learning (model-free habit learning); and (4) a cognitive (model-based)
action learning and planning system. Collectively, learning and responses/actions emerge through Pavlovian-instrumental (P-I) and model-based/model-free
interactions. Model simulation produces rapid, efficient harm avoidance, and outperforms single system models and conventional control systems for auton-
omous agents in terms of safe learning (Elfwing and Seymour, 2017). Endogenous control from these controllers reciprocally modulates the sensory pain
pathway.
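
One way to make the layering in this schematic concrete is a toy action selector, sketched below in Python. It is a hypothetical illustration rather than the model simulated by Elfwing and Seymour (2017): the function name `select_action`, the action labels, the reflex threshold, and the additive Pavlovian bias on the stored action values are all assumptions made for the example, and the model-based system (4) is left out entirely.

```python
def select_action(nociception, pavlovian_value, q_values,
                  reflex_threshold=0.8, pavlovian_weight=0.5):
    """Pick a response from a toy layered controller (illustrative only)."""
    # (1) Innate response: strong nociceptive input triggers withdrawal directly.
    if nociception >= reflex_threshold:
        return "withdraw"
    # (2) Pavlovian layer: a learned pain prediction (a negative state value)
    #     adds a bias toward the defensive response.
    bias = pavlovian_weight * max(0.0, -pavlovian_value)
    # (3) Model-free instrumental layer: stored action values, plus the bias.
    biased = dict(q_values)
    biased["withdraw"] = biased.get("withdraw", 0.0) + bias
    return max(biased, key=biased.get)


if __name__ == "__main__":
    q = {"withdraw": 0.0, "press_key": 0.2}
    print(select_action(0.9, 0.0, q))   # strong input: reflex fires
    print(select_action(0.1, 0.0, q))   # no pain predicted: instrumental choice
    print(select_action(0.1, -1.0, q))  # pain predicted: Pavlovian bias dominates
```

The additive bias is only one simple way to express a Pavlovian-instrumental interaction. It is used here because it keeps the three layers separable and easy to inspect.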

and autonomic arousal), which involve a coordinated subcortical network including the amygdala, ventral putamen, ventral PAG, VTA, and dorsal raphe (Groessl et al., 2018; Herry and Johansen, 2014; Zhang et al., 2016). These dissociable systems positively interact with each other (Betts et al., 1996; Pearce et al., 1981) and negatively interact with reward learning circuits (Seymour et al., 2005; Konorski, 1948). Pavlovian predictions are sensitive to uncertainty, which enhances the learning rate and controls autonomic responses such as skin conductance putatively through an amygdala-dependent process (Li et al., 2011; Boll et al., 2013; Zhang et al., 2016). Pavlovian values also generalize to perceptually and conceptually similar cues, allowing pain predictions to be made to novel stimuli in an efficient way (Onat and Büchel, 2015; Dunsmoor et al., 2011; Dunsmoor and Kroes, 2019; Koban et al., 2018).

Instrumental Learning

Whereas Pavlovian learning effectively deals with state learning (that is, conditioned responses generally prepare for but don't fundamentally change the probability of pain), instrumental learning allows novel actions to be learned according to their outcomes, and so fundamentally influence the probability of their occurrence (Mackintosh, 1983). For pain, this involves escape (from persistent pain) and avoidance (of phasic pain). Current evidence suggests that the brain employs a parallel system in which action values for relief and pain are simultaneously learned and interact to guide choice (Seymour et al., 2012; Eldar et al., 2016). Indeed, there is a specific advantage to learning these two values separately: relief values can approximate the best-case scenario of future actions ("what to do"), and pain values can approximate the worst-case scenarios ("what not to do"), and the two values can in principle be integrated together to guide action ("multi-attribute RL"). Learning these two values separately in this way conserves information and allows for safer behavior (Elfwing and Seymour, 2017).

Instrumental learning involves reciprocal interactions with the Pavlovian system. Pavlovian values and prediction errors guide both the learning (e.g., two-factor theory [Maia, 2010; Moutoussis et al., 2008]) and expression (e.g., conditioned reinforcement, conditioned suppression, Pavlovian-instrumental transfer [Seymour et al., 2005; Lawson et al., 2014; Talmi et al., 2008]) of


instrumental actions. Neurobiologically, escape and avoidance are implemented in similar circuits to Pavlovian learning and specifically involve posterior putamen and amygdala circuits (Menegas et al., 2018), with connections to ventromedial prefrontal regions encoding action values themselves (Seymour et al., 2012; Roy et al., 2014). Generalization of avoidance actions involves independent processes for generalization of pain-predictive (insula cortex) and relief-predictive (ventromedial prefrontal cortex [VMPFC]) action values (Norbury et al., 2018).

Cognitive Learning

In the systems described above, both Pavlovian and instrumental learning involve a process by which the brain learns a simple scalar value of a state or action and uses this value to guide responses and choices, respectively, in the absence of any internal model ("model-free learning"). However, humans clearly have enormous capacity to build much more sophisticated representations of pain, events, and contexts and use them to guide deliberative behavior. Under the umbrella term of cognitive learning or "model-based" learning, these cognitive representations can encode specific states, actions, and pain and formulate an internal model of the individual and their environment (Tolman, 1948; Dayan and Daw, 2008). Such an internal model can support explicit planning, evaluation (i.e., the ability to report a pain prediction or intensity judgment), instructed and observational learning, and episodic memory.

Computationally, by encoding an internal map of the world, cognitive learning can infer the presence of "hidden states" and the structure of abstract rules, including the decision policies of other intelligent agents (Behrens et al., 2018). Accordingly, it provides a model-based mechanism that subsumes both Pavlovian and instrumental learning (Gershman et al., 2015). That is, cognitive processes are likely to be routinely involved in simple Pavlovian and instrumental tasks, allowing a more sophisticated representation of task structure than is possible using simpler "model-free" learning algorithms (Rescorla, 1988). Naturally, however, defining precisely what algorithms are used in cognitive learning is much more difficult, simply due to the potential for complexity (Daw and Dayan, 2014). In the case of Pavlovian conditioning, however, Bayesian models can be shown to explain several aspects of learning difficult to explain using simpler RL algorithms (Courville et al., 2006). More generally, the notion of model-free and model-based control across state and action learning scenarios is embedded in a long literature of emotional and habitual behavior, versus deliberative cognition (Daw, 2018). However, the reality may be more complex, and, at least in the case of reward, there is evidence that the brain uses intermediate computational strategies that have features of both model-free and model-based learning (Momennejad et al., 2017).

Neurobiologically, cognitive learning and decision-making structures for pain are likely to include multiple regions of prefrontal cortex, including ventrolateral and dorsolateral PFC, anterior cingulate, and hippocampal regions (Atlas et al., 2016; Olsson and Phelps, 2007; Jeon et al., 2010; Carter et al., 2006; LeDoux and Daw, 2018; Qi et al., 2018). Importantly, there is evidently overlap in many brain regions identified in cognitive learning and simpler learning schemes, especially regions such as the amygdala and striatum (Madarasz et al., 2016; Koban et al., 2017).

Conscious Pain Perception and Interactions between Controllers

In the RL framework, the "pain" signal acts as the primary teaching signal that drives the learning of values. This raises the question as to whether this reflects the conscious perception of pain or exists as a distinct, subconscious entity. The multi-controller architecture does not necessarily mean that each controller uses precisely the same reinforcement signal, and, because innate defense responses need to be rapid, they are inextricably linked to nociceptive input from the very earliest parts of the ascending pathway. However, slower and computationally sophisticated sensory processing of nociceptive signals can be used by the higher cognitive controller, which is clearly directly associated with conscious processing (i.e., working memory, explicit reasoning, and planning). Furthermore, it must be the case that conscious pain can exert a direct effect on cognitive learning and control, because, even if the conscious perception of pain emerged as some sort of epiphenomenon of neural processing, its perception would still lead to avoidance of this state in the future, given the choice. For this reason, at the very least conscious pain must act as a control signal related to cognitive learning systems.

More broadly, cognitive processes are built atop a hierarchy that involves multiple facilitative interactions between layers. Although such an architecture seems complex, computationally it is highly efficient (Piccolo et al., 2018; Lee et al., 2019). For instance, exposure to an unexpected pain stimulus recruits the innate and Pavlovian systems first, to provide rapid, safe defense using evolutionarily learned information. This sets the "action priors" upon which a cognitive system can evaluate the causes of pain based on building a model of what happened, allowing rapid learning and inference of optimal defensive responses. When these responses can lead to the reliable avoidance of pain, the basic (model-free) instrumental system takes over control, which provides computational efficiency and can help protect against unnecessary influence by random noise (Wang et al., 2018; Daw et al., 2005). Overall, each system is optimal in a given situation, and as a whole the architecture balances speed and computational efficiency at one end, with computational sophistication at the other. The caveat, however, may be a necessary susceptibility to impulsiveness and compulsiveness, mostly due to the strength of innate and Pavlovian systems (Lloyd and Dayan, 2018; Millner et al., 2018; Robbins et al., 2012).

Endogenous Control

The RL framework shows how potential harm can be minimized through a nested hierarchy of controllers. This raises the question as to whether we should expect pain to be a fixed, stable signal that faithfully represents the nociceptive signal, or a flexible signal that adapts to the current learning context. As we discuss below, the RL architecture indeed indicates that pain should be modulated by a number of factors if it is to operate optimally as a control signal.

Modulation by Sensory Inference

The available evidence suggests that the RL control hierarchy has a corresponding sensory processing hierarchy, with crude spinal and brainstem nociceptive input feeding into lower


controllers, and conscious pain feeding into higher controllers at the top (Figure 2). The function of sensory processing hierarchies is to allow the optimal estimate of the properties (such as the intensity) of the external stimulus (Seymour and Dolan, 2013; Büchel et al., 2014; Tabor et al., 2017). This estimate is ultimately an inference made based on prior experience and other relevant information, including simple predictive contingencies, multisensory integration, instructed knowledge, or observed knowledge. Based on the assumption that the incoming nociceptive input is inherently noisy, inference will improve the estimate of the true intensity. Computationally, sensory inference is typically proposed to approximate some sort of Bayesian inference (Colombo and Seriès, 2012; Knill and Pouget, 2004) and can in principle explain why pain perception is routinely biased toward prior knowledge (Colloca and Benedetti, 2006; Anchisi and Zanon, 2015; Seymour and Dolan, 2013; Wiech, 2016; Atlas et al., 2010). And it is consistent with the observation that the magnitude of this bias depends on the certainty of the prior information (Brown et al., 2008; Yoshida et al., 2013). The inferred estimate may also be asymmetrically weighted by the cost of errors (i.e., under-estimating pain may be more costly than over-estimating [Rachman and Arntz, 1991]). When errors do occur (for instance, when the discrepancy between prior and incoming nociceptive information reaches a threshold), then the information within the prior may need to be relearned, weakening its capability to bias future pain due to its increased uncertainty, but enhancing the way the information is used in the long run (Yu and Dayan, 2005; Hird et al., 2018). That said, evidence suggests that in the case of pain, disconfirming sensory evidence may be relatively under-weighted (Jepma et al., 2018). Overall, however, modulation during perceptual processing creates a more accurate pain signal that is available to higher (cognitive) RL control.

Modulation by Predictive Value

A key feature of the RL model is that it deals with sequential stimuli, predicting outcomes both near and far in the future. This accommodates the fact that pain can act as both a

that two processes are at work: temporal discounting and dread. Temporal discounting is a well-supported assumption in most RL models by which people discount future events as a function of distance into the future (Sutton and Barto, 1998; Frederick et al., 2002). But people also find the process of anticipating pain inherently aversive in its own right (a phenomenon called dread), which often causes them to choose sooner over distant pain (e.g., for a necessary dental procedure, people might want to "get it out of the way" sooner so they don't need to worry about it) (Loewenstein, 1987; Berns et al., 2006). Hence, the net behavior is the combination of a dread function with a discount function, which leads to an "n"-shaped prospective function in which predictions of future pain have an intermediate peak aversive latency (Story et al., 2013).

It is also possible for temporal patterns of pain to act as predictors. In a classic example, decreasing pain is felt as less aversive than increasing pain (the so-called "peak-end" effect) (Kahneman et al., 1993, 1997). This implies that either temporal constructs can act as cues for associative learning, or that people build a more sophisticated internal model (memory) of the episode to support prediction (Fiser et al., 2010).

Modulation by Decision Conflict

The RL model mediates a broad array of responses and actions, from autonomic and physiological responses, reflexive motor responses (e.g., limb withdrawal), innate behavioral programs (freeze, fight, or flight), communicative responses (facial expression and vocalizations), and any type of instrumental motor action (such as pressing a keyboard in a pain experiment). Within this set, emission of some types of response can occur independently of others: for instance, pressing a key will not interfere with a heart rate response or facial expression. But other types of response will: for instance, innate motor responses may well interfere with instrumental actions that might be more important, such as escape from danger or acquisition of a large reward that outweighs the magnitude of the pain (Maier et al., 1982; Fields, 2006, 2018; Dayan et al., 2006). The problem the pain system
reinforcer (i.e., an outcome or an unconditioned stimulus, US) has in managing this decision conflict is that because the innate
and a cue for other motivationally salient outcomes (a predictor, responses are relatively hard-wired early in the ascending pain
or conditioned stimulus, CS). That is, pain can predict its own pathway, which is necessary to elicit rapid responses, the only
termination, explicit reward, or more pain (Gerber et al., 2014; way to suppress innate responses is to suppress nociceptive af-
Fields, 2018; Navratilova et al., 2015)—indeed, anything that im- ferents when or soon after they enter the dorsal horn of the spinal
proves or worsens homeostasis (Keramati and Gutkin, 2014). cord. That is, it may not be feasible to selectively modulate innate
This means that the aversiveness of pain incorporates two quan- responses without modulating ascending pain signals at the
tities—its inherent (aversive) value as an outcome and its value same time. This means that when instrumental decision circuits
as a predictor. For instance, in Pavlovian counter-conditioning, prioritize reward seeking over pain avoidance or escape, or
a subject might learn to predict a food reward following a pain when active instrumental avoidance or escape involves actions
stimulus: after training, the pain stimulus elicits no observable different from innate avoidance or escape, then pain is endoge-
aversive response, only appetitive (positive) responses antici- nously reduced (Dum and Herz, 1984).
pating food (Eroféeva, 1921). Here, it is clear that the modulation Decision conflict may also operate at the level of cognition,
of pain must be at an early level for defensive motor responses since pain inherently drives attention, learning, and planning
to be lost, but also that the pain stimulus continues to act as (Legrain et al., 2009; Van Damme et al., 2010). Thus, in the
a discriminative stimulus despite being modulated, implying face of a competing and more significant goal (i.e., escape or
that, at some level, modulation of pain must have a degree of large reward), pain may interfere with and disrupt more important
selectivity to preserve discriminative information (Melzack and cognitive processes (Eccleston and Crombez, 1999). Just as it is
Casey, 1968). the case that two physical actions may be incompatible (such as
Pain can also be a predictor for more pain, which will enhance simultaneously moving in two directions), two ‘‘mental’’ actions
its aversiveness and raises the issue of how the duration to the may also be incompatible (such as planning to simultaneously
next pain influences its aversive valuation. Evidence indicates move in two different directions) (Brown et al., 2016). Thus,

Box 1. Key Predictions of the RL Model

• Opposite effects of controllability and uncertainty on phasic and tonic pain. The model proposes that uncertainty and controllability relate to a greater marginal benefit to learning, and phasic pain should be enhanced accordingly. However, in the case of tonic pain, its relief acts as the teaching signal, and ongoing pain has a direct suppressive effect on cognition, and so ongoing pain should be reduced to enhance relief learning. Existing support for this is mixed (Yoshida et al., 2013; Brown et al., 2008; Zaman et al., 2017; Bräscher et al., 2016; Wiech et al., 2006; Salomons et al., 2007; Zhang et al., 2018a), partly because uncertainty is not always studied in the context of controllability, and increased controllability often results in reduced uncertainty, leaving this issue to be fully demonstrated.
• Endogenous control should drive exploration. If a core function of endogenous control is to facilitate information acquisition, this should be reflected in choice, and manifest by a direct relationship between pain modulation and exploratory action. Furthermore, this should also be sensitive to the expected benefit of this information to future behavior—i.e., a greater opportunity to exploit information relates to greater endogenous control (cf. Wilson et al., 2014).
• Pain discrimination is preserved during endogenous analgesia. If pain can act as both a cue and an outcome, it is important that the capacity to discriminate pain remains unimpaired in the context of endogenous control. Although evidence suggests that descending modulatory pathways have a selective end target in the dorsal horn (Heinricher et al., 2009), there is yet no behavioral evidence, for instance, that fine discrimination (e.g., spatial or intensity) is preserved in endogenous analgesia.
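
The first prediction in Box 1 can be made concrete with a toy calculation. The sketch below is illustrative only and is not taken from the review. It assumes that uncertainty, opportunity, and controllability are each scaled to [0, 1], that they interact multiplicatively, and that endogenous modulation acts as a linear gain on nociceptive input. The names `LearningContext`, `informational_value`, `modulate_phasic`, and `modulate_tonic`, and the `gain` parameter, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LearningContext:
    """Hypothetical summary of how much there is to gain from learning."""
    uncertainty: float      # how much there is to learn, in [0, 1]
    opportunity: float      # how long the information can be exploited, in [0, 1]
    controllability: float  # capability to act on the information, in [0, 1]

    def __post_init__(self) -> None:
        for name in ("uncertainty", "opportunity", "controllability"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must lie in [0, 1], got {value}")


def informational_value(ctx: LearningContext) -> float:
    # Assumed multiplicative interaction: the benefit of learning vanishes
    # if any one factor is absent.
    return ctx.uncertainty * ctx.opportunity * ctx.controllability


def modulate_phasic(nociceptive_input: float, ctx: LearningContext,
                    gain: float = 0.5) -> float:
    # Phasic pain is the teaching signal, so it is enhanced when learning pays.
    return nociceptive_input * (1.0 + gain * informational_value(ctx))


def modulate_tonic(nociceptive_input: float, ctx: LearningContext,
                   gain: float = 0.5) -> float:
    # For tonic pain the teaching signal is relief, so background pain is
    # reduced when learning about relief pays.
    return nociceptive_input * (1.0 - gain * informational_value(ctx))
```

Under these assumptions, the same nociceptive input is amplified for phasic pain and attenuated for tonic pain when all three factors are high, and is left unchanged when controllability is zero. A multiplicative form is only one way to express the interaction that the model predicts. An experiment would need to distinguish it from additive alternatives.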

decision conflict may invoke endogenous control at both the level of action and cognition.

Modulation by Informational Value
The fact that attention and controllability reliably modulate pain (Eccleston and Crombez, 1999; Wiech et al., 2006; Yoshida et al., 2013; Salomons et al., 2007, 2015; Taylor et al., 2017; Bräscher et al., 2016), beyond that which can be explained by the mechanisms above, suggests that factors intrinsic to learning and control specifically modulate pain. Although the goal of RL is to learn to minimize pain as an objective function, performance can be enhanced by intrinsically modulating pain according to its informational value in learning. This is because the prospective benefit of learning is not a fixed quantity but varies according to how much there is to learn (uncertainty), how long there is to exploit learnable information (opportunity), and the capability to exploit it (controllability). In the case of reward, such intrinsic modulation of decision value is well recognized (for instance, in novelty seeking and uncertainty seeking) and helps solve the exploration-exploitation problem of trial-and-error learning (i.e., information sampling [Wilson et al., 2014; Wittmann et al., 2008]). In the case of pain, therefore, the magnitude of a phasic pain stimulus should be enhanced if uncertainty, opportunity, and controllability are high, because the marginal benefit of learning is higher. More precisely, the model predicts these factors should interact, because the benefit of learning is only manifest if the opportunity and controllability are both significant (Zhang et al., 2018b) (see Box 1). In the case of learning relief from tonic pain, the opposite effect should occur (i.e., background pain should be reduced if the benefit to learning about relief is high), because the object of learning is relief, not pain (and persistent pain exerts a tonic control effect on behavior, as we discuss below). This appears to be the case: background tonic pain is reduced when relief uncertainty is high and relief is controllable (Zhang et al., 2018a).

Across these demonstrations, uncertainty-based modulation reflects the mechanism underlying what is conventionally considered attention or salience (Eccleston and Crombez, 1999). This spans attention that is driven by bottom-up processes learned through trial and error (i.e., lots of errors equate to high uncertainty), or top-down processes provided by external cues or instruction. In all cases, the effect is to enhance learning and guide choice in a way that benefits long-run prospects. Thus, any relatively unexpected change in persistent or repetitive pain will have a modulatory effect: reductions in persistent nociceptive stimulation will cause an exaggerated reduction in pain perception, and increases in nociceptive stimulation will cause an exaggerated increase in perceived pain. These effects are well recognized in studies of relative valuation of pain (Winston et al., 2014; Vlaev et al., 2009) and offset hypoalgesia and onset hyperalgesia (Sprenger et al., 2018; Grill and Coghill, 2002; Yelle et al., 2009).

Neural Implementation of Endogenous Analgesia
The primary effector pathway for endogenous control (both hypo- and hyperalgesia) is known to involve descending control via the PAG to rostral ventral medulla, to the dorsal horn of the spinal cord (Heinricher et al., 2009). What has been harder to ascertain is which higher brain sites instruct this pathway, and where and how the amount of descending control is computed. One of the difficulties is that many classic paradigms of endogenous control may actually involve several distinct mechanisms, so it is difficult to relate computational mechanisms to specific neural loci without a considerable degree of uncertainty. For instance, placebo analgesia can involve all four of the above mechanisms. However, several cortical regions seem to play a key role, including regions of anterior cingulate cortex, dorsolateral prefrontal cortex, and insula (Wiech, 2016; Tracey, 2010). More specifically, the pregenual anterior cingulate cortex (pgACC) has emerged as the most consistently implicated cortical region in human endogenous control paradigms, including in placebo and expectancy hypoalgesia (Wager et al., 2004; Bingel et al., 2006; Eippert et al., 2009), uncertainty-based analgesia (Zhang et al., 2018a, 2018b), controllability (Salomons et al., 2007, 2015), habituation (Bingel et al., 2007), stress-induced analgesia (Yilmaz et al., 2010), and even analgesia induced by motor cortex stimulation (Peyron et al., 2007). The pgACC is highly opioid rich and sits within an anatomical network with connections to PAG and other subcortical regions involved in pain and learning, including amygdala, VMPFC, hippocampus,

lateral orbitofrontal cortex (OFC), and PFC (Margulies et al., 2007; Vogt, 2005). These sites are central to pain and reward learning and directly implicated in control by decision conflict (Fields, 2018), sensory inference (Büchel et al., 2014), and value learning (Craig, 2003; Seymour et al., 2004; Ploghaus et al., 1999). Furthermore, the pgACC is also implicated in both rodent models and human clinical cases of chronic pain (Qu et al., 2011; Segerdahl et al., 2018; Mano et al., 2018).

What has been less clear is the specificity of modulation of ascending pathways in the dorsal horn. An inherent paradox of endogenous control is that it risks degrading the information that pain carries as a predictive stimulus. Hence, it is likely that at least some aspects of discriminative information must be selectively preserved in endogenous control in the ascending pathways (Box 1). Indeed, psychologically, preservation of discriminative perception accompanying analgesia with opioids and cingulotomy is well described (Melzack and Casey, 1968) and forms the basis of a conventional notion of dissociability of affective and discriminative pain processing in putative “medial and lateral pain systems,” respectively (Vogt and Sikes, 2000; Corder et al., 2019). Recent evidence indicates that this selectivity may be mediated by preferential control of C-fibers over A-delta fibers in the dorsal horn (Heinricher et al., 2009), which fits with the notion that the A-delta fiber pathways carry more refined discriminative information.

Translation to Chronic Pain
Tonic or persistent pain after injury serves several physiological functions. First, it has a direct effect on mood and cognition, encouraging rest and recuperation by reducing motivation to engage in non-essential reward-guided activity, which is less important for homeostatic priorities. Second, it represents a state from which reduction or cessation of pain becomes a new motivational goal and hence frames relief as an objective function for appetitive RL. Third, when accompanied by hyperalgesia and allodynia, it sensitizes otherwise less or non-noxious stimuli to drive pain learning, which is clearly adaptive given that the area of injury may be more prone to further injury than normal.

This raises the question as to why physiologically persistent pain outlives its usefulness to become pathologically persistent pain in some individuals. Clearly, chronic pain is heterogeneous, and many forms of chronic pain could simply reflect a normal brain response to increased nociceptive input at peripheral or spinal levels. More commonly, however, it is likely that peripheral and central factors interact to generate the chronic pain state: persistent nociceptive signals are further amplified and maintained by aberrant brain processes.

The RL model of pain illustrates many specific computational mechanisms that could hypothetically contribute to this process. These could be perceptual, with persistent pain reflecting an inference about the state of injury that is biased by an excessive and irrefutable belief that injury is present (a sort of “self-fulfilling prophecy” [Jepma et al., 2018]). Or they could be motivational, such as excessive aversive valuation, elevated or asymmetrical aversive learning rates, over-generalization, loss of any component of endogenous control, reduced extinction of movement-related fear, excessive dread, aberrant sequential learning, and so on. This may have either a direct effect on pain perception or an indirect effect: in the fear-avoidance model of musculoskeletal pain, excessive fear learning of movement leads to inactivity, which itself causes increased tissue injury through secondary means (Vlaeyen and Linton, 2000).

From a computational perspective, therefore, it is likely that individual factors (i.e., the parameters of the RL model) act as individual risk factors for chronic pain, which subsequently interact to generate the chronic pain phenotype given appropriate external events (such as a precipitating tissue injury). This would define a “computome” of chronic pain risk and illustrates many ways in which some of the individual factors might be shared with coincident psychiatric conditions, such as depression and anxiety (Figure 3). It also offers a computational framework to start to address some of the neurobiological differences seen in RL-linked circuits in chronic pain patients, including VMPFC and nucleus accumbens (Baliki et al., 2010, 2012; Mano et al., 2018). However, given the complexity of the RL model (in terms of its architecture and large number of parameters), it strongly calls for simulation methods to help predict how different factors might conspire together to generate chronic pain risk (Seymour and Lee, 2019).

Conclusions
Both theory and evidence point to a view of pain as a precision signal that guides prospective behavior to minimize harm through learning. The pain system has been shaped through evolution by the complexity and diversity of actual threats in the natural world, but, in particular, it has faced four problems that have had a dominant impact on its architecture. The first is how to learn about harm both near and far into the future (the credit assignment problem), which is solved by the predictive value learning system defined by RL. The second is how to balance speed of response with processing sophistication (a type of speed-accuracy dilemma), which is solved by having a nested hierarchical architecture that spans rapid reflexes to internal models, and whose levels interact through endogenous control. The third is how to balance information acquisition about threat with the concurrent need to avoid it (the information sampling dilemma), which is solved by endogenous fine-tuning of pain to maximize its value as a learning signal. And the fourth is how to suppress pain when needed, while not suppressing the information it carries, which is solved by having dissociable discriminative and affective subcomponents of pain. Overall, the RL model of the pain system illustrates computationally how these solutions are implemented in the brain, and how this drives safe, efficient, rapid, and effective pain behavior.

The model also offers insight into the three broad issues in pain neuroscience raised in the introduction. From the perspective of the representation of pain in the brain (the “pain matrix”), it is clear that pain is constructed not only from nociceptive input, but also from a set of cortical and subcortical components that compute the effective magnitude of pain as a control signal. This implies that subjective pain will be best estimated from responses in multiple regions, which is consistent with general network and connectivity (Kucyi and Davis, 2015) and multivariate “signatures” of pain (Wager et al., 2013; Woo et al., 2015;

Marquand et al., 2010; Zunhammer et al., 2018) but goes beyond these by highlighting the importance of understanding exactly what each of the nodes in the pain network does (i.e., pain as a computational network). However, it is also consistent with cortical specificity models, because as long as endogenous modulation is primarily descending, there should still be a restricted cortical response that primarily reflects pain aversive intensity after modulation (for example, in posterior insula or mid-anterior cingulate cortex [Segerdahl et al., 2015; Kragel et al., 2018; Craig and Craig, 2009]). In other words, although there is always the most information available from a broad set of brain regions (involving processes that are not individually unique to pain), it must also be the case that unique and fundamental representations of discriminative features and pain value are bound together to yield the unique subjective experience of pain. However, key questions remain, and perhaps the most important is knowing where in the brain internal representations of pain used for cognitive planning and control are coded (i.e., where is the “cognitive map” of pain?).

From the perspective of the subjectivity of pain, the RL model challenges the primacy of self-report. This is because at a fundamental level pain concerns control, and so control behavior should serve as the ultimate measure of pain. Irrespective of the problems associated with self-report scales (Stewart et al., 2005), pain leads to a broad set of learning and control behaviors that can be objectively measured, and the conscious perception of pain merely serves these functions. This speaks to Melzack and Casey’s man-in-the-brain problem—pain perception is merely a link (albeit a critical one) in a self-organizing control circuit, rather than a terminal node that informs an elusive higher controller. On this basis, the RL model’s prediction is that all aspects of endogenous control should be reflected in subsequent choice behavior, and this remains an important prediction for future studies (see Box 1).

Finally, the model yields a concept of pain as a signal that is tuned precisely to its function as a control signal. The concept that pain is modulated presupposes that pain is a sensory nociceptive signal whose primary role is to retrospectively estimate the objective intensity of a stimulus and then needs to be tuned to support whatever behavior is required at the time, i.e., the view of modulation of pain as post-perceptual processing. But as a prospective control signal, pain behaves in precisely the way it needs, and hence should not be considered to be modulated at all. Although the factors that lead to the tuning of pain may be difficult to experimentally observe and objectify (such as complex social information about forthcoming pain), this doesn’t mean that the brain doesn’t estimate and represent these quantities in a precise manner. Ultimately, endogenous control illustrates how pain is constructed on a moment-by-moment basis from a complex but objectively definable integration of broad sources of information.

Figure 3. The Hypothetical Chronic Pain Computome
The RL model involves a large set of parameters that determine the way in which learning and decision making operate within the control framework. Several of these have been proposed to be involved with not only chronic pain, but also other disorders associated with aversive learning, including anxiety and depression. The figure shows a schematic of how a series of hypothetical factors might operate together to create an overall risk for chronic pain, given an appropriate peripheral drive from an injury, and these factors might have different roles at different points in the pain chronification process. Importantly, the way these factors interact is determined by the complexity of the computational (RL) model, which defines the information-processing operations they govern, which means predicting how different combinations determine risk may require advanced simulation platforms (Seymour and Lee, 2019).

ACKNOWLEDGMENTS

We thank Josh Johansen, Howard Fields, Jayne Pickering, Flavia Mancini, and Tim Salomons for discussions and critical comments on the manuscript. We acknowledge funding from the National Institute of Information and Communications Technology (Japan), The Wellcome Trust (097490), and Arthritis Research UK (Versus Arthritis: 21357, 21192).

REFERENCES

Anchisi, D., and Zanon, M. (2015). A Bayesian perspective on sensory and cognitive integration in pain perception and placebo analgesia. PloS One 10, e0117270.

Atlas, L.Y., Bolger, N., Lindquist, M.A., and Wager, T.D. (2010). Brain mediators of predictive cue effects on perceived pain. J. Neurosci. 30, 12964–12977.

Atlas, L.Y., Doll, B.B., Li, J., Daw, N.D., and Phelps, E.A. (2016). Instructed knowledge shapes feedback-driven aversive learning in striatum and orbitofrontal cortex, but not the amygdala. eLife 5, 5.

Baliki, M.N., Geha, P.Y., Fields, H.L., and Apkarian, A.V. (2010). Predicting value of pain and analgesia: nucleus accumbens response to noxious stimuli changes in the presence of chronic pain. Neuron 66, 149–160.

Baliki, M.N., Petre, B., Torbey, S., Herrmann, K.M., Huang, L., Schnitzer, T.J., Fields, H.L., and Apkarian, A.V. (2012). Corticostriatal functional connectivity predicts transition to chronic back pain. Nat. Neurosci. 15, 1117–1119.

Basbaum, A.I., and Fields, H.L. (1984). Endogenous pain control systems: brainstem spinal pathways and endorphin circuitry. Annu. Rev. Neurosci. 7, 309–338.

Behrens, T.E.J., Muller, T.H., Whittington, J.C.R., Mark, S., Baram, A.B., Stachenfeld, K.L., and Kurth-Nelson, Z. (2018). What is a cognitive map? Organising knowledge for flexible behaviour. Neuron 100, 490–509.

Bellman, R. (2013). Dynamic Programming (Courier Corporation).

Berns, G.S., Chappelow, J., Cekic, M., Zink, C.F., Pagnoni, G., and Martin-Skurski, M.E. (2006). Neurobiological substrates of dread. Science 312, 754–758.

Betts, S.L., Brandon, S.E., and Wagner, A.R. (1996). Dissociation of the blocking of conditioned eyeblink and conditioned fear following a shift in US locus. Anim. Learn. Behav. 24, 459–470.

Bingel, U., Lorenz, J., Schoell, E., Weiller, C., and Büchel, C. (2006). Mechanisms of placebo analgesia: rACC recruitment of a subcortical antinociceptive network. Pain 120, 8–15.

Bingel, U., Schoell, E., Herken, W., Büchel, C., and May, A. (2007). Habituation to painful stimulation involves the antinociceptive system. Pain 131, 21–30.

Boll, S., Gamer, M., Gluth, S., Finsterbusch, J., and Büchel, C. (2013). Separate amygdala subregions signal surprise and predictiveness during associative fear learning in humans. Eur. J. Neurosci. 37, 758–767.

Bolles, R.C. (1971). Species-specific defense reactions. In Aversive Conditioning and Learning, R.F. Brush, ed. (Elsevier), pp. 183–233.

Bolles, R.C. (1972). The avoidance learning problem. Psychol. Learn. Motiv. 6, 97–145.

Bräscher, A.-K., Becker, S., Hoeppli, M.E., and Schweinhardt, P. (2016). Different brain circuitries mediating controllable and uncontrollable pain. J. Neurosci. 36, 5013–5025.

Brown, C.A., Seymour, B., Boyle, Y., El-Deredy, W., and Jones, A.K. (2008). Modulation of pain ratings by expectation and uncertainty: Behavioral characteristics and anticipatory neural correlates. Pain 135, 240–250.

Brown, T.I., Carr, V.A., LaRocque, K.F., Favila, S.E., Gordon, A.M., Bowles, B., Bailenson, J.N., and Wagner, A.D. (2016). Prospective representation of navigational goals in the human hippocampus. Science 352, 1323–1326.

Büchel, C., Geuter, S., Sprenger, C., and Eippert, F. (2014). Placebo analgesia: a predictive coding perspective. Neuron 81, 1223–1239.

Carter, R.M., O’Doherty, J.P., Seymour, B., Koch, C., and Dolan, R.J. (2006). Contingency awareness in human aversive conditioning involves the middle frontal gyrus. Neuroimage 29, 1007–1012.

Colloca, L., and Benedetti, F. (2006). How prior experience shapes placebo analgesia. Pain 124, 126–133.

Colombo, M., and Seriès, P. (2012). Bayes in the brain—on Bayesian modelling in neuroscience. Br. J. Philos. Sci. 63, 697–723.

Corder, G., Ahanonu, B., Grewe, B.F., Wang, D., Schnitzer, M.J., and Scherrer, G. (2019). An amygdalar neural ensemble that encodes the unpleasantness of pain. Science 363, 276–281.

Courville, A.C., Daw, N.D., and Touretzky, D.S. (2006). Bayesian theories of conditioning in a changing world. Trends Cogn. Sci. 10, 294–300.

Craig, A.D. (2002). How do you feel? Interoception: the sense of the physiological condition of the body. Nat. Rev. Neurosci. 3, 655.

Craig, A.D. (2003). A new view of pain as a homeostatic emotion. Trends Neurosci. 26, 303–307.

Craig, A.D., and Craig, A. (2009). How do you feel–now? The anterior insula and human awareness. Nat. Rev. Neurosci. 10, 59–70.

Craig, A., Bowsher, D., Tasker, R.R., Lenz, F., Dougherty, P.M., and Wiesenfeld-Hallin, Z. (1998). A new version of the thalamic disinhibition hypothesis of central pain. Pain Forum 7, 1–28.

Daw, N.D. (2018). Are we of two minds? Nat. Neurosci. 21, 1497.

Daw, N.D., and Dayan, P. (2014). The algorithmic anatomy of model-based evaluation. Phil. Trans. R. Soc. B: Biol. Sci. 369, 1655.

Daw, N.D., Niv, Y., and Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nat. Neurosci. 8, 1704.

Dayan, P., and Daw, N.D. (2008). Decision theory, reinforcement learning, and the brain. Cogn. Affect. Behav. Neurosci. 8, 429–453.

Dayan, P., Niv, Y., Seymour, B., and Daw, N.D. (2006). The misbehavior of value and the discipline of the will. Neural Netw. 19, 1153–1160.

Dum, J., and Herz, A. (1984). Endorphinergic modulation of neural reward systems indicated by behavioral changes. Pharmacol. Biochem. Behav. 21, 259–266.

Dunsmoor, J.E., and Kroes, M.C. (2019). Episodic memory and Pavlovian conditioning: ships passing in the night. Curr. Opin. Behav. Sci. 26, 32–39.

Dunsmoor, J.E., White, A.J., and LaBar, K.S. (2011). Conceptual similarity promotes generalization of higher order fear learning. Learn. Mem. 18, 156–160.

Eccleston, C., and Crombez, G. (1999). Pain demands attention: A cognitive-affective model of the interruptive function of pain. Psychol. Bull. 125, 356.

Eippert, F., Bingel, U., Schoell, E.D., Yacubian, J., Klinger, R., Lorenz, J., and Büchel, C. (2009). Activation of the opioidergic descending pain control system underlies placebo analgesia. Neuron 63, 533–543.

Eldar, E., Hauser, T.U., Dayan, P., and Dolan, R.J. (2016). Striatal structure and function predict individual biases in learning to avoid pain. Proc. Natl. Acad. Sci. USA 113, 4812–4817.

Elfwing, S., and Seymour, B. (2017). Parallel reward and punishment control in humans and robots: safe reinforcement learning using the MaxPain algorithm. In 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017 (IEEE), pp. 140–147.

Eroféeva, M.N. (1921). Further observations upon conditioned reflexes to nocuous stimuli. Bulletin of the Institute of Lesgaft 3.

Fanselow, M.S. (1994). Neural organization of the defensive behavior system responsible for fear. Psychon. Bull. Rev. 1, 429–438.

Fanselow, M.S., and Lester, L.S. (1988). A functional behavioristic approach to aversively motivated behavior: Predatory imminence as a determinant of the topography of defensive behavior. In Evolution and Learning, R.C. Bolles and M.D. Beecher, eds. (Lawrence Erlbaum Associates), pp. 185–212.

Fields, H.L. (2006). A motivation-decision model of pain: the role of opioids. In Proceedings of the 11th World Congress on Pain, H. Flor, ed., pp. 449–459.

Fields, H.L. (2018). How expectations influence pain. Pain 159 (Suppl 1), S3–S10.

Fiser, J., Berkes, P., Orbán, G., and Lengyel, M. (2010). Statistically optimal perception and learning: from behavior to neural representations. Trends Cogn. Sci. 14, 119–130.

Frederick, S., Loewenstein, G., and O’Donoghue, T. (2002). Time discounting and time preference: A critical review. J. Econ. Lit. 40, 351–401.

Friston, K. (2010). The free-energy principle: a unified brain theory? Nat. Rev. Neurosci. 11, 127.

Gerber, B., Yarali, A., Diegelmann, S., Wotjak, C.T., Pauli, P., and Fendt, M. (2014). Pain-relief learning in flies, rats, and man: basic research and applied perspectives. Learn. Mem. 21, 232–252.

Gershman, S.J., Norman, K.A., and Niv, Y. (2015). Discovering latent causes in reinforcement learning. Curr. Opin. Behav. Sci. 5, 43–50.

Geuter, S., Boll, S., Eippert, F., and Büchel, C. (2017). Functional dissociation of stimulus intensity encoding and predictive coding of pain in the insula. eLife 6, https://doi.org/10.7554/eLife.24770.

Grahl, A., Onat, S., and Büchel, C. (2018). The periaqueductal gray and Bayesian integration in placebo analgesia. eLife 7, e32930.

Grill, J.D., and Coghill, R.C. (2002). Transient analgesia evoked by noxious stimulus offset. J. Neurophysiol. 87, 2205–2208.

Groessl, F., Munsch, T., Meis, S., Griessner, J., Kaczanowska, J., Pliota, P., Kargl, D., Badurek, S., Kraitsy, K., Rassoulpour, A., et al. (2018). Dorsal tegmental dopamine neurons gate associative learning of fear. Nat. Neurosci. 21, 952.

Heinricher, M.M., Tavares, I., Leith, J.L., and Lumb, B.M. (2009). Descending control of nociception: Specificity, recruitment and plasticity. Brain Res. Brain Res. Rev. 60, 214–225.

Herry, C., and Johansen, J.P. (2014). Encoding of fear learning and memory in distributed neuronal circuits. Nat. Neurosci. 17, 1644–1654.

Hird, E.J., Charalambous, C., El-Deredy, W., Jones, A.K., and Talmi, D. (2018). Boundary effects of expectation in human pain perception. bioRxiv, 467738.

associated with primary punishment in humans. Proc. Natl. Acad. Sci. USA 111, 11858–11863.

LeDoux, J., and Daw, N.D. (2018). Surviving threats: neural circuit and computational implications of a new taxonomy of defensive behaviour. Nat. Rev. Neurosci. 19, 269–282.

Lee, J.H., Seymour, B., Leibo, J.Z., An, S.J., and Lee, S.W. (2019). Toward high-performance, memory-efficient, and fast reinforcement learning—Lessons from decision neuroscience. Science Robotics 4.

Legrain, V., Damme, S.V., Eccleston, C., Davis, K.D., Seminowicz, D.A., and Crombez, G. (2009). A neurocognitive model of attention to pain: behavioral and neuroimaging evidence. Pain 144, 230–232.

Li, J., Schiller, D., Schoenbaum, G., Phelps, E.A., and Daw, N.D. (2011). Differential roles of human striatum and amygdala in associative learning. Nat. Neurosci. 14, 1250–1252.

Lloyd, K., and Dayan, P. (2018). Pavlovian-instrumental interactions in active avoidance: The bark of neutral trials. Brain Res. Published online October 9, 2018. https://doi.org/10.1016/j.brainres.2018.10.011.

Loewenstein, G. (1987). Anticipation and the valuation of delayed consumption. Econ. J. (Lond.) 97, 666–684.

Mackintosh, N.J. (1983). Conditioning and Associative Learning (Clarendon Press Oxford).

Madarasz, T.J., Diaz-Mataix, L., Akhand, O., Ycu, E.A., LeDoux, J.E., and Johansen, J.P. (2016). Evaluation of ambiguous associations in the amygdala by learning the structure of the environment. Nat. Neurosci. 19, 965.

Jeon, D., Kim, S., Chetana, M., Jo, D., Ruley, H.E., Lin, S.Y., Rabah, D., Kinet,
Maia, T.V. (2010). Two-factor theory, the actor-critic model, and conditioned
J.P., and Shin, H.S. (2010). Observational fear learning involves affective pain
avoidance. Learn. Behav. 38, 50–67.
system and Ca v 1.2 Ca 2+ channels in ACC. Nat. Neurosci 13, 482–488.
Maier, S.F., Drugan, R.C., and Grau, J.W. (1982). Controllability, coping
Jepma, M., Koban, L., van Doorn, J., Jones, M., and Wager, T.D. (2018).
behavior, and stress-induced analgesia in the rat. Pain 12, 47–56.
Behavioural and neural evidence for self-reinforcing expectancy effects on
pain. Nat. Hum. Behav 2, 838–855. Mano, H., Kotecha, G., Leibnitz, K., Matsubara, T., Sprenger, C., Nakae, A.,
Shenker, N., Shibata, M., Voon, V., Yoshida, W., et al. (2018). Classification
Johansen, J.P., Fields, H.L., and Manning, B.H. (2001). The affective compo-
and characterisation of brain network changes in chronic back pain: A multi-
nent of pain in rodents: direct evidence for a contribution of the anterior cingu- center study. Wellcome Open Res. 3, 19.
late cortex. Proc. Natl. Acad. Sci. USA 98, 8077–8082.
Margulies, D.S., Kelly, A.M., Uddin, L.Q., Biswal, B.B., Castellanos, F.X., and
Jones, A.K., Brown, W.D., Friston, K.J., Qi, L.Y., and Frackowiak, R.S. (1991). Milham, M.P. (2007). Mapping the functional connectivity of anterior cingulate
Cortical and subcortical localization of response to pain in man using positron cortex. Neuroimage 37, 579–588.
emission tomography. Proc. Biol. Sci. 244, 39–44.
Marquand, A., Howard, M., Brammer, M., Chu, C., Coen, S., and
Kahneman, D., Fredrickson, B.L., Schreiber, C.A., and Redelmeier, D.A. Mourão-Miranda, J. (2010). Quantitative prediction of subjective pain intensity
(1993). When more pain is preferred to less: Adding a better end. Psychol. from whole-brain fMRI data using Gaussian processes. Neuroimage 49,
Sci. 4, 401–405. 2178–2189.
Kahneman, D., Wakker, P.P., and Sarin, R. (1997). Back to Bentham? Explo- Melzack, R., and Casey, K.L. (1968). Sensory, motivational, and central control
rations of experienced utility. Q. J. Econ. 112, 375–406. determinants of pain. In International Symposium on the Skin Senses, D. Ken-
shalo, ed. (C.C. Thomas), pp. 423–435.
Keramati, M., and Gutkin, B. (2014). Homeostatic reinforcement learning for
integrating reward collection and physiological stability. eLife 3, e04811. Melzack, R., and Wall, P.D. (1965). Pain mechanisms: a new theory. Science
150, 971–979.
Knill, D.C., and Pouget, A. (2004). The Bayesian brain: the role of uncertainty in
neural coding and computation. Trends Neurosci. 27, 712–719. Menegas, W., Akiti, K., Amo, R., Uchida, N., and Watabe-Uchida, M. (2018).
Dopamine neurons projecting to the posterior striatum reinforce avoidance
Koban, L., Jepma, M., Geuter, S., and Wager, T.D. (2017). What’s in a of threatening stimuli. Nat. Neurosci. 21, 1421–1430.
word? How instructions, suggestions, and social information change pain
and emotion. Neurosci. Biobehav. Rev. 81 (Pt A), 29–42. Millner, A.J., Gershman, S.J., Nock, M.K., and den Ouden, H.E.M. (2018).
Pavlovian control of escape and avoidance. J. Cogn. Neurosci 30, 1379–1390.
Koban, L., Kusko, D., and Wager, T.D. (2018). Generalization of learned pain
modulation depends on explicit learning. Acta Psychol. (Amst.) 184, 75–84. Momennejad, I., Russek, E.M., Cheong, J.H., Botvinick, M.H., Daw, N., and
Gershman, S.J. (2017). The successor representation in human reinforcement
Konorski, J. (1948). Conditioned reflexes and neuron organization (CUP learning. Nat. Hum. Behav 1, 680.
Archive).
Morton, D.L., El-Deredy, W., Watson, A., and Jones, A.K. (2010). Placebo
Kragel, P.A., Kano, M., Van Oudenhove, L., Ly, H.G., Dupont, P., Rubio, A., analgesia as a case of a cognitive style driven by prior expectation. Brain
Delon-Martin, C., Bonaz, B.L., Manuck, S.B., Gianaros, P.J., et al. (2018). Res. 1359, 137–141.
Generalizable representations of pain, cognitive control, and negative emotion
in medial frontal cortex. Nat. Neurosci 21, 283. Moutoussis, M., Bentall, R.P., Williams, J., and Dayan, P. (2008). A temporal
difference account of avoidance learning. Network 19, 137–160.
Kucyi, A., and Davis, K.D. (2015). The dynamic pain connectome. Trends Neu-
rosci. 38, 86–95. Navratilova, E., Xie, J.Y., Okun, A., Qu, C., Eyde, N., Ci, S., Ossipov, M.H.,
King, T., Fields, H.L., and Porreca, F. (2012). Pain relief produces negative
Lawson, R.P., Seymour, B., Loh, E., Lutti, A., Dolan, R.J., Dayan, P., Weiskopf, reinforcement through activation of mesolimbic reward-valuation circuitry.
N., and Roiser, J.P. (2014). The habenula encodes negative motivational value Proc. Natl. Acad. Sci. USA 109, 20709–20713.

Navratilova, E., Atcherley, C.W., and Porreca, F. (2015). Brain circuits encoding reward from pain relief. Trends Neurosci. 38, 741–750.

Norbury, A., Robbins, T.W., and Seymour, B. (2018). Value generalization in human avoidance learning. eLife 7, e34779.

Olsson, A., and Phelps, E.A. (2007). Social learning of fear. Nat. Neurosci. 10, 1095.

Onat, S., and Büchel, C. (2015). The neuronal basis of fear generalization in humans. Nat. Neurosci. 18, 1811–1818.

Ongaro, G., and Kaptchuk, T.J. (2019). Symptom perception, placebo effects, and the Bayesian brain. Pain 160, 1–4.

Ozawa, T., Ycu, E.A., Kumar, A., Yeh, L.F., Ahmed, T., Koivumaa, J., and Johansen, J.P. (2017). A feedback neural circuit for calibrating aversive memory strength. Nat. Neurosci. 20, 90.

Pearce, J.M., Montgomery, A., and Dickinson, A. (1981). Contralateral transfer of inhibitory and excitatory eyelid conditioning in the rabbit. Q. J. Exp. Psychol. Sec. B 33, 45–61.

Peyron, R., Faillenot, I., Mertens, P., Laurent, B., and Garcia-Larrea, L. (2007). Motor cortex stimulation in neuropathic pain. Correlations between analgesic effect and hemodynamic changes in the brain. A PET study. Neuroimage 34, 310–321.

Pezzulo, G., Rigoli, F., and Friston, K. (2015). Active inference, homeostatic regulation and adaptive behavioural control. Prog. Neurobiol. 134, 17–35.

Piccolo, L., Libera, F., Bonarini, A., Seymour, B., and Ishiguro, H. (2018). Pain and self-preservation in autonomous robots: From neurobiological models to psychiatric disease. In 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017, pp. 263–270.

Ploghaus, A., Tracey, I., Gati, J.S., Clare, S., Menon, R.S., Matthews, P.M., and Rawlins, J.N. (1999). Dissociating pain from its anticipation in the human brain. Science 284, 1979–1981.

Qi, S., Hassabis, D., Sun, J., Guo, F., Daw, N., and Mobbs, D. (2018). How cognitive and reactive fear circuits optimize escape decisions in humans. Proc. Natl. Acad. Sci. USA 115, 3186–3191.

Qu, C., King, T., Okun, A., Lai, J., Fields, H.L., and Porreca, F. (2011). Lesion of the rostral anterior cingulate cortex eliminates the aversiveness of spontaneous neuropathic pain following partial or complete axotomy. Pain 152, 1641–1648.

Rachman, S., and Arntz, A. (1991). The overprediction and underprediction of pain. Clin. Psychol. Rev. 11, 339–355.

Rescorla, R.A. (1988). Pavlovian conditioning: It’s not what you think it is. Am. Psychol. 43, 151.

Rescorla, R.A., and Wagner, A.R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In Classical Conditioning II: Current Research and Theory, A.H. Black and W.F. Prokasy, eds. (Appleton-Century-Crofts), pp. 64–99.

Robbins, T.W., Gillan, C.M., Smith, D.G., de Wit, S., and Ersche, K.D. (2012). Neurocognitive endophenotypes of impulsivity and compulsivity: towards dimensional psychiatry. Trends Cogn. Sci. 16, 81–91.

Roy, M., Shohamy, D., Daw, N., Jepma, M., Wimmer, G.E., and Wager, T.D. (2014). Representation of aversive prediction errors in the human periaqueductal gray. Nat. Neurosci. 17, 1607–1612.

Salomons, T.V., Johnstone, T., Backonja, M.M., Shackman, A.J., and Davidson, R.J. (2007). Individual differences in the effects of perceived controllability on pain perception: critical role of the prefrontal cortex. J. Cogn. Neurosci. 19, 993–1003.

Salomons, T.V., Nusslock, R., Detloff, A., Johnstone, T., and Davidson, R.J. (2015). Neural emotion regulation circuitry underlying anxiolytic effects of perceived control over pain. J. Cogn. Neurosci. 27, 222–233.

Segerdahl, A.R., Mezue, M., Okell, T.W., Farrar, J.T., and Tracey, I. (2015). The dorsal posterior insula subserves a fundamental role in human pain. Nat. Neurosci. 18, 499–500.

Segerdahl, A.R., Themistocleous, A.C., Fido, D., Bennett, D.L., and Tracey, I. (2018). A brain-based pain facilitation mechanism contributes to painful diabetic polyneuropathy. Brain 141, 357–364.

Seymour, B., and Dolan, R. (2013). Emotion, motivation, and pain. In Wall & Melzack’s Textbook of Pain, S. McMahon, ed. (Elsevier Health Sciences), pp. 248–255.

Seymour, B., and Lee, S.W. (2019). Decision-making in brains and robots: the case for an interdisciplinary approach. Curr. Opin. Behav. Sci. 26, 137–145.

Seymour, B., O’Doherty, J.P., Dayan, P., Koltzenburg, M., Jones, A.K., Dolan, R.J., Friston, K.J., and Frackowiak, R.S. (2004). Temporal difference models describe higher-order learning in humans. Nature 429, 664–667.

Seymour, B., O’Doherty, J.P., Koltzenburg, M., Wiech, K., Frackowiak, R., Friston, K., and Dolan, R. (2005). Opponent appetitive-aversive neural processes underlie predictive learning of pain relief. Nat. Neurosci. 8, 1234–1240.

Seymour, B., Daw, N.D., Roiser, J.P., Dayan, P., and Dolan, R. (2012). Serotonin selectively modulates reward value in human decision-making. J. Neurosci. 32, 5833–5842.

Singh, S., Lewis, R.L., and Barto, A.G. (2009). Where do rewards come from. In Proceedings of the Annual Conference of the Cognitive Science Society, pp. 2601–2606.

Sprenger, C., Stenmans, P., Tinnermann, A., and Büchel, C. (2018). Evidence for a spinal involvement in temporal pain contrast enhancement. Neuroimage 183, 788–799.

Stewart, N., Brown, G.D., and Chater, N. (2005). Absolute identification by relative judgment. Psychol. Rev. 112, 881.

Story, G.W., Vlaev, I., Seymour, B., Winston, J.S., Darzi, A., and Dolan, R.J. (2013). Dread and the disvalue of future pain. PLoS Comput. Biol. 9, e1003335.

Sutton, R.S., and Barto, A.G. (1981). Toward a modern theory of adaptive networks: expectation and prediction. Psychol. Rev. 88, 135.

Sutton, R.S., and Barto, A.G. (1998). Reinforcement learning: An introduction, Vol. 1. 1 (Cambridge: MIT press).

Tabor, A., and Burr, C. (2019). Bayesian Learning Models of Pain: A Call to Action. Curr. Opin. Behav. Sci. 26, 54–61.

Tabor, A., Thacker, M.A., Moseley, G.L., and Körding, K.P. (2017). Pain: a statistical account. PLoS Comput. Biol. 13, e1005142.

Talmi, D., Seymour, B., Dayan, P., and Dolan, R.J. (2008). Human pavlovian-instrumental transfer. J. Neurosci. 28, 360–368.

Taylor, V.A., Chang, L., Rainville, P., and Roy, M. (2017). Learned expectations and uncertainty facilitate pain during classical conditioning. Pain 158, 1528–1537.

Tolman, E.C. (1948). Cognitive maps in rats and men. Psychol. Rev. 55, 189.

Tracey, I. (2010). Getting the pain you expect: mechanisms of placebo, nocebo and reappraisal effects in humans. Nat. Med. 16, 1277.

Treede, R.-D., Kenshalo, D.R., Gracely, R.H., and Jones, A.K.P. (1999). The cortical representation of pain. Pain 79, 105–111.

Van Damme, S., Legrain, V., Vogt, J., and Crombez, G. (2010). Keeping pain in mind: a motivational account of attention to pain. Neurosci. Biobehav. Rev. 34, 204–213.

Vlaev, I., Seymour, B., Dolan, R.J., and Chater, N. (2009). The price of pain and the value of suffering. Psychol. Sci. 20, 309–317.

Vlaeyen, J.W., and Linton, S.J. (2000). Fear-avoidance and its consequences in chronic musculoskeletal pain: a state of the art. Pain 85, 317–332.

Vogt, B.A. (2005). Pain and emotion interactions in subregions of the cingulate gyrus. Nat. Rev. Neurosci. 6, 533.

Vogt, B.A., and Sikes, R.W. (2000). The medial pain system, cingulate cortex, and parallel processing of nociceptive information. Prog. Brain Res. 122, 223–235.

Wager, T.D., Rilling, J.K., Smith, E.E., Sokolik, A., Casey, K.L., Davidson, R.J., Kosslyn, S.M., Rose, R.M., and Cohen, J.D. (2004). Placebo-induced changes in FMRI in the anticipation and experience of pain. Science 303, 1162–1167.

Wager, T.D., Atlas, L.Y., Lindquist, M.A., Roy, M., Woo, C.W., and Kross, E. (2013). An fMRI-based neurologic signature of physical pain. N. Engl. J. Med. 368, 1388–1397.

Wang, O., Lee, S.W., O’Doherty, J., Seymour, B., and Yoshida, W. (2018). Model-based and model-free pain avoidance learning. Brain Neurosci. Adv. 2, 2398212818772964.

Wiech, K. (2016). Deconstructing the sensation of pain: The influence of cognitive processes on pain perception. Science 354, 584–587.

Wiech, K., Kalisch, R., Weiskopf, N., Pleger, B., Stephan, K.E., and Dolan, R.J. (2006). Anterolateral prefrontal cortex mediates the analgesic effect of expected and perceived control over pain. J. Neurosci. 26, 11501–11509.

Wilson, R.C., Geana, A., White, J.M., Ludvig, E.A., and Cohen, J.D. (2014). Humans use directed and random exploration to solve the explore-exploit dilemma. J. Exp. Psychol. Gen. 143, 2074.

Winston, J.S., Vlaev, I., Seymour, B., Chater, N., and Dolan, R.J. (2014). Relative valuation of pain in human orbitofrontal cortex. J. Neurosci. 34, 14526–14535.

Wittmann, B.C., Daw, N.D., Seymour, B., and Dolan, R.J. (2008). Striatal activity underlies novelty-based choice in humans. Neuron 58, 967–973.

Woo, C.-W., Roy, M., Buhle, J.T., and Wager, T.D. (2015). Distinct brain systems mediate the effects of nociceptive input and self-regulation on pain. PLoS Biol. 13, e1002036.

Yelle, M.D., Oshiro, Y., Kraft, R.A., and Coghill, R.C. (2009). Temporal filtering of nociceptive information by dynamic activation of endogenous pain modulatory systems. J. Neurosci. 29, 10264–10271.

Yilmaz, P., Diers, M., Diener, S., Rance, M., Wessa, M., and Flor, H. (2010). Brain correlates of stress-induced analgesia. Pain 151, 522–529.

Yoshida, W., Seymour, B., Koltzenburg, M., and Dolan, R.J. (2013). Uncertainty increases pain: evidence for a novel mechanism of pain modulation involving the periaqueductal gray. J. Neurosci. 33, 5638–5646.

Yu, A.J., and Dayan, P. (2005). Uncertainty, neuromodulation, and attention. Neuron 46, 681–692.

Zaman, J., Vanpaemel, W., Aelbrecht, C., Tuerlinckx, F., and Vlaeyen, J.W.S. (2017). Biased pain reports through vicarious information: A computational approach to investigate the role of uncertainty. Cognition 169, 54–60.

Zhang, S., Mano, H., Ganesh, G., Robbins, T., and Seymour, B. (2016). Dissociable learning processes underlie human pain conditioning. Curr. Biol. 26, 52–58.

Zhang, S., Mano, H., Lee, M., Yoshida, W., Kawato, M., Robbins, T.W., and Seymour, B. (2018a). The control of tonic pain by active relief learning. eLife 7, e31949.

Zhang, S., Yoshida, W., Mano, H., Yanagisawa, T., Shibata, K., Kawato, M., and Seymour, B. (2018b). Endogenous Controllability of Closed-loop Brain Machine Interfaces for Pain. bioRxiv. https://doi.org/10.1101/369736.

Zunhammer, M., Bingel, U., and Wager, T.D.; Placebo Imaging Consortium (2018). Placebo effects on the neurologic pain signature: a meta-analysis of individual participant functional magnetic resonance imaging data. JAMA Neurol. 75, 1321–1330.
