Narayanan - Knowledge-Based Action Representations For Metaphor and Aspect
Committee in charge:
Professor Jerome A. Feldman, Chair
Professor George Lakoff
Professor Robert Wilensky
1997
Knowledge-based Action Representations for Metaphor
and Aspect (KARMA)
Copyright 1997
by
Srinivas Sankara Narayanan
Abstract
This report describes a new computational model for verb semantics. A novel feature of
the model is an active representation of motion and manipulation verb phrases such as
walk, push, slide, slip, etc. that can be used for control and monitoring as well as real-
time simulative inference in language understanding. Monitoring and control parameters
abstracted from the basic model provide semantic grounding for interpreting aspectual
expressions and seem to offer elegant ways to solve the vexing linguistic problem of aspectual
composition. Our model is able to use metaphoric projections of motion verbs to infer in
real-time important features of abstract plans and events potentially explaining the frequent
use of motion and manipulation terms in discourse about abstract plans. These ideas are
demonstrated through an implemented model of aspect and metaphoric reasoning about
event descriptions that interprets simple discourse segments from newspaper stories in the
domain of international economics.
Contents
1 Introduction
1.1 Contributions
1.2 Road map
2 Language and Embodiment: A Computational Approach
2.1 Constraints From Brain Function
2.2 Theories Of Embodiment in Cognitive Semantics
2.3 The NTL Hypothesis
3 An Executable Model of Action Verbs
3.1 Motivation for X-schemas
3.1.1 Biological control theory
3.1.2 AI and Robotics
3.1.3 Computational models of cognition
3.1.4 Psychology and Cognitive Linguistics
3.2 Computational Requirements for X-schemas
3.3 Representation Of X-Schemas
3.3.1 Extensions to the basic x-schema
3.4 Modeling with X-schemas
3.4.1 Modeling control primitives
3.4.2 Modeling resources and goals
3.5 X-schemas and Verb Semantics
3.5.1 Language and x-schema execution parameters
3.6 Learning Action Verbs
3.7 Plan analysis with x-schema representations
3.7.1 Temporal Projection
3.7.2 Reachability
3.8 Comparison to Other Models
3.8.1 Schemas in natural language understanding
3.8.2 Action representations for planning
3.8.3 Plan executives and production systems
3.8.4 Behavior controllers and reactive planners
3.8.5 Teleo-reactive programs
3.9 Conclusion
3.10 Appendix A
3.10.1 The Stochastic Case
4 Connectionist Inference and X-schemas
4.1 Connectionist Encoding of X-schemas
4.1.1 Focal Clusters
4.2 X-schema encoding of general-purpose structured connectionist primitives
4.2.1 Implementing Binder units as x-schemas
4.2.2 Implementing -and units as x-schemas
4.2.3 Implementing -btu nodes as x-schemas
4.3 Discussion
5 A Computational Model of Verbal Aspect
5.1 Basic Result
5.2 A Brief Note on Linguistic Theory
5.3 Relevant Process Parameters
5.3.1 The Schema controller
5.4 Links to Verb Semantics
5.4.1 X-schema parameters and Inherent Aspect
5.4.2 Reasoning in multiple time scales
5.4.3 Traditional Aktionsart classes
5.5 Results Of The Aspect Model
5.5.1 Deriving the VDT scheme
5.5.2 Some additional implications
5.5.3 Modeling Phasal Aspect
5.5.4 Interaction of Phasal and Inherent Aspect
5.5.5 Imperfective paradox
5.5.6 Problems with Telicity
5.5.7 Iterative readings of Progressives
5.5.8 Dynamic time scale shifting
5.5.9 Perfectivizing operators
5.6 Current Work and Extensions
5.6.1 Perfect revisited
5.6.2 What about tense?
5.6.3 Metaphoric extensions
5.6.4 Other dimensions of aspectual composition
5.7 Conclusion
5.8 Appendix 5A: Using the Multiple Time Scale Hypothesis For Abstraction
6 A Compositional Theory of X-schemas
6.1 Motivating Example
6.2 Using X-schemas For Embodied Domain Inferences
6.3 Generalization and Formal Definitions of Inter X-schema Relations
6.4 Conclusion
Acknowledgements
I once heard Jerry Feldman say words to the following effect:
Supervising a thesis is like writing a research paper under adverse circumstances.
I can only observe in light of all my adventures over the last two years that he handles
adversity extremely well. Jerry Feldman is the ideal advisor. It is rare to find people with
his vision and knowledge who also have the patience to understand and deal with a stream
of half-baked ideas, and more importantly take the time to sort out the important from
the merely cosmetic. His ability and dedication to see through the seduction of techniques
and mechanisms to the essence of a problem is truly inspiring, as is his ability to manage
multiple projects in different fields and yet be able to dive into detailed technical discussions
about any one of them. Every chapter of this thesis has benefited enormously as a result of
discussions and ideas from Jerry Feldman. In particular, Chapter 6 is a direct result of his
ideas about how to build compositional theories of x-schemas and their link to fine-grained
productions.
It is a rare privilege to be able to work with a public intellectual who is widely
acknowledged to be amongst the most important and original thinkers in Cognitive Science.
Working closely with George Lakoff, I have been fortunate to observe his scholarship and
ability to communicate important ideas to members of different disciplines. His enthusiasm,
unlimited supply of linguistic examples, and support for this project have been an important
motivational factor. To me, his book with Mark Johnson, Metaphors We Live By, continues
to be an exemplary standard to aim for in communicating complex ideas. Somewhat in awe,
I note that in the time it took to perform the work and write this thesis, George has written
three books (including a 750 page treatise on philosophy!) on subjects as wide ranging as
Politics and Morality, Cognitive Mathematics, and Philosophy. I am glad somebody can do
it!
I would like to thank Robert Wilensky for his comments on draft versions of this
document. His early suggestion that I investigate the encoding of communicative intent
in metaphoric usage has proven to be most productive and is an aspect of metaphor that
remains incompletely studied and understood both within linguistics and AI. Preliminary
results from the study reported here show that NLP systems that attempt to disambiguate
such discourse level phenomena must be able to extract evaluative, intentional, and
motivational information from metaphoric usage.
Being part of the L0 group at ICSI provided me with a great opportunity to work
with linguists, computer scientists and psychologists. My special thanks go to fellow traveler
and colleague David Bailey for formal and informal sessions thrashing out many of the
ideas in this thesis, to Lokendra Shastri for his ideas linking general purpose connectionist
reasoning to motor control and x-schemas (much of the second half of Chapter 4 was his
work), to Dean Grannes (who implemented the connectionist version of x-schemas in the
SHRUTI system), and to Nancy Chang, David Andre, Masha Brodsky, and Jonathan Segal.
Thanks to Michael Thielscher for ideas connecting x-schemas to dynamic logic, to Dan
Jurafsky (for insights from garden-paths), Terry Regier, Ben Gomes, Collin Baker, Jane
Edwards, Sarah Shull, Laura Michaelis, Hubert Dreyfus, Len Talmy, Chuck Fillmore, Eve
Sweetser, Stuart Russell, and J. Dasgupta, for critiquing and offering advice that affected
various parts of this and other work while at Berkeley.
I owe a great debt to N.S. Sridharan for his encouragement and advice through
my years in Silicon Valley. Others who shaped my life in various ways include Malcolm
Acock (for tips on the best hiking trails), Perry Thorndike, Keith Wescourt, Lee Brownston,
Jean-Claude Latombe, Chella and Sujatha Rajan, Amy Chang, Kristy Blaise, Vic Shubin,
D.D. Sharma, Anil Nair, and Susan Holeridge.
A special thanks to Gurleen Grewal for being who she is. Of course, all that I have
accomplished was made possible through the love and unflagging support of my parents. I
dedicate this work to them.
Chapter 1
Introduction
This thesis is motivated by the observation that narratives about abstract plans
and events are often filled with words and constructions that pertain to spatial motion and
manipulation. To illustrate this phenomenon, consider a fairly typical newspaper article
that appeared in the New York Times on Sunday, August 1, 1993.
Example 1
Britain was deep in recession while Germany was flourishing three years ago.
France kept moving ahead steadily long after Germany had fallen into recession.
But now France is plunging deeper while the German economy continues to
struggle. Britain has been taking small steps toward stimulating its economy
by cutting interest rates, and has finally started to emerge from recession.
and memory. Since knowledge of moving around or manipulating objects is essential for
survival, it has to be highly compiled and readily accessible knowledge. Representations
meeting these criteria must be context sensitive and allow changing input context to dramatically
affect the correlation between input and memory and thereby the set of possible
expectations, goals, and inferences. Speakers are able to felicitously exploit this context-
sensitivity in specifying important information about abstract actions and plans that take
place in complex, uncertain and dynamically changing environments.
In Chapter 2, we place this work in the overall context of the Neural Theory Of
Language (NTL) project underway at U.C. Berkeley and ICSI. The NTL project and
this work in particular are inherently inter-disciplinary, incorporating results and insights
from computer science, structured connectionism, linguistics, cognitive science, and biological
control theory. The central hypothesis pursued here is that the meaning of spatial
motion terms is grounded in fine-grained, executable action representations that encode
key monitoring and control information about the action. Simple abstractions as well as
conventionalized metaphors project these features onto abstract domains such as economics
enabling linguistic devices to use motion terms to describe features of abstract actions and
processes. In this thesis, we emphasize a computer science perspective on the problem and
are centrally concerned with issues of knowledge representation and real-time inference.
Consequently, the main methodology used is computational modeling and simulation.
A crucial part of the NTL hypothesis is that motion and manipulation words
(such as push, pull, walk, run, stumble, and fall) are grounded in highly compiled
parameterized action controllers that encode detailed control and monitoring information.
Different action verbs and associated grammatical devices express parameters of the
underlying action such as rates, manner, forces, and posture and movement control
parameters, as well as associated intentional and motivational states such as goals, resources,
energy and effort. Thus our theory requires that the representation of the semantics of
action words be grounded in active structures with a fine granularity that are flexible and
adaptive in the way context affects interpretation. Furthermore the representation should
support concurrent activities in the memory storage, retrieval, and indexing process.
Chapter 3 describes a computational model that is able to satisfy the representational
requirements outlined above. Our representation is partly inspired by results in
high-level cortical motor control schemas (Sternberg et al. 1978; Bernstein 1967), leading
us to refer to our verb model as x-schemas. X-schemas are parameterized routines with
internal state that execute when invoked. Our computational model of x-schemas is an
extension to Petri nets (Murata 1989). Our extensions allow run-time binding, hierarchical
action sets and stochastic transitions. Chapter 3 states and proves a theorem establishing
the formal connection between x-schemas and Petri nets which allows us to tap into the
vast literature on Petri nets for specification, design and analysis of x-schemas.
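To make the Petri net connection concrete, the following is a minimal sketch of an ordinary place/transition net, not the x-schema formalism of Chapter 3. The place names (ready, energy, steps_left, walking, done), the transition names, and the "fire the first enabled transition" policy are all illustrative assumptions; run-time binding, hierarchical action sets and stochastic transitions are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    name: str
    inputs: dict   # place -> number of tokens consumed
    outputs: dict  # place -> number of tokens produced


@dataclass
class PetriNet:
    marking: dict                       # place -> current token count
    transitions: list = field(default_factory=list)

    def enabled(self, t):
        # A transition is enabled when every input place holds enough tokens.
        return all(self.marking.get(p, 0) >= n for p, n in t.inputs.items())

    def fire(self, t):
        if not self.enabled(t):
            raise ValueError(f"transition {t.name!r} is not enabled")
        for p, n in t.inputs.items():
            self.marking[p] -= n
        for p, n in t.outputs.items():
            self.marking[p] = self.marking.get(p, 0) + n

    def run(self, max_steps=100):
        # Assumption: fire the first enabled transition in list order.
        fired = []
        for _ in range(max_steps):
            ready = [t for t in self.transitions if self.enabled(t)]
            if not ready:
                break
            self.fire(ready[0])
            fired.append(ready[0].name)
        return fired


def walk_net():
    # Hypothetical toy net: each step consumes one unit of energy and
    # one unit of remaining distance.
    return PetriNet(
        marking={"ready": 1, "energy": 3, "steps_left": 2, "walking": 0, "done": 0},
        transitions=[
            Transition("start", {"ready": 1}, {"walking": 1}),
            Transition("step", {"walking": 1, "energy": 1, "steps_left": 1}, {"walking": 1}),
            Transition("finish", {"walking": 1}, {"done": 1}),
        ],
    )
```

Calling `walk_net().run()` fires the transitions in marking-dependent order; changing the initial token counts changes the trajectory without changing the net, which is the sense in which the representation is parameterized.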
bindings and modify its execution behavior appropriately (altering manner, direction, and
being done when the destination is reached). Similarly the step x-schema may take stepsize
or durations as run-time parameters and implement the behavior differently for different
parameter settings. Besides being parameterized, to be properly responsive in a complex
and dynamic environment x-schemas have to be tightly integrated to perceptual modalities.
For instance, x-schemas must be able to respond appropriately to various perceptual and
postural conditions and coordinate them to a state of readiness before starting an action.
Thus to be ready to walk, several posture and resource pre-conditions (such as the fact that
the agent must be standing and have enough energy) as well as certain world conditions
(such as stable ground ahead) must be met.
The key facts to note here are that x-schemas are executable representations that
are different from pre-compiled behaviors in that they allow for flexible execution based on
run-time bindings. During execution, x-schemas can model conditional actions and effects
(initiate sensing action, branch on results). Their fine-grained structure and asynchronous
dynamics allow them to continuously monitor and respond to changes in world state.
Their distributed control and ability to control parallel and concurrent actions allow them
to model complex, coordinated actions. Our x-schema-based model of motion verbs clearly
suggests that the semantics of such verbs is also inherently dynamic in this manner.
The interface between schemas and language consists of bi-directional interactions
with Feature-structures or f-structs (signifying the similarity to traditional feature-value
structures in linguistics). Figure 1.2 shows a small part of this interface. Linguistic input
codes for features such as verb label, path, manner, and aspect which when parsed instantiate
the f-struct by setting specific features to specific values. Thus the result of the parsing
process provides information and supplies parameters required for the initiation and control
of the underlying x-schema. In this thesis, we assume the presence of a parser which is able
to instantiate the f-struct to specific values based on the linguistic input. As an example
of this process, consider the linguistic input "take small steps" from Example 1. Here the
result of the parsing process specifies that the x-schema being referred to is walk and that
the step size parameter has value small. The input to our model is the linking structure
with the parameter values specified as described above. This parameter setting translates to
executing the walk x-schema with a small step size. The interface is bi-directional in that
x-schema executions can modify the f-struct as well. For instance, executing the walk
x-schema with a small step size corresponds to less distance traveled per-step, less energy
used, etc. Feature values are thus updated depending on the specific execution trajectory
of the x-schema. This way x-schema executions are able to add information to the initial
parse by setting or modifying feature-values in the f-struct.
[Figure 1.2: part of the f-struct interface for the walk x-schema. Features: Schema, Agent, Rate, Dest, Size, Dir, Aspect, Manner, Energy, Attitude, Ground, Dist(src), Dist(dest). Values shown: Walk, Self, Dest1, Small, Forw, Prog, 543, Stable, 123, 876.]
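The bi-directional interface described above can be illustrated with a small sketch. Everything here is a hypothetical stand-in: the keyword-matching parse_to_fstruct function replaces the parser the thesis assumes, and the feature names, step lengths and energy costs are invented for illustration only.

```python
# Hypothetical per-step quantities; the numbers are illustrative only.
STEP_LENGTH = {"small": 0.25, "large": 1.0}
ENERGY_PER_STEP = {"small": 1, "large": 3}


def parse_to_fstruct(phrase):
    """Toy stand-in for the assumed parser: sets f-struct features from words."""
    fstruct = {"schema": None, "size": None, "dist_src": 0.0, "energy": 10}
    words = phrase.lower().split()
    if "step" in words or "steps" in words:
        fstruct["schema"] = "walk"
    for size in STEP_LENGTH:
        if size in words:
            fstruct["size"] = size
    return fstruct


def execute_walk(fstruct, n_steps):
    """Run the walk schema, writing execution results back into the f-struct."""
    size = fstruct["size"] or "small"
    for _ in range(n_steps):
        if fstruct["energy"] < ENERGY_PER_STEP[size]:
            break  # resource pre-condition no longer holds
        fstruct["energy"] -= ENERGY_PER_STEP[size]
        fstruct["dist_src"] += STEP_LENGTH[size]
    return fstruct
```

In this sketch the linguistic input fixes only the schema and size features; the energy and dist_src values are filled in by execution, so the same phrase yields different final f-structs under different initial resources.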
In this thesis, we will use the bi-directional interaction described above for real-
time context-sensitive simulative inference. The crucial thing to notice is that x-schemas are
always active in the context of a world state as well as agent internal state. While linguistic
input supplies some parameters, the execution trajectory for a particular episode depends
on internal motivational factors (such as the available energy) as well as the perceived world
state (such as whether the ground ahead is stable). This allows the x-schema based model
to be context-sensitive both to internal states (goals, resources) and perceived world states
and evolutions.
While x-schemas provide us with a formal computational approach to investigate
the use of dynamic models for grounding verb semantics, the NTL hypothesis further
requires us to be able to express such models in terms of possible neural implementations.
To this end, Chapter 4 answers the question How might x-schemas be neurally encoded? We
look at the question from two different directions, with some rather surprising results. First,
we look at a connectionist encoding of x-schemas and show that primitives developed for
logical reasoning by Shastri and his students as part of the SHRUTI connectionist reasoning
system (Shastri & Ajjanagadde 1993) are sufficient to model the various parameters and
control behaviors required for higher level motor control. The second part of Chapter 4
looks at general connectionist primitives and finds that they can be readily and naturally
encoded as x-schemas. Since these primitives form the backbone of a connectionist general
purpose reasoning system, x-schemas seem to capture some of the basic inferential primitives
required for general-purpose reflex reasoning.
Returning to the narrative in Example 1, the reader may already have noticed
the frequent reference to internal temporal details and phases of events. For instance,
constructions like "kept moving ahead", "taking small steps", "is plunging", and "continues
to struggle" presuppose that the motion in question was ongoing, and entail that the
current focus is on the continuation of the motion, while expressions like "had fallen" seem
to focus on the consequences of the described event. Similarly expressions like "started
to emerge" clearly enable focus on specific stages of an event as it unfolds in time. The
study of linguistic devices which profile and focus on the internal temporal character of
an event is referred to in linguistic research as the study of "aspect". In conjunction with
tense, aspect forms a primary means for "grounding" (back-grounding and fore-grounding)
and "relating" events in narrative (Berman & Slobin 1994). Since most narratives are
about events and their relations, tense and aspect are essential components of any Natural
Language Understanding (NLU) system. Traditionally, it has been especially difficult to
come up with a compositional semantics of aspect because in all languages studied, the
natural or inherent semantics of a situation combine with and modify the viewpoint or
perspective adopted by a speaker. This makes a compositional account of the semantics
of aspect difficult, giving rise to many paradoxes and problems (Dowty 1979; Moens &
Steedman 1988).
Chapter 5 demonstrates through computer simulation that a compositional se-
mantics of aspect can be constructed if we take seriously the representation of action in a
neural system. Our solution relies on an abstract x-schema called the controller, that
captures a control generalization over many individual x-schemas. The controller is itself an
x-schema and models important regularities that are relevant in the evolution of processes
(enabling, inception, in-process, completion, suspension, resumption, etc.). We propose
that such controllers may be directly coded in our neural circuitry and be available to
higher level cognitive processes such as language interpretation and problem solving. The
semantics of aspect arise from the bi-directional interaction of the generalized controller
with the specic underlying x-schema for the verb-phrase in question. This active model
of aspect is able to model cross-linguistic variation in aspectual expressions while resolv-
ing paradoxes and problems in model-theoretic and other traditional accounts. Chapter 5
provides evidence for our proposition by describing the results of a computer simulation of
our dynamic system model of aspect. While our current implementation is unable to model
tense, Chapter 5 also describes ongoing efforts to construct a compositional semantics of
the tense/aspect combination.
In Figure 1.3, we see the walk x-schema now recoded as the dynamic interaction
of two x-schemas, namely the controller x-schema and the walk x-schema, with the
controller's nodes bound to different phases (stages) of walking. Linguistic devices such as
aspect specifiers can activate any of the nodes of the controller. Activation of a controller
node specifies the execution state of the underlying x-schema resulting in a modified
execution trajectory for different settings. For instance, the Perfect aspect (represented by
the grammatical construction has V-ed, as in "has walked to the store") activates the
consequent state (the done node) of the controller, which focuses on the consequences of
walking to the store (being there, buying things, etc.). Contrast this to the use of the
Progressive construction (represented by the be + V-ing construction in English as in "is
walking to the store"). Here the process node is active, focusing on facts such as that the
walker is actively taking steps toward the store, expending energy, moving closer to the
store, etc.
[Figure 1.3 labels: CONTROLLER X-SCHEMA; WALK X-SCHEMA; iterate; resume; interrupt; until destination]
Figure 1.3: The controller x-schema models important regularities in process monitoring
and control. Grammatical devices such as morphological modifiers refer to specific nodes
in the controller x-schema, which binds to specific activation states of the verb-phrase
x-schema, generating the required interpretation. Shown above are the cases where an
ongoing activity is specified using the progressive construction (English be V-ing) or the
consequent state is picked out using the perfect construction (English has V-ed).
Thus, meaning arises from the dynamic binding of a specic activation state of
the controller x-schema to a specic activation state of the verb-complex x-schema. The
structure of the controller and the relevant set of features that characterize the verb-complex
jointly control the compositional possibilities. Chapter 5 describes these features and the
results of our computational model on various problems and paradoxes in traditional ac-
counts.
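The binding between aspect markers and controller nodes can be sketched as a small state machine. This is a hedged illustration rather than the Chapter 5 model: the node names ready and suspended, the event name finish, and the fact lists attached to the walk-to-the-store example are assumptions; only the process and done nodes and the iterate, interrupt and resume labels come from the text and Figure 1.3.

```python
# Controller nodes and the events that move between them (partly assumed).
CONTROLLER = {
    "ready": {"start": "process"},
    "process": {"iterate": "process", "interrupt": "suspended", "finish": "done"},
    "suspended": {"resume": "process"},
    "done": {},
}

# Aspect constructions pick out a controller node.
ASPECT_NODE = {"progressive": "process", "perfect": "done"}

# Facts in focus for "walk to the store" at each controller node.
WALK_TO_STORE = {
    "process": [
        "taking steps toward the store",
        "expending energy",
        "moving closer to the store",
    ],
    "done": ["being at the store"],
}


def advance(node, event):
    """Follow one controller arc, rejecting events the node does not allow."""
    try:
        return CONTROLLER[node][event]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed in node {node!r}") from None


def interpret(aspect, schema_facts):
    """Return the controller node an aspect activates and the facts it puts in focus."""
    node = ASPECT_NODE[aspect]
    return node, schema_facts.get(node, [])
```

Under these assumptions, interpret("progressive", WALK_TO_STORE) focuses on the in-process facts, while interpret("perfect", WALK_TO_STORE) focuses on the consequent state.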
While abstraction from motion control and monitoring primitives seems to be a
powerful method to address problems in verb semantics, it still doesn't explain the high
frequency usage of motion and manipulation terms to describe features of international
economics. Here we come to the second part of the thesis, which is based on the observation
that systematic metaphors project motion features onto abstract domains such as economics,
enabling linguistic devices to use motion terms to describe features of abstract actions and
processes.
In our theory, metaphors such as the Event Structure Metaphor (Lakoff 1994)
use the dense and familiar causal structure of spatial motions and object manipulations
to permit reflex (fast, unconscious) inferences. Conventionalized projections from motion
features allow the speaker to use terms that rely on the highly familiar structure of spatial
motion and object manipulation to communicate complex scenarios, dynamically changing
goals, resources, monitoring conditions, and communicative intent.
Figure 1.4 shows a simple example of this phenomenon. Consider the use of the
phrase "Britain has been taking small steps ..." in Example 1. First, the word "small"
codes for the size of an individual step as small. Second, the construction "has been taking"
codes for the fact that the steps are ongoing and continuing (the agent is continuing to
walk). Taking small steps also implies that there is a large distance still to go to reach
the destination, and the low step granularity indicates that the walker is able to monitor
and be quite responsive to changing world conditions (such as unstable ground ahead,
etc.) between steps (responsiveness between steps decreases with larger step sizes or higher
rates). Crucially, all the implications above stem from the knowledge of walking and taking
steps. Now consider the effect of projecting these features onto the domain of abstract
economic policies and actions. The well known cross-linguistic mapping from the domain
of spatial movement and manipulation onto the domain of abstract actions is called the
Event Structure metaphor (Lakoff 1994). From conventionalized metaphoric mappings such
as Action IS Motion, Actors ARE Movers, Size of Motion IS Granularity of Action, and
Distance to Destination IS Degree of Completion the hearer is able to conclude that Britain
is making small policy adjustments to an ongoing recovery policy. The recovery is likely
in its early stages (with a low degree of completion) and the incremental nature of the
adjustments allows the British policy makers to be responsive to an unpredictable and
changing economic environment. All this information is readily communicated to non-expert
economists (who are, however, experts in walking and moving around). Of course,
other similar expressions such as cautious step, giant steps, measured steps, large strides,
adroit move, sidestep and rush headlong are routinely found in discourses about economic
policies and readily interpretable in exactly the same manner.
[Figure 1.4 labels: Projections; Context-Sensitive Activation; Triggering Inference; f-struct features: Schema, Dest, Aspect, Dir, Size, Manner, Energy, Attitude, Ground, Dist(src), Dist(dest)]
Figure 1.4: Metaphors project motion features onto features of abstract actions and policies.
Maps may project objects and roles between domains (shown as rectangles) or actions
and events along with their aspect and other parameters (shown as hexagonal maps) from
the domain of spatial motion and manipulation to the abstract domain of international
economics. Shown are the maps that become active while processing the input "Britain
is taking small steps ...". Entries directly affected by linguistic input are shown in italics
(Aspect = Prog, Schema = Walk, Size = Small, and the target domain feature Country
= Britain (not shown)). Projections of these features and others inferred from x-schema
executions result in setting the target feature structure to the values shown.
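The projection step for the "taking small steps" example can be sketched as a table of metaphor maps applied to a source f-struct. This is a simplified illustration under stated assumptions: the map names come from the text, but the feature names (agent, size, dist_dest, actor, granularity, completion), the value translations, and the single context flag used to gate projection are hypothetical and much cruder than the implemented model.

```python
# Each map: (name, source feature, target feature, value translation).
# A translation of None copies the source value unchanged.
METAPHOR_MAPS = [
    ("Actors ARE Movers", "agent", "actor", None),
    ("Size of Motion IS Granularity of Action", "size", "granularity",
     {"small": "small adjustments", "large": "large adjustments"}),
    ("Distance to Destination IS Degree of Completion", "dist_dest", "completion",
     {"large": "low", "small": "high"}),
]


def project(source, context, target_context="economics"):
    """Project source-domain features onto the target domain.

    Maps are active only when the discourse context is the target domain;
    otherwise nothing is projected.
    """
    if context != target_context:
        return {}
    target = {}
    for _name, src_feat, tgt_feat, translation in METAPHOR_MAPS:
        if src_feat not in source:
            continue
        value = source[src_feat]
        if translation is None:
            target[tgt_feat] = value
        elif value in translation:
            target[tgt_feat] = translation[value]
    return target


# Source features after executing walk with small steps (values assumed).
SMALL_STEPS = {"agent": "Britain", "size": "small", "dist_dest": "large"}
```

With these assumed values, project(SMALL_STEPS, "economics") yields a target f-struct stating that Britain is making small adjustments with a low degree of completion, while the same source features in a non-economic context project nothing.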
From Figure 1.4, it is clear that in order to investigate the metaphoric projections
of familiar motion and manipulation terms through computer modeling, we minimally need
the following entities:
1. A representation of the highly familiar and compiled source domain of the map-
ping/projection (the bottom hexagon in Figure 1.4). A crucial requirement for the rep-
resentation of the source domain is that the x-schema properties of context-sensitivity,
high responsiveness, and real-time inference be preserved.
2. A representation of the ontology and relations between concepts present in the inter-
national economic discourse (the top rectangle entitled target domain f-struct
in Figure 1.4). This domain is called the target domain (the target of the mapping).
In this thesis, the target domain is the domain of international economic policies. One
crucial requirement for designing a representation of the target domain is the abil-
ity to compute the global impact of new observations and evidence both from direct
linguistic inputs and from projections of inferential products of x-schema executions.
schema executions onto features of the domain of international economics. When active,
metaphor maps allow the real-time inferential products of x-schema executions to influence
the target domain network by setting evidence for specic features of a given temporal slice
of economic policy network to specic values. A novel feature of our model of metaphor
maps is their ability to project bindings over multiple time steps (both forward and back-
ward in time), an ability essential for reasoning about event descriptions (ref. Chapter 8
for details). In addition, the projections are context-sensitive in that activating a metaphor
map requires that the context (domain of discourse) be identied as being about the target
domain, thus shutting off projections otherwise. As mentioned earlier, metaphor maps are
first-class objects that can be arranged in hierarchies. The implemented model is capable
of projecting events (along with their aspect and other parameters), as well as roles and ob-
jects across domains. Chapter 8 also describes the basic computational model for metaphor
interpretation which crucially depends on the quick inferences provided by x-schema exe-
cutions. The implementation is described through a trace of the program on a discourse
fragment from a newspaper story in international economics.
To study and model the nature and type of information provided by motion terms
in the domain of international economics, we collected a set of around 30 two-to-three-line
discourse fragments from standard sources such as The New York Times, Wall Street
Journal and The Economist. These "stories" drove the development of our program (the set
of x-schemas, the target domain Belief network constructions and the metaphor maps
implemented). Once the program was developed, we tested its applicability and performance
on a few previously unseen "test stories". Chapter 9 describes results on both sets of
discourse fragments. Along the way we try to indicate additions or modifications that were
required to process new stories. In general, the results are encouraging and show that our
architecture can model important aspects of the usage of motion words in describing poli-
cies and events in discourse about international economics. One interesting and potentially
important feature that emerged during program development was that the use of embodied
terms (with their easily accessible emotional content) often communicates important dis-
course information, such as the speaker's evaluation of a policy or his attitude towards a
specific actor. Even our limited implementation produced some interesting results in this
area, and it is clear that further systematic study both from corpus linguistic research as
well as from the computational modeling of discourse is likely to be highly productive. We
believe that our results show that embodied term usage is often the most felicitous method
for communicating information about dynamically changing goals, resources, and intentions
in a complex and uncertain environment. Therefore, a dynamic representation of embodied
inference seems crucial to pick up these features and project them onto abstract domains
like international economics.
Chapter 10 concludes with a summary of the basic results and some ongoing and
proposed work. Chapter 11 outlines some of the program parameters and database of
stories. Chapter 12 and Chapter 13 show the program behavior for selected examples from
the database.
1.1 Contributions
This thesis demonstrates through computer simulation that the high degree of
context-sensitivity inherent in a dynamic representation of the semantics of motion and
manipulation terms is routinely utilized to specify control, monitoring, resource and goal-
based information about complex plans and processes that operate in an uncertain and
dynamic environment. Our model shows the capability of making these discourse infer-
ences in real-time, consistent with the fact that such information is routinely available as
reflex, automatic inference in narrative understanding. Outlined below are the central novel
aspects that emerge from the work described in this thesis.
A dynamic system model of motion and manipulation verbs that is inspired by high-
level motor control. A result of our representation of verb-complexes is that the same
representation can be used for planning and control and for reasoning about action
descriptions.
A computational model that demonstrates through simulation that a compositional
semantics of linguistic aspect can be constructed from schematized generalizations
recurring in process monitoring and control. This result is interesting because aspect
is a well studied and seemingly intractable problem in linguistics.
A computational model of metaphoric reasoning about event descriptions. A crucial
aspect of the model is its capability to exploit domain knowledge of spatial motion
and manipulation for real-time simulative inference. Results of applying our model
Chapter 2
These elementary facts about the brain have serious implications for language
processing. An early example (Feldman & Ballard 1982) of a serious processing constraint
is the 100 step rule. A direct consequence of the slow neuron operating times is that typical
reaction times are only about 100 times the average neuron operation time. The 100
step (number of unit operations) constraint implies that much of language processing must
be done in a highly parallel fashion, relying on the high connectivity and active nature of
neurons. The high reaction time further suggests that the correspondence between brain
structure and function must be direct; there is no time to reinterpret structure differently for
different functions, since any re-interpretation has to be done by the same slow neurons. In
summary, (relatively) uncontroversial facts about brain function taken together suggest that
a) memory is active (no separate interpretation process), b) a single brain structure
does representation, inference, action and learning, and c) while there is specialization, there
is no evidence to postulate an autonomous, isolated module in the brain.
The study of the role of brain constraints in questions of higher-level cognition has
motivated the field of Structured Connectionism since the early eighties (Feldman &
Ballard 1982). The basic idea was to create modeling frameworks that respected important
neural constraints such as the ones mentioned above to investigate computational models of
cognitive function. Importantly, structured connectionist models attempt to abstract from
the details of brain neuro-chemistry to important known information processing constraints
that reflect both strengths and weaknesses of brain-like processing.
In structured connectionist models, a unit corresponds to a small number of ide-
alized neurons, a link corresponds to an idealized synaptic connection. Computationally,
this usually translates to the following computational constraints.
Clearly, one can easily come up with a wide range of expressions that articulate
the mappings above. Take the example of obstacles to motion being projected as plan
difficulties. One routinely finds expressions like faced a brick wall, jungle of regulations,
weighed down, uphill, find loopholes, etc. that use this projection to specify aspects of
complex plans with embodied terms. Chapter 8 makes use of this and other mappings in
our computational model of metaphoric projections of embodied events and actions.
Recent linguistic work by Joe Grady (Grady 1996) suggests that large complex
mappings such as the Event Structure map outlined above are best conceptualized as com-
positions of more primitive maps. Central to his argument is that these primitive maps are
embodied or based in direct experiential correlations. Examples of such embodied maps
may be Destination ⇒ Goal, or More ⇒ Up, or Fall ⇒ Fail. Grady further argues that
a key benefit of viewing complex metaphors as compositions of embodied maps is that
some vexing problems regarding incomplete mappings and target domain overrides (certain
entities are not mapped because of target domain constraints) go away if the embodiment
thesis is taken seriously. This obviously suggests that the set of embodied mappings and
the constraints of embodiment are crucial for computational models of metaphor use and
extension.
This thesis investigates the overall hypothesis for a small segment of language,
namely how active representations of motion and manipulation words (developed in
Chapter 3 and called x-schemas) contain features that may directly serve as semantics
for describing abstract actions and events. To this end, Chapter 5 shows how the active
x-schema representations developed in Chapter 3 can model the semantics of linguistic as-
pect. Chapter 8 describes a computational model that is designed to study how metaphoric
projections of motion and manipulation words may provide important information about in-
ternational economic discourses. Chapter 9 outlines our model's performance on discourse
fragments reflecting motion term usage in newspaper stories on international economics.
A parallel dissertation project (Bailey 1997) demonstrates how x-schema-like
representations provide sufficient inductive bias to make the problem of hand-action verb
acquisition tractable.
The NTL hypothesis suggests structured connectionism as the appropriate
framework to investigate the role of embodiment in language acquisition and use. However,
rather than express the language phenomena studied in this thesis (aspect and metaphor)
directly in connectionist terms, we found it more convenient to work at a more abstract
level, which we will call the computational level. We do require models expressed at the
computational level to be reduced to the connectionist level. This way, the computational
level serves as a convenient language for model development and testing, allowing human
modelers to understand and communicate model features and results. Detailed simulations
and predictions of our cognitive models may require using the reductions to the connection-
ist level.
The main mechanisms developed in this thesis, namely x-schemas and metaphor
maps, are expressed at the computational level. In Chapter 4, we show the reduction
from x-schemas to their connectionist realization. We are also able to model standard
connectionist primitives as x-schemas. Given that we have a compositional theory of x-schemas
(ref. Chapter 6), we think that x-schemas (which are representable as Stochastic Petri
Nets (ref. Chapter 3)) may be good candidates for formal representations of structured
connectionist theories. If this turns out to be true, we can draw on the vast array of
available algorithms for structural and dynamic analysis to answer questions of scalability and
dynamic behavior of our models.
While x-schemas have been reduced to structured connectionist models, our model
of metaphor interpretation (Chapter 8) has not been reduced to the connectionist level.
While we can see our way through to an implementation of metaphor maps as connectionist
networks, the use of Belief nets to represent knowledge of international economics and the
associated inference algorithms to compute the global impact of network updates present
a more serious challenge for connectionist implementation. We don't yet know of a good
way to accomplish this reduction, and this is a shortcoming of the thesis that we will try
to remedy in future work. Chapter 8 (Section 8.6.2) outlines some initial thoughts in this
regard.
We are well aware that computational perspectives and information processing
abstractions of brain function (such as connectionist models) may fall far short of addressing
Chapter 3
corresponds to the first part of the Neural Theory of Language conjecture (NTL) outlined
in Chapter 2. While the NTL conjecture applies to a whole variety of embodied concepts,
this Chapter will be concerned with providing evidence for the NTL conjecture as applied
to embodied action and event descriptions.
In the case where the word labels an intentional motor action, our representa-
tional mechanism is able to model the high-level action monitoring and control required to
carry out the action (in a command obeying mode). The representation is partly inspired
by results in high-level cortical motor control schemas (Sternberg 1978; Bernstein 1967),
leading us to refer to our verb model as executing schemas or x-schemas. X-schemas are
parameterized routines with internal state that execute when invoked. Linguistic input such
as the verb label, path, manner, and aspect provides information and supplies parameters
required for the initiation and control of the underlying x-schema. Thus x-schemas are
dynamic, fine-grained, distributed action controllers that tightly couple action and reaction in
an uncertain and rapidly changing environment. Our x-schema based model of motion and
manipulation verbs clearly suggests that the semantics of such verbs is inherently dynamic
in this sense.
X-schemas are executing graphical structures that can be used for planning and
monitoring an action or to provide real-time simulative inference. The version of x-schemas
that we are currently using is an extension of a formalism known as Petri nets in computer
science (Reisig 1985). Their most important features are the ability to model both events
and states in a distributed system, clean ways to capture concurrency and event-based
asynchronous control in addition to the control primitives of sequence, conflicts and decisions.
Our extensions include the addition of hierarchical control, parameterization and
stochasticity. We state and prove a theorem establishing the formal connection between x-schemas
and Petri nets, which allows us to tap into the vast literature on Petri nets for specification,
design and analysis of x-schemas. We believe that x-schemas provide a good formalism for
modeling many important issues in embodied cognition.
The rest of this chapter provides some motivation for our choice of representation
both from research in biological control theory as well as from mobile robotics and AI.
We then outline some computational requirements of x-schemas and our formal model of
x-schemas with examples.
Researchers have also investigated the role of afferent signals in the walking
system of the cat, where a sensory signal at the end of the stance phase switches the motor
program from stance to swing (Grillner 1985). Pearson (1993) reviews the various results
and hypotheses regarding the role of afferent feedback in vertebrate and invertebrate motor
control. He concludes that the basic reason for afferent feedback is that, to produce effective
movements, motor output must be synchronized and coordinated with ongoing positions,
forces and movements in peripheral structures. Such feedback signals are important for
establishing the execution trajectory of motor programs; they may reinforce ongoing motor
activity or conversely help switch from one synergy to another.
Tanji and Shima (Tanji and Shima 1994) have found that the cerebral cortex of
monkeys contains cells whose activity is exclusively related to a sequence of coordinated
multiple movements. They found such activity in the supplementary motor cortex (1, 2)
and hypothesize that these cells contribute a signal about the control of multiple anticipated
movements for planning several movements ahead. Recent evidence further suggests that
some motor areas involved in planning motor sequences are active even when actions are only
thought about, including mental imagery or imitative imagination (Gallese 1996; Grafton
1996). This result suggests the possibility of a single mechanism for high-level motor control
and action-verb semantics.
All the research above serves as an inspiration for the model described here. It
must be noted that the literature on motor control is vast and often controversial since the
issues are often extremely delicate and complex. From a biological standpoint, it can be
argued that understanding motor control is at least as important as understanding how
language works, since it affects all organisms. In this thesis, we argue that postulating an
explicit connection between language and certain aspects of high-level motor control and
planning allows us to look at converging constraints from both fields (and their numerous
associated sub-fields and branches). While the work reported here mostly attempts to argue
for the dynamic and active nature of action representations in language understanding, in
Chapter 5 we will see how aspectual distinctions made in language point to a richer and
more fine-grained representation of events than is normal in classical theories of actions.
Schemas (Lakoff 1987). Specifically, Mark Johnson argues (Johnson 1987) that schemas
need to be dynamic patterns rather than fixed or static images, and that schemas are
"fluid" and "malleable" structures that are capable of imaginative reasoning rather than
frame-like templates with slots needing to be filled in. Thus far, the work in Cognitive
Semantics has lacked computational models for such theories, and consequently such ideas
cannot currently be used in natural language understanding or problem solving systems.
Part of the work outlined in this thesis, including our x-schema based representation, tries
to provide a dynamic system framework to investigate issues of cognitive semantics. As we
continue, we will make explicit the connection to existing theories in this area.
[Figure (diagram not recoverable). Surviving node labels: destination, rate, co-ord, Vision,
Posture, Visual ok, bypass, test, test fail, move test foot, until destination.]
From the discussion above, we conclude that x-schemas must satisfy a number of
computational requirements.
The state of an x-schema is defined by its marking. The firing rule produces a
change in the state of an x-schema by taking it from one marking to the next. Given an
x-schema and a marking, we can execute the x-schema by successive transition firings. This
can continue as long as there is at least one enabled transition that can fire. The x-schema
firing rule semantics allows enabled transitions to fire in a completely distributed manner
without any global clocks or central controllers (see footnote 3). Execution halts at the state
where there is no enabled transition. This naturally allows us to extend the earlier definition
to define an extended next-state function for x-schemas.
In all further discussion, we will use the extended next-state function to simulate
x-schema evolutions. As an example, consider the actual simulation starting from the initial
configuration in Figure 3.2 to the final deadlocked (no transition is enabled) condition. The
basic x-schema shown in Figure 3.2 shows the application of the next-state function to
simulate x-schema evolution.
Figure 3.2 shows a very simple x-schema with a single event/action A with duration
1 time unit (D = 1), three Input Places, a resource place R, an enable place E, and
Footnote 3: However, our sequential simulation adjusts step size to be able to fire multiple
enabled transitions in a single step.
Figure 3.2: Executing a basic x-schema (panels labeled by time step, T = 1 through T = 5).
The bold lines on arcs indicate that token flow is possible on that specific arc. The amount
flowing will depend on the weight of the arc. Bold lines on transitions and input arcs (arcs
entering transitions) imply that the specific transition is enabled, while bold lines on
transitions and outgoing arcs imply token transfer to the output places. The darkened
transition at T = 4 corresponds to the firing of A.
an inhibitory place, I. There is a single output place, O. The weight of the resource input arc
is 3, representing the requirement that at least 3 units of the resource be present before
the action is enabled.
In the graphical representation of x-schemas, Places are drawn as circles, and
Transitions as boxes. If a Place p is marked with the integer k, we say that the "Place
p is marked with k tokens." Graphically, this is illustrated by the presence of a black
dot with the integer k alongside the relevant Place p. Whenever a specific token type is
required by a transition from a specic place, or when a specic token type is present at
the input place to a transition, it will be indicated by the appropriate value.
As shown in Figure 3.2, the initial situation is one where there is no inhibitory
input (I = 0), the enable input is marked (E = 1), and the resource input holds 2 units
(R = 2). This is insufficient for the action A, so the corresponding transition is not enabled.
This is the situation at time T = 1. At this point (asynchronously), by some magic, three more
units of the required resource are produced. This is sufficient to enable the transition at
time T = 2. The duration of 1 time unit keeps the transition enabled at time T = 3 until
it fires at time T = 4. In the current implementation, firing takes place instantaneously
(but one could have enable-to-fire and firing durations as well). Executing the next-state
function using the firing rule above takes the x-schema to a new state shown in the bottom
row of Figure 3.2. The new state at time T = 4 shows the depletion of 3 units of resource
R, in addition to marking the output O. At this point, the resource R again drops
below the required level, and the transition returns to the initial state of being inactive or
unenabled.
Example 2 shows the actual simulation output that results from applying the
extended next-state function.
Example 2 Simulating the basic x-schema in Figure 3.2. The basic x-schema shown
in Figure 3.2 has 4 places and one transition. The places are labeled R for Resource,
E for Enable, I for Inhibit and O for Output. In the x-schema simulation display, each
place is mapped to a column in the table. This mapping is usually fixed at the beginning of
the simulation and is the "x-schema place to column mapping" below. In the simulation
of Figure 3.2, column 1 in the table below corresponds to the Resource R, column 2
corresponds to the Enable place E, etc. Each row is labeled by a time and a marking. A
marking is a state vector which shows token distribution over the x-schema. For instance,
the marking M0 below corresponds to the vector [2 1 0 0]^T, signifying a state where there
are 2 units of resource R, the enable input place E to the transition is marked (has a
token), and there are no tokens in the inhibitory or output places. Similarly, the marking
M1 corresponds to the vector [5 1 0 0]^T, where the number of resource units has increased to
5 while the rest of the vector is the same as in M0. Once this happens, note that the transition
labeled Event is Enabled. The simulation output for these and future time steps is shown
below.
x-schema place to column mapping:
1 = Resource
2 = Enable
3 = Inhibit
4 = Output
States:
Time Marking 1 2 3 4 | Transitions to...
================================= | -----------------
T=1 M0 | 2 1 0 0 | ( )
T=2 M1 | 5 1 0 0 | ( M3 ) (Label: Event) (Status: Enabled)
T=3 M2 | 5 1 0 0 | ( M3 ) (Label: Event) (Status: Enabled)
T=4 M3 | 2 1 0 1 | ( )
T=5 DEADLOCKED=========================
Figure 3.3: X-schema tokens are typed to represent run-time parameters and bindings. This
allows a large net to be collapsed into a much smaller net that allows run-time binding.
(Panels a) and b) show the ordinary nets for transitions Ta and Tb; panel c), labeled
Parameterization, shows the typed net.)
preconditions to the transition Tb and Rb is the output place of the transition (Figure 3.3
b)). Transition Ta is enabled (in Figure 3.3 a)) since it has tokens in all its input places
(no inhibitory place is shown to keep the exposition simple). Firing the transition (the
event occurs) implies that the output Ra would be augmented with a token. On the other
hand, in Figure 3.3 b) the transition Tb is not enabled, since there is no token in the place
entitled Ba. Were it to be enabled, then firing it would result in a token at the Rb place.
Collapsing these nets into a typed net results in the typed network (called a High Level
Net or Colored Net in the Petri net literature) shown in Figure 3.3 c). Here we have tokens
that can be of type a, b, or <a, b>. The place entitled A has tokens that can be of
type a or b, as does the place B. The place R has tuples of the form <x, y> (where
x and y are variables). The initial state from Figure 3.3 a) and b) is represented as the
token types (shown by their type value) at each of the input places. Firing the transition in
this state results in the tuple <a, b> being added to the place R, augmenting the current
marking. This is the situation shown in Figure 3.3 c).
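The collapse into a typed net can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis implementation: the names `fire_typed`, `place_A`, `place_B` and `place_R` are hypothetical, input tokens are assumed to be consumed on firing, place Ab is assumed to be marked, and the guard (x and y must differ) is inferred from the fact that the two unfolded transitions pair a with b and b with a.

```python
from itertools import product


def fire_typed(place_a, place_b, place_r, guard=lambda x, y: x != y):
    """Fire the typed transition once if some binding (x, y) passes the guard.

    Returns the binding that was used, or None if the transition is not enabled.
    """
    for x, y in product(sorted(set(place_a)), sorted(set(place_b))):
        if guard(x, y):
            place_a.remove(x)       # assumption: input tokens are consumed
            place_b.remove(y)
            place_r.append((x, y))  # deposit the tuple (x, y) in place R
            return (x, y)
    return None


# Folded initial state of panels a) and b): A holds tokens a and b,
# B holds only b (the place Ba of panel b) is empty).
place_A = ["a", "b"]
place_B = ["b"]
place_R = []

first = fire_typed(place_A, place_B, place_R)   # binding x = a, y = b
second = fire_typed(place_A, place_B, place_R)  # no binding left
print(first, second, place_R)
```

One typed transition stands in for both Ta and Tb here; which of the two "fires" is decided at run time by the tokens that happen to be present.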
In Section 3.10 (Lemma 1), we formally show that High Level nets (such as the
one in Figure 3.3 c)) do not add any representational power (as long as the number of
types is finite) by using a well known result from Petri net theory that shows that any
net with a finite number of types can always be unfolded into an ordinary net by adding
special places for type combinations. Crucially, the extension to individual tokens preserves
the liveness (if a transition could be enabled in the typed (ordinary) net it would be in the
equivalent ordinary (typed) network) and reachability (any state that can be reached in the
typed (ordinary) network can also be reached in the equivalent ordinary (typed) network)
properties of the underlying Petri net of the x-schema. So in effect, using token types (or
run time parameters) greatly reduces the size of the net required while preserving important
dynamic and structural properties of our representation.
The second x-schema extension pertains to the different types of transitions and
firing rules allowed. Figure 3.4 shows the three types of x-schema transitions, namely
stochastic transitions, durative transitions, and instantaneous transitions.
[Figure 3.4: the three types of x-schema transitions (stochastic, durative with duration D,
and instantaneous), each shown with its graphic and its firing function over time.]
integer which varies depending on the nature of the transition. Instantaneous transitions
fire without any delay after being enabled (D = 0). In addition, we will see in Figure 3.5
how we can use these primitive transition types to implement hierarchical transitions that
correspond to the activation and execution of a subnet.
Third, input arcs onto the individual transitions are typed with the following types:
1. Enable arcs correspond to preconditions of an action. Such arcs have a weight (also
called multiplicity) of 1. Executing an action (firing a transition) leaves the marking
of the corresponding place unchanged. Firings along enable arcs only occur if the output
of the transition is unmarked (such firings are called contact-free in the Petri net
literature (Reisig 1985)).
2. Consume arcs correspond to resources that are consumed during execution. The
amount consumed corresponds to the multiplicity of the corresponding arc. Consume
arcs are standard Petri net arcs.
3. With the introduction of inhibitor arcs, a transition t is enabled when all of its
non-inhibitory inputs I_s are marked with #(I_s) >= w_st and places with inhibitory arcs
onto t are empty. In general, the introduction of inhibitory arcs can potentially increase
the modeling power of x-schemas to be that of Turing machines, with a corresponding
decrease in decision power; however, an x-schema with inhibitory arcs can always be
transformed into an equivalent ordinary net without inhibitor arcs. This is possible
because the underlying Petri net of an x-schema is bounded. The reader is referred
to Section 3.10 for the proof.
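The three arc types can be read as a single enabling test plus a firing step. The Python sketch below is one possible reading, not the thesis code: the `Transition` class, its field names, and the functions `is_enabled` and `fire` are hypothetical, and the output arc weight of 1 is an assumption taken from the trace in Example 2. It is exercised on the numbers of Figure 3.2.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    enable: list = field(default_factory=list)   # precondition places (weight 1, not consumed)
    consume: dict = field(default_factory=dict)  # resource place -> arc weight
    inhibit: list = field(default_factory=list)  # places that must be empty
    produce: dict = field(default_factory=dict)  # output place -> arc weight


def is_enabled(t, marking):
    """Enabling test combining enable, consume and inhibitor arcs."""
    return (all(marking[p] >= 1 for p in t.enable)
            and all(marking[p] >= w for p, w in t.consume.items())
            and all(marking[p] == 0 for p in t.inhibit)
            and all(marking[p] == 0 for p in t.produce))  # contact-free reading


def fire(t, marking):
    """Return the next marking; enable places are left untouched."""
    if not is_enabled(t, marking):
        raise ValueError("transition is not enabled")
    new = dict(marking)
    for p, w in t.consume.items():
        new[p] -= w
    for p, w in t.produce.items():
        new[p] += w
    return new


# The single event of Figure 3.2: needs 3 units of R, E marked, I empty.
event = Transition(enable=["E"], consume={"R": 3}, inhibit=["I"], produce={"O": 1})
m0 = {"R": 2, "E": 1, "I": 0, "O": 0}
m1 = {"R": 5, "E": 1, "I": 0, "O": 0}
m3 = fire(event, m1)
print(is_enabled(event, m0), is_enabled(event, m1), m3)
```

Keeping the arc types as separate fields makes the two ways of disabling an action explicit: marking an inhibitory place, or letting a consumed resource fall below its arc weight.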
The following theorem establishes the formal equivalence between x-schemas and
Petri nets.
Section 3.10 details the proof of the theorem. The theorem is important to our
effort since Petri nets are one of the most popular and well studied computational formalisms
for specifying, modeling, and analyzing highly distributed and concurrent systems (Murata
1989). Algorithms and analysis techniques from that literature can directly be brought in to
our work. For instance, to model hierarchical action sets with variables and parameters, we
extend the basic model to allow tokens to carry variable binding information (i.e. they are
individuated and typed). The expressive power remains unchanged since it is well known
in Petri net theory that a net with a finite set of colors can be unfolded into one with a
single color. Furthermore, the bounded nature of the underlying network ensures that the
reachability graph of the marked x-schema is finite.
[Figure 3.5: x-schema control primitives: a primitive action (Event with places R, E, I, O);
a hierarchy of primitive actions (Begin, Ongoing, End); concurrency and synchronization
(Fork, Join); alternative primitive actions; Repeat Action Until Condition; and
If Condition Then Action1 Else Action2.]
enabled, it can fire, leading to a new marking where all output places and enable places are
marked as a result of the firing of the transition t ∈ T.
\[ M_{i+1} = M_i - \sum_{k} w_{kt}(R_k) + \sum_{j} w_{tj}(O_j) \tag{3.1} \]
Notice that the firing of transition t at the ith time step changes the marking
vector (M) from its old value of M_i to its new value M_{i+1} (the additions and subtractions
above are vector operations). For instance, the marking vector in Figure 3.2 (ref.
Example 2) showed the effect of firing the transition Event with marking vector M2 = [5 1 0 0]^T,
which resulted in the new vector M3 = [2 1 0 1]^T in accordance with the rule shown in
Equation 3.1.
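As a check, Equation 3.1 can be instantiated on the firing of Event in Example 2. Ordering the places as (R, E, I, O), and assuming the output arc has weight 1 (which is what the simulation trace shows), the resource arc of weight 3 contributes the subtracted vector and the output arc the added one:

```latex
\[
M_3 = M_2 - w_{Rt}(R) + w_{tO}(O)
    = \begin{bmatrix} 5 \\ 1 \\ 0 \\ 0 \end{bmatrix}
    - \begin{bmatrix} 3 \\ 0 \\ 0 \\ 0 \end{bmatrix}
    + \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}
    = \begin{bmatrix} 2 \\ 1 \\ 0 \\ 1 \end{bmatrix}
\]
```

The enable and inhibit entries are untouched because neither arc type contributes a term to the sums.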
The enable places remain unchanged from the old marking, while the appropriate
amount of resource tokens is consumed by the firing from all the resource input places,
and the appropriate amount of output tokens is deposited at all the output places of the
transition. As a side note, the mechanism generalizes over the STRIPS action formalism
(Fikes & Nilsson 1971), in allowing measure fluents such as resources to be modeled. Of
course, a central distinction between x-schemas and STRIPS plans is that x-schemas are
dynamic systems that execute, and while one could run goal regression algorithms and other
types of theorem provers over x-schemas using them as declarative structural entities, it is
the dynamic properties of x-schemas that concern us in this work.
To model hierarchical actions, we use a pair of instantaneous transitions that
correspond to the beginning and end of the subnet. This is the situation shown in the second
example of Figure 3.5. Here the token at the begin input place (signifying the
beginning of subnet invocation) is split into two, one token going to the ongoing place,
signifying that the subnet in question is executing (allowing for time-out procedures if
needed), and the other going to the first action/event of the subnet. When the subnet
completes execution, the finish transition is able to fire (assuming that the token still exists
at the ongoing place). Note that this is an example of the token splitting (fork) and token
merging (join) behavior shown as the third control behavior outlined in Figure 3.5.
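The begin/finish bracket around a subnet can be sketched as an ordinary net run to deadlock. This is a minimal sketch under assumptions, not the thesis implementation: all place and transition names (`in`, `ongoing`, `first`, `done`, `out`, `begin`, `action`, `finish`) are hypothetical, the subnet is reduced to a single action, and all transitions are treated as instantaneous with consuming input arcs.

```python
def enabled(t, m):
    """A transition is enabled when every input place holds enough tokens."""
    return all(m.get(p, 0) >= w for p, w in t["in"].items())


def fire(t, m):
    """Consume input tokens and deposit output tokens; returns a new marking."""
    m = dict(m)
    for p, w in t["in"].items():
        m[p] -= w
    for p, w in t["out"].items():
        m[p] = m.get(p, 0) + w
    return m


def run(transitions, m):
    """Fire enabled transitions one at a time until none is enabled."""
    trace = []
    while True:
        ready = [name for name, t in transitions.items() if enabled(t, m)]
        if not ready:
            return m, trace
        m = fire(transitions[ready[0]], m)
        trace.append(ready[0])


transitions = {
    # fork: the begin token is split into an "ongoing" marker and the first action's input
    "begin": {"in": {"in": 1}, "out": {"ongoing": 1, "first": 1}},
    # the subnet, reduced here to a single primitive action
    "action": {"in": {"first": 1}, "out": {"done": 1}},
    # join: finish needs both the ongoing marker and the subnet's completion
    "finish": {"in": {"ongoing": 1, "done": 1}, "out": {"out": 1}},
}

final, trace = run(transitions, {"in": 1})
print(trace, final)
```

If the token at the ongoing place were removed while the subnet was still running (for example by a time-out), the finish transition could never fire, which is the behavior the parenthetical remark in the text points at.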
Figure 3.5 also shows the representation of alternative actions available to the
agent at a given state. The simple template shows that the marking vector changes in an
identical fashion if either transition is fired. Of course, there may be other side-effects which
may be unique to which action is performed (such as different amounts of resources consumed
or different world conditions becoming true as a result of choosing a specific alternative),
which can be easily modeled as well. In addition, the x-schema model presented here is able
to model both conflicts and decisions (both cases where taking a specific action (firing a
specific transition) may lead to widely differing next states).
The last two control behaviors pertain to the standard constructions of repeating a
behavior until a specific condition is reached, and conditional branching. In the first case, a
transition continues to be enabled until the condition is true, at which point it is inhibited
from firing. In the second case, the presence or absence of a condition activates or inhibits
the firing of the appropriate transition.
Figure 3.6: Modeling Motion Control Parameters (surviving panel labels: a) Durative;
b) Resource consuming; f) Goal Based Enabling). Here the hexagonal nodes on the right
of the figure correspond to sub-schemas (not shown). Bold transitions and input arcs imply
that the corresponding transition is enabled. Bold output arcs imply token transfer to the
output places of a firing transition (shown in bold).
One interesting thing to note from the discussion above is that from purely sensory-
motor arguments we made the case for two different modes of disabling an action, one of
explicit inhibition, and the other of consumption of a needed resource. If we abstract
this duality to reasoning in general and in particular to the semantics of negation, we
note that negation would be constructive in an x-schema based reasoning system, where it
is either possible to model negation as the explicit disabling of the negated entity or as
the consumption of a needed condition or resource. The implications to reasoning about
negation in general are not developed further in the research reported here, but is part of
our future work.
The issue of dynamic resource modeling is central for any autonomous agent to
function in a dynamic, uncertain environment. Requirements include being able to check
and monitor conditions and resource usage. These include monitoring resource consump-
tion (energy level), as well as respecting mutual exclusion or temporary locking constraints
(can't hold two blocks at the same time), and enabling and disabling conditions (can't walk
if the ground is slippery). Agents should be capable of goal based enabling and should be
able to monitor and remain active until achievement of the goal. Figure 3.6 shows the
x-schema templates for resource consumption, production and locking. Note that the amount
of resource required for the action involved is modeled as the weight $w$ ($w \in \mathbb{N}$) on the
resource input arc, and in the production case, the output arc weight $p$ ($p \in \mathbb{N}$) repre-
sents the amount of resource produced as a result of the action. The top right template in
Figure 3.6 shows the model of goal-based enabling. The key thing to note here is that the
relevant x-schema continues to be enabled as long as the goal is not satisfied (provided, of course,
that all other pre-conditions are satisfied and resources are present). Once the x-schema finishes
execution, it marks the goal as satisfied, disabling the x-schema from further execution.
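The resource and goal templates can be sketched in Python under simple assumptions: a marking is a dictionary of token counts, the input arc weight `W` is the amount of resource consumed, the output arc weight `P` is the amount produced, and goal-based enabling is modeled by an inhibitor arc from a `goal_satisfied` place. All names and numeric values are illustrative and are not taken from the implementation.

```python
def enabled(marking, inputs, inhibitors=()):
    """Inputs need at least the arc weight in tokens; inhibitor places must be empty."""
    return (all(marking.get(p, 0) >= w for p, w in inputs.items())
            and all(marking.get(p, 0) == 0 for p in inhibitors))

def fire(marking, inputs, outputs):
    """Consume the input arc weights and produce the output arc weights."""
    new = dict(marking)
    for place, w in inputs.items():
        new[place] = new.get(place, 0) - w
    for place, p in outputs.items():
        new[place] = new.get(place, 0) + p
    return new

W, P = 2, 3   # illustrative arc weights: resource consumed and resource produced

action_inputs = {"energy": W}
action_outputs = {"product": P, "goal_satisfied": 1}
action_inhibitors = ("goal_satisfied",)   # goal-based enabling

m0 = {"energy": 5, "product": 0, "goal_satisfied": 0}
can_start = enabled(m0, action_inputs, action_inhibitors)    # enough energy, goal open
m1 = fire(m0, action_inputs, action_outputs)
can_repeat = enabled(m1, action_inputs, action_inhibitors)   # goal now marked
starved = enabled({"energy": 1, "goal_satisfied": 0},
                  action_inputs, action_inhibitors)          # too little energy
```

The sketch shows both modes of disabling: `can_repeat` is false because the marked goal inhibits the action, and `starved` is false because the needed resource is not available.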
In future discussion, we refer to the abstracted primitives in Figure 3.5 and Figure 3.6
(control templates, duration, periodicity, resources, goals and conditions) as the process
primitives. In examples on the right hand side of Figure 3.6, we see the use of the start
and finish places signifying control status of the underlying x-schema. The start transition
signifies that the x-schema in question is active and has executed preparatory steps to a
state of readiness. The finish transition signifies the completion of x-schema execution,
leading to the state where the x-schema is done. These transitions are part of an underly-
ing motor-control regularity that can be found in all motor programs. The implications of
this statement are fully explored in Chapter 5 when we use this regularity in modeling the
semantics of aspect.
From the cases above and the discussion to follow, the reader should crucially
get the notion of activation and inhibition as being the central x-schema modeling pa-
rameters. Together with another primitive which we will introduce in Chapter 5 (for the
curious, the other primitive corresponds to gating or execution trajectory modification),
we will be in a position to formally define how to construct compositional domain theories
using x-schemas in a way that preserves the active and dynamic properties of x-schemas.
Such a compositional theory will form the backbone of a simulative inference mechanism,
where real-time inferences are made through x-schema executions. But that will have to
wait till Chapter 6.
If one looks at the variations of walk related words and expressions in English,
one realizes that words code for fairly detailed dynamics of walk. Parameters include the
destination, rate of motion, cautiousness and monitoring, step size, forcefulness of the stance
or swing phases, direction of step or walk, purposefulness (or lack) of the walk, energy level
and motivatedness, intention, control or loss of control (fall), resistance provided by the
environment (such as slipperiness of the ground or unexpected encounter of a bump). Many of
these primitives are the same as the process control parameters in Figure 3.5 and Figure 3.6.
Many of the differences in the words above can therefore be captured in an x-schema based
representation of a parameterized walk controller.
unstable ground is quite different from walking on firm terrain. The information and world
condition parameters that may impact x-schema executions are referred to as the world
state. The third set of features that modify x-schema executions are intentional and moti-
vational features. In our work, we found goals, energy levels, and affective states (obviously
correlated to energy and goals) to be important execution parameters for x-schemas. Walking
when tired is obviously more labored and different than walking with high energy.4 We
will jointly refer to such internal state features as internal state. So in effect the f-struct
consists of the three kinds of features described above. Conceptually, such a distinction
is important since it allows for words and lexical items to be coded using a small set of
features under the assumption that the context-sensitive nature of the underlying x-schema
representation will take care of the pragmatic issues in interpretation. This can potentially
help both in lexical acquisition and lexicon extension (ref. Section 3.6).
Figure 3.7 shows an example of encoding the high level walk controller in Figure 3.1
as an x-schema. Circles in the figure correspond to the various states of the x-schema,
hexagons correspond to hierarchical actions as described in Figure 3.5. Rectangles are
single actions/events represented as transitions. We assume the step to be of duration
specified by the rate parameter, and step size to be covering the distance specified by the
size parameter. For the simulation shown in Figure 3.8, we set the step duration to be 5
time units and other conditions and tests to be of 1 time unit (including actions such as
test footing and move test foot ).
Figure 3.7 shows the situation where the walk x-schema is enabled. If enabled,
the presence of the colored (typed) token ok signifies that vision and postural tests return
an ok status. This fires the transition that coordinates visual and postural information and
results in the vision ok place being marked with a token. In parallel, firing the transition
labeled co-ord on the top of Figure 3.7 places a token at the ready place of the walk x-
schema, signifying that the agent is ready to walk, and that the ground ahead is fine. This is
the situation shown in Figure 3.7. At this point, the presence of tokens at both input places
to the transition enabled bypass ok (center of Figure 3.7, and an instantaneous transition)
allows the walk x-schema to bypass the test footing test and enable the step transition.
After a step this cycle is repeated. Note, any time the places entitled Vision or Posture
return a not ok status, the vision ok place will no longer be enabled, and the bypass ok
4 And indeed some words like sluggish or tired may code specifically for these parameters.
[Figure 3.7 graphic. F-STRUCT fields: Name = Walk, Agent = Self, Rate = rate1, Dest = dest1, Ssize = Med, Sdir = Forward, Aspect = Progressive, Ground = Stable, Energy, D(dest1). Legible x-schema labels: Vision, Posture, co-ord, vision ok, Enabled, bypass test, Done, until(dest1).]
Figure 3.7: The walk x-schema shown with the motor control parameters, world state
parameters and motivational and intentional parameters jointly held by the f-struct. Bi-
directional interactions (a few shown as dotted lines) set, get and modify f-struct values
and x-schema executions. For example, the distance to goal is shown decreasing as a result
of x-schema executions, while the world condition pertaining to stability of ground modifies
x-schema executions (to test or not to test footing) as does explicit specification of motor
control parameters (such as rate, destination, and step size).
[Figure 3.8 graphic: timing chart with rows GR = NOT(STABLE), AGENT=SELF, VIS_NOT_OK, ENERGY, TEST_FOOT, MOVE_FOOT, STEP, DIST(DEST1), READY, ENABLED, AT_DEST, DONE, plotted against TIME.]
Figure 3.8: Executing the walk x-schema. The walk x-schema is a dynamic system which
is able to respond to world state and linguistic input. Shown in the figure is the schema
executing with a specific energy level and a specific world state. As long as the visual test
is ok, the test foot branch is not taken. In the third cycle (the x-axis corresponds to the
number of cycles through the until loop), test footing is not used. When the ground is not
stable ahead, the visual test returns ¬ok (shown as dark rectangles) and the test foot branch
is taken. Note the dynamically depleting energy resource and the dynamically changing
fluent representing the distance to goal.
transition will no longer fire, and the test footing test will have to be undergone. Note also
that the responsiveness of the system as shown in Figure 3.7 is limited to the granularity
of a step, since if the vision or postural status changes within a step, it will not affect
performance till the next step. For tighter responsiveness, we need more circuitry, for
instance to be able to stabilize upon encountering an unanticipated bump while stepping.
Our design of x-schemas allows us to model this degree of responsiveness and failure recovery
as well, but we will have to wait till Chapter 6 to see exactly how this is done.
Figure 3.8 shows the changing marking as the x-schema executes. Note specifically
that states (such as the Agent of the walk being bound to self) persist indefinitely. Also
world conditions may change during execution, modifying the trajectory appropriately. In
Figure 3.8, actions and control and perceptual status are shown as dark pulses, while world
conditions are shown shaded. As an example, we see how the ground ahead changes from
being stable to unstable (depicted as a shaded rectangle for the cycle of interest). This re-
sults in the visual test returning a ¬ok status leading to the agent taking the test footing
branch, which may return fail status (shown in Figure 3.8) leading the agent to move-foot
and repeat the test with a different location. At this point, the world removes the obstacle
(asynchronously), and the x-schema is able to respond appropriately by returning visual
status ok and thus taking the next step. Actions have different durations, which can be
controlled by the rate parameter (in Figure 3.8 we used a duration of 5 time units for the
step and one each for the other actions). We also see the model of different actions dy-
namically changing resource levels, such as energy being consumed, as well as the results
of x-schema executions changing world state such as decreasing the distance to destination.
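The execution just described can be approximated by a short Python sketch. It is a sequential simplification of the walk x-schema, not the Petri net implementation: it assumes that each step costs one unit of energy and covers one unit of distance, and that in an unstable cycle the footing test fails once, the foot is moved, and the repeated test succeeds. The function `simulate_walk` and its parameters are hypothetical names.

```python
STEP_DURATION = 5   # time units for a step, as in the run of Figure 3.8
TEST_DURATION = 1   # time units for tests and for move_foot

def simulate_walk(distance, energy, unstable_cycles=frozenset(),
                  step_size=1, step_cost=1):
    """Run the until-loop and return (trace, distance, energy, time)."""
    time, cycle, trace = 0, 0, []
    while distance > 0 and energy >= step_cost:
        cycle += 1
        if cycle in unstable_cycles:
            # Vision reports not-ok: test footing fails once, the foot is moved,
            # and the repeated test succeeds (a simplification of the text).
            for action in ("test_footing", "move_foot", "test_footing"):
                trace.append((time, action))
                time += TEST_DURATION
        trace.append((time, "step"))
        time += STEP_DURATION
        distance = max(0, distance - step_size)   # distance-to-goal fluent
        energy -= step_cost                       # depleting energy resource
    return trace, distance, energy, time

trace, distance_left, energy_left, elapsed = simulate_walk(3, 4, unstable_cycles={2})
for t, action in trace:
    print(f"t={t:2d}  {action}")
```

In this run the second cycle takes the test-footing branch, so the second step starts three time units later than it would on stable ground, while energy and distance to the destination decrease once per step.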
Some readers may already have noticed a certain pattern to the control status
information ready, enabled and done. In fact, it turns out that such information is rou-
tinely coded using grammatical and lexical devices, allowing a speaker to focus on specific
sub-phases of an action or process. This phenomenon is widespread across languages, has been
an object of study since Aristotle, and is known in linguistics as the problem of aspect.
X-schema based dynamic system models seem to provide a novel model of aspectual inter-
pretation that avoids some well known paradoxes in other accounts and appears to provide
a compositional semantics of aspect. Chapter 5 explores these issues in detail.
To reiterate, linguistic input only codes for specific features required by the x-
schema. In all these cases, we assume the existence of a parser that can provide partial
parses by instantiating features of the f-struct to specific values. The assumption holds
for the rest of the work reported in this thesis as well. Other parameters may come from
perceptual or world state (such as the assertion of a stable ground) or from internal mo-
tivational and attitudinal states (such as the energy level in Figure 3.7). Thus, the set
of features coded for by linguistic input, while constrained by the parameters required for
x-schema execution, is still potentially quite large, and can give rise to distinctions in man-
ner, path, rates, aspect, and other well known linguistic distinctions. The x-schema model
for interpretation always executes the linguistic information provided by lexical items and
grammatical devices in the context of the agent's internal state as well as the state of the
world. Thus interpretation is always context-sensitive in this manner.
Modifying the parameters to the walk schema allows the same structure to model
different behaviors. Consider the case of the linguistic input Tiny Step shown in Figure 3.9.
Here, the input label tiny codes for the size of the step, while the input step codes for the
specic x-schema walk as well as the specic sub-schema step . The basic idea is that
the input words may refer to a specic x-schema or a sub-schema. Note that the referential
possibilities bottom out at the appropriate synergies (or atomic actions), so in this scheme
it would be tough to learn or describe in language the internals of a reaching behavior (such
as specic joint angles or torques), though one could of course learn these internals through
practice, demonstration, imitation or a variety of other skill-learning techniques.
[Figure 3.9 graphic. Parsing process for the input Tiny Step: the word sense of tiny supplies Size = v. small (1); the word sense of step supplies Schema = Walk and sub-schema = Step. Resulting binding: walk(self,,unknown,1,forw, prog).]
F-STRUCT
Name | Agent | Rate | Dest | Ssize | Sdir    | Aspect      | Ground | Energy
Walk | Self  |      | Unkn | 1     | Forward | Progressive | Stable | 3
[Legible x-schema labels: Vision, Posture, co-ord, vision ok, Enabled, bypass test, step(med,forw), Ready, Test footing, until(dest1).]
Figure 3.9: Interpreting the input Take Tiny Steps. Verb Phrases specify different initiation
and execution parameters to x-schemas. In this case, the word label tiny is translated to
executing the walk x-schema with a tiny step size. Note that some of the parameters
are left unspecified, and others take on default values. For instance, the rate parameter
is unspecified at the input, but will be interpreted as a result of x-schema execution. The
default destination is unknown, and the other default values are as shown in the figure.
Example 3 shows the actual simulation for the tiny steps input shown in Figure 3.9.
The crucial thing to notice in the simulation is the dynamic nature of the interpretation,
where the actual walk parameters are changing as the simulation proceeds. The simulation
is obviously highly responsive to changing world and motivational state context, exhibiting
the context-sensitivity required for command interpretation in a dynamic environment.
While tiny obviously codes for a small sized step, Example 4 shows the interpreta-
tion for the input Giant Steps. Here the key thing to note is that since the size of the step is
very large, the energy required per step is also extremely high, a fact reflected in the sim-
ulation, which consumes 7 units of energy per step in this case, as opposed to the 1 unit
consumed in the earlier case. In this way correlations between features in the f-struct are
captured in the relevant x-schemas as compiled knowledge used in execution. In Chapter 7,
we will see that another method for capturing correlations is used when the knowledge is not
compiled directly.
As another example, consider the case where the step size is specified as large.
This would correspond to taking a large step, covering a greater
than normal distance per step and requiring fewer iterations to reach the destination.5
Similarly, the same x-schema is capable of exhibiting different types of steps, including
sidestep (where the direction of the step is orthogonal to the destination), cautious
steps (where there is a test footing before every step), etc.
By now it should be clear how our model is capable of handling adverbs like
quickly or slowly as well as other lexical variants such as amble or saunter (coding for
direction (or lack thereof), rate and other manner distinctions). Destinations and paths are
coded through the prepositional attachments to the phrase as shown in Figure 3.10.
Figure 3.10 shows the case where the same x-schema as before is used to model
walking to the store. The individual word senses are not shown partly for ease of exposition
and partly because the parsing process which results in instantiating the f-struct is not mod-
eled anyway. In this case, the destination parameter gets bound to store at run time,
and the condition check at the bottom loop changes from until(dest1) to until(at(store)) .
Figure 3.5 showed how the
repeat <action> until <condition>
control works. Of course, if the destination specification is not in the input utterance, or
is unknown (as was the case with Tiny Step), dest1 is not specified, at_dest (ref.
Figure 3.7) is never marked, and the x-schema continues execution forever, unless some
other parameter like number of steps (as in Take 3 tiny steps) specifies the end state or
energy runs out, or the walk x-schema is otherwise disabled. We will see these types of
cases in Chapter 6.
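The way input words instantiate f-struct features, and the way the destination changes the loop condition, can be sketched as follows. This is an illustration only: the parsing process is not modeled in this work, so the `LEXICON` table, the `bind` function, and the numeric default for a medium step size are hypothetical assumptions.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class FStruct:
    name: str = "walk"
    agent: str = "self"
    rate: Optional[str] = None    # unspecified unless the input supplies it
    dest: Optional[str] = None    # unknown destination by default
    step_size: int = 4            # ASSUMPTION: 4 stands for "medium" on the 1..7 scale
    direction: str = "forward"
    aspect: str = "progressive"

# Hypothetical word senses: each word codes for a few f-struct features.
LEXICON = {
    "tiny": {"step_size": 1},
    "giant": {"step_size": 7},
    "quickly": {"rate": "high"},
    "store": {"dest": "store"},
    "john": {"agent": "John"},
}

def bind(words, base=FStruct()):
    """Instantiate f-struct features from the words; unknown words change nothing."""
    fs = base
    for word in words:
        fs = replace(fs, **LEXICON.get(word, {}))
    return fs

def until_condition(fs):
    """The loop test of the walk x-schema for a given binding."""
    return f"until(at({fs.dest}))" if fs.dest else "until(dest1)"

tiny = bind("tiny step".split())
to_store = bind("john is walking quickly toward the store".split())
print(until_condition(tiny), until_condition(to_store))
```

With no destination bound, the loop condition stays at `until(dest1)` and termination has to come from another parameter, such as a step count or the energy level.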
As an example of how x-schemas could be used for simulative inference just as
readily as command-obeying, consider Figure 3.10 which shows the walk schema inter-
preting the sentence John is walking quickly toward the store. Here, the individual words
5 In the implementation as it stands, the size parameter is restricted to one of seven values,
with 7 indicating a giant sized step and 1 indicating a very small sized step.
[Figure 3.10 graphic. Legible x-schema labels: Vision, Posture, co-ord, vision ok, Enabled, bypass test, step(med,forw), Ready, Test footing, until(dest1).]
Figure 3.10: Interpreting the input John is walking quickly toward the store. Verb Phrases
specify different initiation and execution parameters to x-schemas. In this case, the word
label quickly is translated to executing the walk x-schema with high rate parameter, etc.
Note how different components of the input utterance supply different execution parameters,
such as the prepositional phrase supplying the destination, the adverbial modifier the rate,
etc.
in the sub-categorization frame of walk code for schema name (walk), agent (John), rate
(quickly), destination (store), direction = forward (toward destination), and aspect (present
progressive) (from the be + V-ing construction). While we have seen how the other pa-
rameters are modeled, we still have to see how the progressive specification of is walking is
handled by our model. That is the subject of Chapter 5.
Other expressions and words that use the walk schema setting different parame-
ters (including some affective and motivational states) include amble, saunter, giant steps,
cautious steps, measured steps, crawl, trundle, trudge, and hike. In Chapter 8 and
Chapter 9 we will argue that some of the distinctions between these and other embodied
words point to crucial semantic features that are projected metaphorically onto the do-
main of abstract plans, events and actions such as discourses about international economic
policies.
Essentially, the projection algorithm is the application of the extended state func-
tion to the action set in $S$. Note that the model is easily able to handle both the projection
of explicit actions, as well as world evolutions (including ramifications) which may occur
at any time during projection, giving rise to a new marking vector during execution. The
algorithm is linear in the number of actions in $S$ and can be executed in $O(|S|)$ time. Note
that at each application of the next-state function multiple concurrent transitions may
be enabled and fired. This allows for a natural extension into modeling the behavior of
multiple agents in cooperative action and planning scenarios. However, this possibility will
not be explored further in the work reported here.
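The linear-time character of projection can be seen in a small Python sketch. It assumes that the extended state function simply folds a next-state function over the action sequence, with one firing per action, and that an action whose transition is not enabled leaves the marking unchanged. Both are simplifying assumptions, and the net and the names `next_state` and `project` are illustrative.

```python
def next_state(marking, transition):
    """Fire one transition if it is enabled; otherwise leave the marking unchanged."""
    inputs, outputs = transition
    if any(w > marking.get(p, 0) for p, w in inputs.items()):
        return marking   # ASSUMPTION: a disabled action has no effect
    new = dict(marking)
    for p, w in inputs.items():
        new[p] = new.get(p, 0) - w
    for p, w in outputs.items():
        new[p] = new.get(p, 0) + w
    return new

def project(marking, actions, net):
    """Fold next_state over the action sequence: one pass, linear in len(actions)."""
    for name in actions:
        marking = next_state(marking, net[name])
    return marking

# Hypothetical two-action net: each transition is (input weights, output weights).
net = {
    "pick": ({"hand_free": 1, "on_table": 1}, {"holding": 1}),
    "put":  ({"holding": 1}, {"hand_free": 1, "on_table": 1}),
}
m0 = {"hand_free": 1, "on_table": 1, "holding": 0}
final = project(m0, ["pick", "put", "pick"], net)
print(final)
```

World evolutions could be added to such a sketch by interleaving exogenous marking changes between the action firings.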
3.7.2 Reachability
Reachability is a fundamental problem of dynamic systems. In terms of x-schemas,
the problem is stated as follows.
Given that the state space of an x-schema evolves through execution of the x-
schema, we can define the following entities that correspond to the reachable states of an
x-schema, given an initial marking.6
6 The reachability definitions are equivalent to the Petri net reachability, so the reader knowledgeable
Extending this concept, we can define the set of reachable markings for a given
x-schema in some initial state. Basically, if $s_j$ is immediately reachable from $s_i$, and
$s_k$ is immediately reachable from $s_j$, then $s_k$ is in the reachability set of $s_i$. Thus the
reachability relationship is the reflexive transitive closure of the immediately reachable
relationship.
For general purpose nets the reachability problem has been shown to be decidable,
but to be at least PSPACE-hard. However, in the cases we are interested in (outlined in
Chapter 5), the occurrence sequence over a few time steps is sufficient.
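A bounded version of the reachability set can be computed by breadth-first search over markings. The Python sketch below assumes an ordinary net given as a dictionary of (input weights, output weights) pairs and explores at most `max_steps` firings, which corresponds to looking at occurrence sequences over a few time steps. The names `successors`, `freeze` and `reachable` are illustrative.

```python
def successors(marking, net):
    """Yield every marking immediately reachable from `marking`."""
    for inputs, outputs in net.values():
        if all(marking.get(p, 0) >= w for p, w in inputs.items()):
            new = dict(marking)
            for p, w in inputs.items():
                new[p] = new.get(p, 0) - w
            for p, w in outputs.items():
                new[p] = new.get(p, 0) + w
            yield new

def freeze(marking):
    """Canonical hashable form of a marking (empty places are dropped)."""
    return tuple(sorted((p, n) for p, n in marking.items() if n))

def reachable(m0, net, max_steps):
    """Markings reachable from m0 in at most max_steps firings (m0 included)."""
    seen = {freeze(m0)}          # reflexive: m0 is reachable from itself
    frontier = [m0]
    for _ in range(max_steps):   # transitive: extend one firing at a time
        next_frontier = []
        for m in frontier:
            for s in successors(m, net):
                key = freeze(s)
                if key not in seen:
                    seen.add(key)
                    next_frontier.append(s)
        frontier = next_frontier
    return seen

# Hypothetical chain p1 -> p2 -> p3.
net = {"a": ({"p1": 1}, {"p2": 1}), "b": ({"p2": 1}, {"p3": 1})}
m0 = {"p1": 1}
print(sorted(reachable(m0, net, 2)))
```

For a bounded net the search reaches a fixed point once `max_steps` is large enough, at which point the result is the full reachability set.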
close connection between domain knowledge and interpretation, the uncontrolled nature of
forward inferences led to severe intractability and counter-intuitive results (Charniak 1976).
Later, Schank's group scaled back the effort and used the idea of Scripts, where in-
ferences were limited to narrow domains and text interpretation relied exclusively on stereo-
typed scenarios. This limitation proved too severe, and led to extensions that pertained to
recognizing the importance of goals and their complex interactions in understanding stories
(Wilensky 1983), and the need for dynamic memory assemblages (Kolodner 1984).
We share the basic premise inherent in conceptual dependency theory that sen-
tences are ultimately mapped to a semantic representation that generates inferences. We
also share many of the goals and guiding intuitions from the early work in conceptual de-
pendency, including the idea that a theory of actions should be useful both for planning
and language understanding. Such considerations naturally led us to a rich representation
of actions that is able to model the effects of dynamic enabling and disabling of goals,
and the effects of resource production and consumption. As (Wilensky 1983) convincingly
demonstrates, these are among the essential primitives for building story-understanding
systems. One distinguishing feature of our x-schema-based representation of events is that
it is more fine-grained than other proposals we have seen. This finer granularity re-
sulted from our interest in modeling aspectual distinctions made by different languages (ref.
Chapter 5). We also believe that the finer-grained nature of our representation of aspect
presents one natural way to constrain automatic inference in reasoning about event descrip-
tions. Additional evidence comes from the observation that metaphoric projection of events
across different domains appears to respect the temporal and aspectual distinctions made
by our representation. Our model is thus able to exploit these characteristics in metaphor
interpretation as shown in Chapter 9.
Another distinguishing feature of our representation (see Chapter 6 Section 6.3
for further details) is that x-schemas are executable graph-structures (bi-partite, cyclic
graphs). The graph-based representation allows us to formally state and reason about inter-
schema relations declaratively while using their real-time execution capability for inference.
This allows our representation to be used declaratively for specification and design or
procedurally for projection and automatic inference. We believe this property to be essential
for representations that are to be used both for acting and for reasoning about action
descriptions.
effector command. If the method is not primitive, the robot chooses one of the methods
for the task whose context is currently satisfied and adds that task network to the agenda.
When all subtasks in a specific task network are satisfied, the success criterion is checked.
If true, the task is done and the robot moves on to a new item on the agenda. If a task
fails, the method in which it was a subtask is removed completely, and the task remains on
the agenda.
Chapter 6 describes our compositional theory of x-schemas which bears some
strong resemblances to RAP-based systems. Individual x-schemas are similar to RAPs.
The triggering conditions for x-schema initiation and execution are similar to RAP con-
texts. The goal-based enabling and disabling mechanisms allow the x-schema based system
described in Chapter 6 to avoid sequential search through an agenda to evaluate success
conditions and remove completed tasks. All implemented RAP systems we have seen work
on one task at a time.8 In contrast, the x-schema composition machinery allows for mul-
tiple parallel tasks to be simultaneously active, with their results and effects on the world
state creating new goals and contexts. The price we pay is of course that analyzing x-schema
plans is likely to be harder than analyzing RAP plans. This is where the graph-based rep-
resentation and connection to Petri nets should help. Another difference is that our finer
grained action representation allows for a fairly rich set of inter-schema control relations
to be declaratively specied and modeled. The reader is referred to Chapter 6, Section 6.3
for a detailed description of the set of inter-schema control relations available. In contrast,
while the central components of a RAP based model are easy to specify declaratively, de-
tailed control is often hidden inside a method and we suspect would be fairly hard to specify
formally within the language itself. As (Wilensky 1983) points out, a theory of plans that
can be used in language understanding should be able to reason about fairly complicated
resource and goal interactions. We take that to mean that such detailed information should
be available for theory specication and analysis, rather than hidden inside some procedure.
We believe the model in Chapter 6 to be consistent with this view.
In this connection, we believe that x-schemas provide a formal model of reactive
plans and behaviors. The resulting active model has a well specied real-time execution
8 This is not an inherent limitation of RAP, since the responsibility of tracking inter-task dependence
and status is placed on the scheduler. But putting this burden on a scheduler may not be the best overall
design since it will potentially slow down the responsiveness of the system while the scheduler checks if all
dependencies are maintained before searching the agenda for new tasks.
semantics, and is capable of modeling hierarchical and concurrent actions and of action and
execution monitoring. Furthermore, x-schemas can take into account serendipitous effects,
provide limited flexibility in the face of exogenous events, and handle event-based interrupts.
X-schemas are capable of modeling atomic and durative actions, as well as uncertainty in
world transitions and action effects. Finally, as Chapter 4 shows, x-schemas have deep
connections to connectionist models of reasoning, allowing for a single representation to be
viewed as a plausible computational model of high-level, coordination-oriented control of
synergies, as well as a candidate representation for task-oriented robotic behaviors.
analysis and specification algorithms to x-schema design and analysis. Preliminary efforts
in comparing the two representations lead us to believe that a detailed analysis of the
relationship between T-R programs and x-schemas will be highly productive.
3.9 Conclusion
X-schemas are a graphical and mathematical representation with fine enough gran-
ularity to model asynchronous and concurrent action systems with interacting plans and
goals. The active nature of x-schemas allows them to be highly reactive and respond asyn-
chronously to changes in perceived or projected world state. The connection to Petri nets
allows the single system to simulate dynamic, stochastic, and concurrent activities as well
as allow for the setting up of state equations and other mathematical models to theoret-
ically study its structural and dynamic behavior. X-schema executions respect resource
constraints and directly model the production and consumption of resources. Furthermore,
x-schemas have a truly concurrent semantics which allows them to be natural candidates
for modeling multi-agent systems.
The thesis that motion and manipulation words refer to parameters that initiate
and supply control and monitoring parameters to the underlying action x-schema allows
us to use a uniform mechanism for action and interpretation. Our hypothesis naturally
results in a dynamic semantics for action verb phrases. While we are obviously partial to
our specific x-schema model and feel that the extended Petri net representation can capture
much of the needed functionality, our allegiance is ultimately to a larger class of models
that can exhibit the high responsiveness and dynamic system characteristics involved and
that can be encoded in a structured connectionist framework. To this end, the next chapter
outlines our mapping from x-schemas to a popular and widely used structured connectionist
model. In the rest of the work reported in this thesis, we attempt to argue that an x-
schema-like dynamic model is not only useful for interpreting expressions about motion
and manipulation words but is essential for context-sensitive interpretation and to generate
pragmatic discourse-level inferences about abstract event and action descriptions.
3.10 Appendix A
Here we prove the equivalence between x-schemas and Petri nets. The first part
is concerned with the non-timed case, while the second is concerned with the timed and
stochastic case. We begin with the following definitions.
We represent x-schemas as a modified class of high-level Petri Nets (Reisig 1985),
where the modifications allow for stochastic transitions, as well as special compositional
states and transitions (decomposable to an ordered collection of primitive states and
transitions), and run-time parameter binding. We will show that in the case where the set
of different objects to be bound to is finite (including finite parameter values), x-schemas
are equivalent to bounded ordinary Petri nets. We begin by defining the basic Petri net.
Definition 13 Bounded-net
Lemma 1. HLPNs with a finite number of types can be unfolded into a unique ordinary
net $PN$. Also, Bounded HLPN $\Rightarrow$ Bounded PN.
Proof: A $HLPN = (P, T, C, I, O, M_0)$ can be unfolded into an ordinary
Petri net $PN$ in the following way:
1. $\forall p \in P, c \in C(p)$, we create a place $(p, c)$ in $PN$.
2. $\forall t \in T, c' \in C(t)$, we create a transition $(t, c')$ in $PN$.
3. We define the input ($I$) and output ($O$) function on $PN$ as
From the discussion above we conclude that x-schemas are equivalent to ordinary
Petri nets without stochastic transitions.
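Steps 1 and 2 of the unfolding in the proof of Lemma 1 can be sketched in Python as below. The sketch assumes that the color function is given as a dictionary from each place or transition name to its finite color set; the arc functions of step 3 are not sketched. The function name `unfold` and the example net are hypothetical.

```python
def unfold(places, transitions, colors):
    """Steps 1 and 2 of the unfolding: one ordinary node per (node, color) pair.

    `colors` maps each place or transition name to its finite color set C(x).
    """
    pn_places = [(p, c) for p in places for c in colors[p]]
    pn_transitions = [(t, c) for t in transitions for c in colors[t]]
    return pn_places, pn_transitions

# Hypothetical example: a hand that is empty or holds one of two blocks.
colors = {"hand": ["empty", "block1", "block2"], "grasp": ["block1", "block2"]}
pn_places, pn_transitions = unfold(["hand"], ["grasp"], colors)
print(pn_places)
print(pn_transitions)
```

Because every color set is finite, the unfolded net has finitely many places and transitions, which is what the finiteness condition in the lemma requires.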
The negative exponential distribution renders the reachability graph of the SPN
isomorphic to a continuous time Markov chain. The Markov chain $MC$ can be obtained
from the reachability graph as follows: The $MC$ state space is the reachability set $R(PN)$
of the marked $SPN$. In $MC$, the transition rate from $M_i$ to $M_j$ is given by $q_{ij} = \lambda_k$,
corresponding to the firing rate of the transition $t_k$ from $M_i$ to $M_j$. If several transitions
lead from $M_i$ to $M_j$, then $q_{ij}$ is the sum of the rates of these transitions. If there is no
link from $M_i$ to $M_j$ in $R(PN)$ then $q_{ij} = 0$ in $MC$.
The steady state distribution of the MC is obtained by solving the linear
equations:
Q = 0 (3.11)
Xs =1
j
1=1
(3.12)
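As a rough illustration of Equations 3.11 and 3.12, the sketch below builds the rate matrix from a list of reachability-graph arcs and solves for the steady state with exact fractions. It assumes an irreducible chain and the usual convention that each diagonal entry is the negated sum of its row (not stated above); the function names and the two-marking example are hypothetical.

```python
from fractions import Fraction


def generator_matrix(states, arcs):
    """arcs: list of (from_marking, to_marking, rate); parallel arcs are summed."""
    idx = {s: i for i, s in enumerate(states)}
    n = len(states)
    q = [[Fraction(0)] * n for _ in range(n)]
    for src, dst, rate in arcs:
        q[idx[src]][idx[dst]] += Fraction(rate)
    for i in range(n):
        # Assumed convention: rows of Q sum to zero.
        q[i][i] = -sum(q[i][j] for j in range(n) if j != i)
    return q


def steady_state(q):
    """Solve pi Q = 0 with sum(pi) = 1 by Gauss-Jordan elimination."""
    n = len(q)
    # Each row is one balance equation (a column of Q) plus a right-hand side.
    a = [[q[j][i] for j in range(n)] + [Fraction(0)] for i in range(n)]
    # Replace the last (redundant) balance equation by the normalization.
    a[-1] = [Fraction(1)] * n + [Fraction(1)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / a[col][col] for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][n] for i in range(n)]


# Two markings; two transitions lead from M0 to M1 (rates 2 and 1), one leads back.
Q = generator_matrix(["M0", "M1"],
                     [("M0", "M1", 2), ("M0", "M1", 1), ("M1", "M0", 1)])
pi = steady_state(Q)
print(pi)
```

Here the two parallel arcs from M0 to M1 are summed into a single rate of 3, as described in the text.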
Definition 14 HLSPN
A HLSPN is a 3-tuple $HLSPN = (HLPN, T, W)$ where
Definition 15 HLGSPN
A HLGSPN is a 4-tuple $HLGSPN = (HLPN, T_1, T_2, W)$ where
$HLPN = (P, T, C, I, O, M_0)$ is the underlying marked High-Level Petri net.
$T_1 \subseteq T$ is a set of timed transitions, $T_1 \neq \emptyset$.
$T_2 \subset T$ is a set of immediate transitions.
$T_1 \cup T_2 = T$ and $T_1 \cap T_2 = \emptyset$.
$W = (w_1, \ldots, w_{|T|})$, $w_i \in \mathbb{R}$, is an array in which each entry
1. Is the rate of a negative exponential distribution specifying the firing delay, when
$t \in T_1$. For such a timed transition, the rate could be marking dependent.
2. Is a firing weight for the immediate transition $t$, when $t \in T_2$. This weight is
also possibly marking dependent.
If $T_2 = \emptyset$, then the GSPN defined above essentially reduces to an SPN. Hence $SPN \subset
GSPN$. The presence of immediate transitions makes the reachability graph of a marked
GSPN non-Markovian, since immediate transitions fire in zero time. Markings where only
immediate transitions are enabled are called vanishing, since an external observer will
never see these states, even though the stochastic process sometimes visits them. For timed
transitions, the stochastic process sojourns for an exponentially distributed amount of time,
and states resulting from markings enabling timed transitions are likely to be seen by an
external observer. These states or markings are called tangible markings.
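The distinction between vanishing and tangible markings can be sketched as a small classifier over the set of transitions enabled in a marking. The function name and the transition names are hypothetical; the case where both timed and immediate transitions are enabled is not covered by the passage above, so the sketch only flags it.

```python
def classify_marking(enabled, immediate):
    """Classify a marking from its enabled transitions.

    enabled: transitions enabled in the marking.
    immediate: the set T2 of immediate transitions.
    """
    enabled = set(enabled)
    immediate = set(immediate)
    if enabled and enabled <= immediate:
        return "vanishing"   # only immediate transitions are enabled
    if enabled & immediate:
        return "mixed"       # not covered by the text; flagged, not resolved
    return "tangible"        # only timed transitions (or none) are enabled


T2 = {"t_select"}            # hypothetical immediate transition
print(classify_marking({"t_select"}, T2))
print(classify_marking({"t_walk"}, T2))
```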
Chapter 4
[Figure 4.2: connectionist encoding of the PUSH x-schema. Focal clusters, each with ?, + and - nodes: ENABLE (with a SWITCH, a parameter node, and a role node), LO (LOCATE OBJECT x-schema, roles r1 and r2), EM (ESTIMATE MASS x-schema), READY, OH (component for orienting and shaping the hand, not specified), RO (REACH x-schema), and AF (APPLY FORCE x-schema, threshold 2). Focal clusters in other x-schemas that invoke the PUSH x-schema connect to the switch; on completion, memory records that the object of PUSH has been pushed.]
executed by the reach schema and a (not shown) component that eventually invokes the
appropriate low level motor commands to orient the hand. Once the hand is properly
oriented and in place to apply the force, the appropriate motor commands for applying
force are executed.
was introduced in the shruti inference engine (Shastri & Ajjanagadde 1993) and has since
been adopted for a wide range of problems (Hummel & Biederman 1993; Valiant 1996).
The fact that the same dynamic binding mechanisms can also apply to active schemas is
potentially of great importance. Neurophysiological evidence suggests that such propagation
of synchronous activity is neurally plausible (Singer 1993) and our latest results suggest
additional functions of the brain where it might be employed.
The ?, +, and - nodes in each cluster serve control and coordination functions. The
? node can be viewed as an initiate or query node. A process initiates an action (or poses a
query for information) by activating the ? node of the appropriate focal cluster. Each ? node
has a threshold (default = 1) which determines the number of active inputs it must receive in
order to fire. The + and - nodes indicate the outcome of schema execution. The activation
of the + (-) node of a focal cluster by a schema indicates the successful (unsuccessful)
completion of the schema. This provides the conditional computation required for complex
schemas and is also used for error recovery. Observe that recurrent connections are a central
aspect of schemas. In particular, every cluster that initiates a schema also receives a signal
when the schema is completed.
Each x-schema has a head focal cluster, the enable cluster, that serves as the
point of initiation. The enable cluster of a shared schema has a switch which controls
the flow of signals into the schema. All schemas that invoke this shared schema have their
appropriate focal clusters connected to this switch (see bottom of Figure 4.2). We assume
that these inter-schema connections are learned as part of learning an automatic behavior
or a reactive plan.2 The switch behaves like a bidirectional k to 1 switch, where k is the
number of external schemas that are linked to this one. It ensures that at any given time,
signals from at most one schema can propagate into the enable cluster of the (invoked)
schema. This switch state is maintained until the invoked schema finishes execution (which
is signaled by the activation of the + or - node in enable).3 If several schemas try to
activate one schema simultaneously, the switching mechanism selects signals from one of
these schemas. The switch also channels the output of the + and - nodes from enable to
the appropriate links feeding back to the invoking schema.
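The behavior of the k to 1 switch can be sketched as a small stateful object. This is only an illustration under stated assumptions: the class and method names are hypothetical, the selection rule (take the first requester) is an assumption since the text does not say how one schema is chosen, and preemption is not modeled.

```python
class Switch:
    """k-to-1 switch guarding the enable cluster of a shared schema (sketch)."""

    def __init__(self):
        self.owner = None  # invoking schema currently connected, if any

    def request(self, callers):
        """Admit at most one caller; requests are ignored while the schema is busy."""
        if self.owner is None and callers:
            self.owner = callers[0]  # assumed selection rule, not from the text
        return self.owner

    def complete(self, success):
        """Route the +/- outcome back to the invoking schema and free the switch."""
        caller, self.owner = self.owner, None
        return caller, "+" if success else "-"


switch = Switch()
first = switch.request(["grasp", "kick"])   # two schemas compete for the shared schema
blocked = switch.request(["kick"])          # switch state is held until completion
outcome = switch.complete(True)             # the + node is channeled back to the caller
print(first, blocked, outcome)
```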
Let us walk through the push schema (refer to Figure 4.2). Imagine that our
2
Connections between schemas have to be created dynamically during deliberative (reflective) planning.
We are not concerned with such reflective processes in this paper.
3
Preemption by a high priority schema is possible but not discussed here.
CHAPTER 4. CONNECTIONIST INFERENCE AND X-SCHEMAS 85
agent has the goal of vigorously pushing a red wagon. This leads to one of the x-schemas
connected to the push schema invoking it by activating the ? node in its enable cluster,
activating the parameter node in the cluster with a high firing rate (to indicate a high force
value), and synchronizing the firing of the role node in the cluster with the appropriate role
node of the invoking schema. Since the object of the action is the "red wagon", this
role would be firing in synchrony with the focal nodes of the concepts "red" and "wagon"
and thereby be dynamically bound to the object description "red wagon". As a result, the
push x-schema is enabled with its force parameter set to high and its object role bound to
the description "red wagon".
The activity in enable of Figure 4.2 leads to the activation of the focal cluster lo
wherein the ? node becomes active and the role r1 synchronizes with the role in enable.
Consequently, the role r1 starts firing in synchrony with "red wagon" and hence gets bound
to that description. lo in turn invokes the locate-object x-schema with the binding "red
wagon". Upon successful execution, locate-object activates the + node in lo and binds
the role r2 with the location where the object matching the description "red wagon" is
located. This binding is expressed by the role node firing in synchrony with the appropriate
"location" node in an egocentric spatial map. The activation of the + node in lo leads to
the activation of the cluster em with its role bound to the location returned by lo. em
estimates the mass of "red wagon" and upon completion activates the + node of em and fires
the parameter node of em with a rate indicative of the mass of the "red wagon".
The activation of the + and the parameter nodes in em leads to the activation of
the ? node in the ready cluster and the setting of the parameter representing the estimated
mass of the object to be pushed. In parallel, the activity of the second role of ready is set
by the parameter node in the enable cluster to indicate the desired relative force, and the
role of ready is bound to the location of "red wagon" by the role r2 in the lo cluster as
described earlier.
The activation of ready marks a significant state in the execution of the push. At
this point the agent has carried out the necessary preparatory steps but has not executed any
actions that affect the environment. Thus ready provides a locus for high-level executive
processes to exercise inhibitory control over the actual execution of the push. It turns out
that such stages are extremely important for language understanding; we will see the
use of such control stages in our model of linguistic aspect, detailed in Chapter 5.
Ready activates clusters oh and ro concurrently and binds their roles to the
location of "red wagon". This leads to the concurrent activation of (i) the Reach
x-schema that brings the agent's hand to the location of "red wagon" and (ii) the component
that shapes and orients the hand in a manner appropriate for carrying out the push action.
Since this component receives a binding specifying the location of the object to be pushed,
it can extract the shape of this object and determine the appropriate shaping of the hand.
reach and the hand orienting components execute independently and signal their
completion (or failure) by activating the + (-) node in RO and OH respectively. The
successful termination of these two components leads to the activation of the AF cluster.
Since the ? node of AF has a threshold of 2, it becomes active only upon receiving a
completion signal from both OH and RO. By this time AF also has its parameters set with
the appropriate values of estimated mass and desired relative force and its role bound to
the location of "red wagon". Now the final component for applying the force is initiated,
which initiates low-level motor schemas for issuing muscle commands to apply force on the
"red wagon". Once this component executes successfully, it activates the + node of the
AF cluster, which activates the + node of enable and signals the successful completion of
push. Finally, the activity of the + node in the enable cluster is propagated via the switch
to the + node of the appropriate focal cluster in the schema that invoked push.
In addition to successful completion, failure is also communicated within the
schema. For example, the failure to locate the object, reach the object, or apply force
to the object can lead to the activation of the - nodes in LO, RO, and AF, respectively.
This would lead to the activation of the - node in enable and signal a failure of push.
Alternatively, the failure of a step can trigger alternate plans or remedial strategies (not shown).
For example, the failure of reach can trigger schemas for walking around an obstacle or
using an implement to extend the agent's reach.
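The control flow of the push walk-through, including failure propagation, can be summarized in a short sketch. It abstracts away synchrony-based binding entirely and only tracks which focal clusters become active and whether the + or - node of enable finally fires. The function name is hypothetical, and treating a failure in EM or OH like the failures the text lists for LO, RO, and AF is an assumption.

```python
def run_push(outcomes):
    """Trace the push schema given the outcome of each component.

    outcomes: dict mapping cluster name -> bool (True activates '+', False '-').
    Returns (trace of activated clusters, final node fired in enable).
    """
    trace = []
    for name in ("LO", "EM"):            # locate the object, then estimate its mass
        trace.append(name)
        if not outcomes[name]:
            return trace, "-"
    trace.append("READY")                # preparation done, nothing executed yet
    trace.append("OH|RO")                # hand orienting and reach run concurrently
    completed = sum(outcomes[name] for name in ("OH", "RO"))
    if completed < 2:                    # the ? node of AF has a threshold of 2
        return trace, "-"
    trace.append("AF")                   # apply force
    return trace, "+" if outcomes["AF"] else "-"


ok = run_push({"LO": True, "EM": True, "OH": True, "RO": True, "AF": True})
no_reach = run_push({"LO": True, "EM": True, "OH": True, "RO": False, "AF": True})
print(ok)
print(no_reach)
```

In the second call the reach fails, so AF never receives both completion signals and the failure is reported through enable.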
The completion of a schema can also trigger an update of the agent's memory.
Thus the completion of push can lead to the agent's episodic memory recording the fact
that the agent has pushed the "red wagon". It can also lead to the remembering of the position
of the "red wagon" at the end of push. In the proposal, a memory lookup is treated as
being similar to a perceptual "look up". Similarly, the updating of episodic or working
memory is treated as being similar to an external motor action.
The encoding of x-schemas described above has been implemented in shruti (by
L. Shastri and Dean Grannes) and tested on examples such as the push schema shown in
Figure 4.2. Details of the model and the implementation can be obtained by pointing your
web browser to "http://www.icsi.berkeley.edu/~grannes/shrutihome.html/".
From the discussion above, we conclude that general purpose structured connec-
tionist binding and inference systems such as shruti can model x-schemas or parameterized
action controllers.
$$y = f(\mathbf{w}^{T}\mathbf{x} - \theta)$$ (4.1)
Here, $\mathbf{w}$ and $\mathbf{x}$ are the weight and input vectors respectively. $f$ is the step
function and $\theta$ is a threshold value. The output $y$ is binary. The weighted sum is represented
as an inner product between the weight vector $\mathbf{w}$ and the input $\mathbf{x}$.
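Equation 4.1 can be written out as a few lines of code. This is a minimal sketch; the function name is hypothetical, and the choice that the step function outputs 1 exactly at the threshold is an assumption.

```python
def threshold_unit(weights, inputs, theta):
    """Binary threshold unit: step function applied to w . x - theta."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum - theta >= 0 else 0  # assumed: fires when sum == theta


# A unit with threshold 2 behaves like a two-input AND.
print(threshold_unit([1, 1], [1, 1], theta=2))
print(threshold_unit([1, 1], [1, 0], theta=2))
```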
While it is easy to construct x-schemas to model the $\Sigma$ (summation) and
$\theta$ (threshold) functions that
perform purely spatial thresholding, we believe that the fact that a biological neuron
may perform both spatial and temporal summation over its inputs is crucially important
in modeling cognitive functions. This property is best exploited in structured connectionist
models such as shruti (Shastri & Ajjanagadde 1993). Therefore, we require our x-schemas
to be able to model these more complex unit functions.
To explore the power of x-schemas to model such connectionist primitives, we
will look at the set of primitives common to most models of structured connectionism. In
particular, the primitives chosen correspond to the most advanced model of structured con-
nectionist networks, namely shruti. shruti is a general purpose structured connectionist
reasoning system that is capable of modeling a class of deductive inferences in a connection-
ist framework. The central idea exploited by shruti is the detection and usage of dynamic
synchrony in making and propagating run time bindings. In this section, we show how
the x-schema model is able to model the set of basic primitives used. We start off with the
x-schema implementation of a key node that appears in many connectionist models called
the 2/3 binder node. We then go on to the more complex spatio-temporal nodes such as
the $\tau$-and and $\rho$-btu nodes used in shruti.
[Figure 4.3: x-schema model of a 2/3 binder node, with inputs Object, Feature, and Value, arc weights 2 and 5, and a Trash node.]
The 2/3 binder node is a key node that has been used in many structured
connectionist models (Feldman & Shastri 1984; Grossberg 1985). The basic idea is to bind
<object, attribute, value> triplets in long-term memory. Consider the object apple. We
would like to store that the attribute color of the apple is red, the attribute shape of
the apple is round, etc. The crucial fact is that we would like to be able to retrieve
the third feature given the other two. For instance, given apple (object) and color
(attribute), we would like to activate red (value). Similarly, given the object (apple) and value
(red), we would like to activate the relation color (attribute). This is done through the
use of a binder node, which activates the third of its inputs, given the other two. The
reader may already have noted that binder nodes are a connectionist implementation of
the feature structures or "f-structs" described in Chapter 3. Recall that f-structs
supply parameters and get updated as a result of x-schema executions. To specify a scheme
where there is mutual exclusivity between different values in a multi-valued "f-struct",
we use a winner-take-all network, described in detail in (Feldman & Ballard 1982;
Feldman & Shastri 1984).
Figure 4.3 shows the x-schema model of a 2/3 binder node. The model is fairly
straightforward. The basic operation is that if two of the inputs to the node become active
(are marked), the node becomes active and marks all (three) of its inputs. We need the
node labeled Trash to flush out excess tokens since the structure is always active. In the
current design, this happens every time 5 tokens accumulate, but the parameter values can
vary.
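The basic operation of the 2/3 binder node can be sketched as a function over the set of currently marked inputs. The function name is hypothetical, and the token accounting handled by the Trash node is not modeled here.

```python
ROLES = frozenset({"object", "attribute", "value"})


def binder_step(marked):
    """One step of a 2/3 binder node.

    marked: set of inputs currently active among object/attribute/value.
    Returns (inputs marked after the step, whether the binder node fired).
    """
    active = len(set(marked) & ROLES) >= 2
    return (set(ROLES) if active else set(marked)), active


# Given apple (object) and color (attribute), the value (red) is retrieved.
print(binder_step({"object", "attribute"}))
# A single active input is not enough to fire the binder.
print(binder_step({"object"}))
```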
[Figure: x-schema models of the TAU-AND node (inputs 1 to n, labels IN, AND, OUT, Active, Drain; delays D = 1) and the RHO-BTU node (labels IN, OUT, input, active, output, inhibit, TRASH, CLK_RESET; delays D = 1 and D = N, where N = #phases).]
purpose. Hence the hexagon. If multiple tokens appear within a single cycle, the first one
is activated, and the others are discarded in one time step to trash. The phase at which the active
place gets marked is the phase at which the output transition will fire in the next cycle. The
durative transition labeled D = N has a delay such that it fires and marks the place labeled output
at the same phase in the next cycle. Thus if the input is periodic, the output will fire
with the same period. If the input is noisy and non-synchronized, the output will be noisy
as well.
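The phase-preserving behavior described in this passage can be illustrated with a discrete-time sketch. It assumes integer time steps and cycles of N phases; the function name is hypothetical, and inhibition and clock reset are not modeled.

```python
def rho_btu(input_times, n_phases):
    """Relay input tokens to the output at the same phase of the next cycle.

    input_times: integer time steps at which input tokens arrive.
    n_phases: number of phases N per cycle (the delay of the D = N transition).
    Only the first token within each cycle is kept; later ones go to trash.
    """
    seen_cycles = set()
    output_times = []
    for t in sorted(input_times):
        cycle = t // n_phases
        if cycle in seen_cycles:
            continue                        # extra token in this cycle is discarded
        seen_cycles.add(cycle)
        output_times.append(t + n_phases)   # same phase, one cycle later
    return output_times


# Tokens at steps 1 and 2 fall in the same cycle, so the second one is dropped.
print(rho_btu([1, 2, 6, 11], n_phases=5))
```

A periodic input (here one token every 5 steps) yields an output with the same period, as the text states.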
Using n $\rho$-btu nodes and two $\tau$-and nodes, the shruti system is
able to encode a predicate with n arguments. A long term fact is encoded in shruti using
a $\tau$-and node with inhibitory links from the argument nodes and the appropriate entities.
Thus with these nodes, the basic functionality of shruti can be achieved.
From the discussion above, we conclude that x-schemas are able to model general
purpose structured connectionist reasoning systems.
4.3 Discussion
The two different parts of the system modeled above suggest a close and deep
connection between high-level sensory motor control and general purpose inference. In this
context, it is interesting that the same mechanisms seem sufficient for reflexive inference,
perceptual motor schemas and the modeling of aspect. Actually this is not surprising if
one takes seriously the notion that language and conceptual structures are grounded in our
body and shaped by our motor and perceptual systems. In this thesis, we will see how
x-schema like models are useful for modeling linguistic aspect (Chapter 5) and for metaphoric
inference (Chapter 8). The use of such representations for hand-action verb acquisition is
explored in (Bailey 1997).
Chapter 5
cal marker system. This makes a compositional account of the semantics of aspect difficult,
giving rise to many paradoxes and problems (Dowty 1979; Moens & Steedman 1988). This
chapter demonstrates that a compositional semantics of aspect can be constructed if we
take the embodiment of action in a neural system seriously.
and is defined by its lack of reference to the internal temporal details of a situation. Russian
has special affixes (such as pos) to denote the Perfective aspect. English does not have a
special construction to mark the Perfective, and the Perfective/Simple Past distinction in
English is context dependent. (Langacker 1987) defines the perfective aspect as a focus on
the end-points of the described activity. In contrast, the Imperfective aspect refers to
the internal state of a situation, namely viewing a situation from within. Here the focus
is not on the end-points but on the internal structure of a process. The term viewpoint is
meant to connote that what is important is the speaker's subjective position with respect
to the described situation.
Linguistic devices can also refer to the internal temporal details of a single situation
by focusing on different logical stages (phases) of a situation. Such devices constitute the
Phasal aspect of an utterance.
Example 5 John is playing tennis.
For instance, Example 5 is an instance of the Progressive aspect (grammaticalized
in English by the be + V-ing construction), and refers to the ongoing process of playing
tennis (a phase of the play tennis situation).
In contrast, Example 6 is one case of the Perfect aspect (grammaticalized in
English by the has + V-ed construction), which indicates that some consequences of the described
situation hold at the time of description. Other cases are described in Section 5.6.1.
One central question in the analysis of aspect is how these three subsystems (the
viewpoint, phasal and inherent aspect) interact in interpretation. In other words, given
that different parts of a sentence or different linguistic devices may code for different subsets
of the features above, how does one construct an interpretation of the entire expression in
real-time?
Figure 5.1: The Schema controller is a generic structure that captures relevant features of
behavior control. [Arc labels recoverable from the figure: iterate, resume, stop, cancel, interrupt, suspend, fail.]
Crucially, the generic controller is itself an active structure or x-schema. The
controller sends signals to individual motor schemas and may transition based on signals from
the underlying schema. Thus nodes in the controller bind to process states in the underlying
x-schema. Directed arcs constrain behavior execution trajectories. The controller is stateful
and the control graph encodes possible process evolution trajectories. Thus if the ongoing
node of the controller is active, the activity in question has already started and the
next interesting transition is to the termination, suspension, or iteration of the underlying
activity. While it is quite easy to argue that the controller abstraction is a useful one
for process control and planning,2 it also provides an elegant and motivated basis to ground
aspectual interpretation.
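The idea that the control graph constrains execution trajectories can be sketched as a small state machine. The state and signal names are taken from the figure labels, but the exact set of arcs is an assumption made for illustration, since the figure itself is not reproduced here; the names `CONTROLLER`, `step`, and `run` are hypothetical.

```python
# Assumed arcs, for illustration only: state -> {signal: next state}.
CONTROLLER = {
    "enabled":   {"prepare": "ready"},
    "ready":     {"start": "ongoing", "cancel": "canceled"},
    "ongoing":   {"finish": "done", "interrupt": "suspended", "iterate": "ongoing"},
    "suspended": {"resume": "ongoing"},
    "done":      {},
    "canceled":  {},
}


def step(state, signal):
    """Follow one directed arc of the control graph, or reject the signal."""
    try:
        return CONTROLLER[state][signal]
    except KeyError:
        raise ValueError(f"signal {signal!r} not allowed in state {state!r}")


def run(signals, state="enabled"):
    """Drive the controller through a sequence of signals and return the final state."""
    for signal in signals:
        state = step(state, signal)
    return state


print(run(["prepare", "start", "interrupt", "resume", "finish"]))
```

From the ongoing state only termination, suspension, or iteration are reachable, which mirrors the observation in the text.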
Figure 5.2 shows the x-schema version of the controller, which corresponds to
the depiction in Figure 5.3. Note that the hexagons indicating different transitions indicate
2
In fact one of the reviewers of (Narayanan 1997) informed us that the controller was very similar to
structures used in Behavior-based Mobile Robotics.
CHAPTER 5. A COMPUTATIONAL MODEL OF VERBAL ASPECT 99
[Figure 5.2: the controller x-schema, with nodes Enabled, Prepare, Ready, Start, Ongoing, Finish?, Done, Iterate?, Interrupt?, Cancel, and Canceled.]
that any of these can be composed of sub-schemas (limited depth). In general, however, it
is usually only the start-ongoing-finish? process (shown by the enclosing hexagon) that is
decomposed further. But of course, this is not always the case, since we can have preparatory
processes, starting processes, ending processes, stopping processes, etc. One thing to note in
Figure 5.2 is that whether an ongoing process is finished or interrupted or iterated depends
on other parameters supplied by the process features of the individual x-schema in question
or by specific world state or linguistic input.3
[Figure 5.3: the WALK x-schema redrawn with the CONTROLLER x-schema (arcs iterate, resume, interrupt). Labels in the walk schema: visual coordination (VISUAL_OK), test footing, take step, move test foot until dest, AT DEST, POSTURE STANDING, READY, ENERGY=LOW, STEP sub-schema.]
for grounding verbal aspect. The process primitives characterize the inherent semantics
of the verb. Any individual verb may have some or all of these parameters set to specific
values. These parameters, inspired by sensory motor primitives, generalize Talmy's aspectual
primitives (Talmy 1985) and can easily derive the Vendler-Taylor-Dowty classification VDT
(Dowty 1979). For the curious, peeking ahead a little bit, it should be fairly obvious that the
VDT event/state distinction as well as the punctual/durative distinction are fairly easy to
capture in x-schemas, as is the notion of goal and, more specific to our model, the notion
of dynamic resources. This allows us to model VDT accomplishments as durative
x-schemas with goals or some other well defined resource constraint that governs the firing of the
finish transition. Achievements are goal-based punctual x-schemas. The important thing to
note is that both the controller and the process primitives that characterize the underlying
verb are x-schemas. In this way, the semantics of the verb is grounded in the execution
of the action itself. Figure 5.3 shows the same schema as the one in Figure 3.7 but now
redrawn to include the controller abstraction.
Linguistic devices specifying phasal aspect (lexical items, morphological modifiers
and other grammatical devices) are like knobs which when set activate the corresponding
controller node, sanctioning which inferences can be made by the hearer, given the
same underlying schema (verb phrase). Languages may differ in which knob settings they
allow, and hence may vary in which aspects and how much bandwidth they allow the speaker.
In Figure 5.3, we see how the English Perfect construction activates the result or
consequent state of the underlying verb phrase (x-schema). The x-schema is the familiar
walk x-schema now recoded as the dynamic interaction of two x-schemas, namely the local
controller abstraction and the underlying process of walking. Thus different walking
stages are now represented as nodes in the controller graph, and linguistic devices such as
phasal aspect specifiers can activate any of these nodes. Node activation then changes the
execution state of the underlying x-schema, resulting in a modified execution trajectory. For
instance, in the Perfect aspect in the example above, what is relevant are the characteristics
of the process that bind to the result stage of the controller. These bindings are the
consequences of the action/event. Propagation of these bindings and the general inferential
mechanism is described in Chapter 6. In the case of Jack has walked to the store, this implies
that the speaker directs the hearer's attention to the fact that Jack is possibly at the store,
a little tired, ready to do his shopping. We note that use as a phasal aspect is just one way
in which the Perfect is used in English. In general, Perfect can be used in conjunction with
tense to specify the relation between a past and current situation. The work described here
does not model tense, and is unable to handle these uses of Perfect. For ongoing work that
addresses this state of affairs, refer to Section 5.6.1 and Section 5.6.2.
In contrast to the Perfect, the simple past Jack walked to the store does not say
anything about whether any of the consequences of being at the store still hold. All it says
is that at some time in the past, Jack was at the store. Jack is walking to the store, on the
other hand, activates the process node of the controller, thus calling the hearer's attention to
facts such as Jack is not at the store, but actively walking towards it, expending energy, etc.
Constructions such as the English about to V or ready to V or the Hindi thayar pick out the
state of readiness to engage in an activity. In our model this is the controller state ready,
which in the walk x-schema corresponds to the visual test having returned ok (recall from
Chapter 3) and the posture in a state of readiness (standing $\wedge$ stable).
Thus, meaning arises from the dynamic binding of a specific activation state of
the controller x-schema to a specific activation state of the verb-complex x-schema. The
structure of the controller and the relevant set of features that characterize the verb-complex
jointly control the compositional possibilities.
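The "knob" view of phasal aspect can be sketched as a lookup from an aspectual construction to a controller node, followed by retrieval of whatever the underlying schema binds to that node. The dictionary and function names are hypothetical, and the bindings listed for walking to the store are only the examples mentioned in the text, not the output of the implemented model.

```python
# Which controller node each construction activates (from the discussion above).
ASPECT_TO_NODE = {
    "perfect": "result",        # has + V-ed
    "progressive": "process",   # be + V-ing
    "about_to": "ready",        # about to V / ready to V
}

# Illustrative bindings of the walk-to-the-store schema to controller nodes.
WALK_TO_STORE = {
    "ready": ["visual test ok", "posture standing and stable"],
    "process": ["not at the store", "walking towards the store", "expending energy"],
    "result": ["possibly at the store", "a little tired"],
}


def interpret(aspect, schema_bindings):
    """Return the facts brought into focus by an aspectual construction."""
    return schema_bindings[ASPECT_TO_NODE[aspect]]


print(interpret("perfect", WALK_TO_STORE))
print(interpret("progressive", WALK_TO_STORE))
```

The same schema yields different inferences depending only on which controller node is activated, which is the compositional point made above.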
Duration
(Hobbs 1985; Nakhimovsky 1988) and others note that situations referred
to by verb phrases have durations that fall into specific time scales. The table below
shows some typical situations along with the associated time scales.
Situation              Duration
walk                   minutes
run                    minutes
cough                  second
read a book            hours
read War and Peace     days
sleep                  hours
Note that the precision with which we reason about the durations of activities seems to
cut the continuous time line into discrete time scales. Note also that the default time
scale can be overridden by specific information. For instance, the default duration of
reading a book can be overridden if we know the book to be of a specific size.
In our case, the x-schema model of durations in Figure 5.4 can certainly be modified to
allow only discrete time scales. However the tricky part comes from the fact that the
inherent time scale of an activity is not a fixed, static entity. In fact we will soon see
how linguistic devices explicitly allow perspectival shifts by dynamic transformation
of time scales. We know of no computational model of aspect that is able to model
this phenomenon. Our model is outlined in Section 5.4.2.
Periodicity
Talmy (Talmy 1985) characterizes verbal aspect as the pattern of distribution of
action through time. His characterization classifies verbs based on the distributivity
properties of the semantics of verbs. The following classification scheme is a
generalization of the scheme of Talmy. The set of verbs can be classified in the following
categories.
1. Certain predicates refer to processes that are inherently Aperiodic. Such
processes may be due to some irreversible terminal condition obtaining (as in the
case of the verb die), or may be due to some world state that has to be reversed
before the process can be initiated again (as in the case of fall).
2. Certain predicates refer to processes that are inherently Periodic. At the end of
each period, the conditions that obtain in the world are such that the process is
re-enabled. The default periodicity of the processes may be 1 (as in the case of
flash or hit) or > 1 (as in breathe).
3. Certain predicates refer to processes that are composed of a specific sequence of
other activities. Consider, for instance, the predicate waltz. This minimally
[Figure: x-schema models of process parameters. Recoverable panel labels: a) Durative (Start, Finish), b) Resource consuming (Start, Finish), e) Effort (Intent), f) Goal Based Enabling, i) Resettable.]
Before proceeding further, we need one abstract parameter that is derivable from
the parameters mentioned earlier and the context. This parameter pertains to the relation-
ship of the time scale of an activity to the time scale of reasoning.
In the first case, a much larger deviation from 5 minutes (a minute) can be tolerated
than in the second (a minute cannot be tolerated). Time scales set up a way to partition
verbs into aspectual classes. Using natural time scales and the relation much greater
than ($\gg$), I propose an automatic method to construct a temporal abstraction hierarchy
to control the precision of reasoning. Section 5.8 specifies our formal model of reasoning in
multiple time scales.
The temporal abstraction hierarchy deals with both periodic and aperiodic
processes. Agents may reason at different levels of the temporal hierarchy and consequently
arrive at different conclusions about the same situation. Our shared experience of the world
allows the speaker to assume that the hearer will associate default values for the time scale
of reasoning based on natural units of the hierarchy.
The basic idea is that if we are interested in a certain time scale $s_i$, we consider as
Processes the activities that happen on that time scale ($p_{s_i}()$). Activities on much smaller
time scales are assumed to be instantaneous and qualify as Events. Activities on much
larger scales are assumed to be unchanging over the episode of reasoning and can be referred
to as Unchanging States. To discuss how the scales of reasoning affect the interpretation,
consider the following examples
Clearly in (1), most people would interpret the situation as referring to a single
instance of playing tennis, whereas in (2) the interpretation would be one which refers to
several instances of playing tennis (similar to the habitual interpretation I used to play
tennis). Why are the two interpretations different?
The only difference between (1) and (2) is in the reference time scale. The activity
is the same, and the Aspect Marker (the Progressive -ing) is present in both cases. The time
scale of the activity of playing tennis ($s_i$) is minutes. The reference time scale in (1) ($s_{R1}$)
is hours, and in (2) $s_{R2}$ is years. Hence $s_{R2} \gg s_i$, whereas $s_{R1}$ is not much greater than $s_i$. Hence, Equations 5.1 -
5.6 (Section 5.8) suggest that (1) be interpreted as a single process, and (2) as an extended
sequence of several individual tennis playing events.4
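The effect of the reference time scale on interpretation can be sketched as follows. The ordered list of scales and the rule that "much greater than" means at least two steps apart in that list are assumptions made purely for illustration; the formal definition belongs to Section 5.8 and is not reproduced here. All names are hypothetical.

```python
# Assumed ordering of natural time scales, for illustration only.
SCALES = ["seconds", "minutes", "hours", "days", "months", "years"]


def much_greater(a, b, gap=2):
    """Assumed stand-in for the 'much greater than' relation between time scales."""
    return SCALES.index(a) - SCALES.index(b) >= gap


def view(activity_scale, reference_scale):
    """How an activity is viewed when reasoning at the given reference scale."""
    if much_greater(reference_scale, activity_scale):
        return "event"              # effectively instantaneous at this scale
    if much_greater(activity_scale, reference_scale):
        return "unchanging state"   # does not change over the episode of reasoning
    return "process"


# Playing tennis (minutes) against reference scales of hours and of years.
print(view("minutes", "hours"))
print(view("minutes", "years"))
```

Under these assumptions the first case comes out as a single process, and the second as an event that can then be iterated, matching the two readings discussed above.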
Reasoning in multiple time scales is a very useful abstraction to make since it
cuts down unnecessary cognitive processing. It also turns out to explain cross-linguistic
data. In fact, in languages such as Tamil and Spanish, there is no Habitual aspect, and so
disambiguation of sentences such as the ones above can only be done using a scheme like
the one suggested here.
4
This is true even in combination with temporal adverbials. For example, compare the following two sen-
tences This morning, when McEnroe was playing Tennis, he sprained his ankle and In 1991, when McEnroe
was playing Tennis, he sprained his ankle.
1. States.
2. Punctual Activities.
3. Durative Activities.
4. Accomplishments.
5. Achievements.
Though the exact definitions vary across the different proposals, the main
distinctions between these classes are made in terms of the following three parameters
The table below shows how the various parameter settings constitute the various
classes.
Aspectual Class        Parameter Setting
States                 [-d]
Punctual Activities    [+a, -t, +d]
Durative Activities    [-a, -t, +d]
Accomplishments        [-a, +t, +d]
Achievements           [+a, +t, +d]
Henceforth, the various classes mentioned above as well as the associated param-
eter settings will be referred to as the Vendler-Dowty-Taylor (VDT) classication scheme.
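The parameter table can be read as a small decision procedure. The sketch below keeps the letters a, t, and d from the table; reading them as atomic, telic, and dynamic is an assumption based on the later discussion, and the function name is hypothetical.

```python
def vdt_class(d, a=None, t=None):
    """Map a parameter setting [a, t, d] to its VDT aspectual class.

    d, a, t: booleans for +/- on the table's parameters (assumed to stand for
    dynamic, atomic, and telic). States only require d to be False.
    """
    if not d:
        return "State"
    if a is None or t is None:
        raise ValueError("a and t must be given when d is +")
    return {
        (True, False): "Punctual Activity",
        (False, False): "Durative Activity",
        (False, True): "Accomplishment",
        (True, True): "Achievement",
    }[(a, t)]


print(vdt_class(d=True, a=False, t=True))
print(vdt_class(d=False))
```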
VDT-based classification schemes are a subset of the process parameters that are
used in our more fine-grained representation. The distinction between states and events is
ubiquitous in linguistics (Langacker 1987), and (Michaelis 1992) argues persuasively for its
being a basic distinction respected by all languages. The bi-partite nature of the x-schema
verb model inherently captures this linguistic distinction in a direct and natural manner.
The issue of dividing activities into durative and punctual is also directly modeled in the
x-schema framework, which can represent transitions with real durations. Achievements
and accomplishments are both usually modeled as telic dynamic processes with a
well defined end point. The difference is that achievements are conceptualized in the VDT
schema as atomic actions (without duration or internal structure), whereas accomplishments
are telic, durative processes. In the x-schema based model, achievements are goal-enabled
transitions whereas accomplishments are goal-enabled processes. Goal-enabling is shown
in Figure 3.6 in Chapter 3.
While showing that we can derive the VDT scheme from sensory-motor abstractions
points toward the epistemological adequacy of our specific verb classification scheme,
several interesting issues arise from our x-schema based semantics that
have not been properly explained by other theories. In particular, we feel that the independent
motivation for our model as a representation of actions useful for monitoring and
control endows it with explanatory power that a descriptive listing of linguistic regularities
lacks.
States in the VDT system correspond to marking vectors of x-schemas. Recall that
a marking vector is a vector of the number and type of tokens in the various x-schema
places. Note that the x-schema state is distributed over the entire net. A given state
is thus a set (technically a multi-set) of marked places. In the simplest case, a state is
a single place. Entering the state involves firing some input transition (whose output
is the place in question). Thus firing the transition will mark the place (deposit a
token), signifying that the state holds. Once a state holds, the only way to leave the
state is to fire an output transition (one of whose inputs is the place in question).
Firing an output transition consumes the token, so the place is no longer marked and the
state ceases to hold. The downward sub-interval property of states (White 1991)
(which states (sic.) that if a state holds for a certain time interval T then it holds
for all sub-intervals ( ∀ ti ⊆ T )) follows trivially from the fact that markings (states)
persist until drained by a transition, so if a marking has been persistent for an interval
T it has been persistent ∀ ti ⊆ T.
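The token-game reading of a state can be illustrated with a very small place/transition net. This is only a sketch under simplifying assumptions (untyped tokens, no durations, no inhibitory arcs); the class `Net` and the place and transition names are hypothetical and are not taken from the implemented system.

```python
from collections import Counter


class Net:
    """Tiny place/transition net: a marking is a multiset of tokens per place."""

    def __init__(self, marking):
        self.marking = Counter(marking)
        self.transitions = {}  # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (tuple(inputs), tuple(outputs))

    def enabled(self, name):
        inputs, _ = self.transitions[name]
        need = Counter(inputs)
        return all(self.marking[p] >= n for p, n in need.items())

    def fire(self, name):
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for p in inputs:   # consume one token per input arc
            self.marking[p] -= 1
        for p in outputs:  # deposit one token per output arc
            self.marking[p] += 1

    def holds(self, place):
        """A state holds exactly while its place is marked."""
        return self.marking[place] > 0


net = Net({"outside": 1})
net.add_transition("enter", ["outside"], ["at_store"])
net.add_transition("leave", ["at_store"], ["outside"])

net.fire("enter")  # input transition marks the place
# No transition fires between these checks, so the state persists.
history = [net.holds("at_store") for _ in range(3)]
net.fire("leave")  # output transition drains the place
```

Because nothing but an output transition can remove the token, every check between `enter` and `leave` sees the state holding, which is the sub-interval property in miniature.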
Durative Activities in the VDT sense can be modeled by x-schemas with durative
transitions. But as we suggested earlier, whether an activity is durative depends on
two parameters, one of which is the intrinsic duration (or time scale), the other being
the reference time scale sR. Thus activities can be made durative if their intrinsic time
scale ( si ) is on the order of the reference time scale ( sR ), i.e. sR ≈ si. An example
is the well known slow motion movie example of coughing, which changes a normally
punctual activity into a durative one. A more interesting prediction of the model
(which needs further confirmation) is that if we increase the time scale sufficiently, we
may be able to view some slowly changing processes (treated canonically as states)
as durative activities as well. An example of this phenomenon is that on suitably
large time scales, California cannot be at a location since it is drifting toward Alaska.
But under normal time scales of reasoning, California is at some fixed longitude and
latitude.
Punctual Activities in the VDT system correspond to instantaneous x-schema
transitions. Again, as in the durative case, activities can be made punctual by
changing the time scale of reference sR to be much greater than the intrinsic time
scale of a durative process ( si ), i.e. sR ≫ si (as was shown in the playing tennis example
in Section 5.4.2).
Accomplishments in the VDT system correspond to x-schemas that are either goal-enabled
or have certain resource or world conditions that can fire the finish transition
of the controller. This allows for the solution of various problems that arise from
the [+d, +t, -a] classification of the VDT system. Specifically, the imperfective paradox
ceases to be a problem with our representation. Section 5.5.5 discusses this issue
in detail.
helps avoid some paradoxes with earlier accounts and explain some compositional profiles.
[Figure: the controller x-schema annotated with phasal aspects. Surviving labels: Imminent (ready, about-to, thayar (Hindi)); Inceptive (start, shuru (Hindi)); Iterative (iterate); Completive; resume. Lexical items may code for transitions (ex. stumble, fall interrupt the walk schema).]
with the underlying x-schema is sufficient to provide the required interpretation in real-time.
This becomes critical in modeling the interaction of phasal and inherent aspect. The next
few sections flesh out this assertion further.
initial activation (marking) to the controller. For instance, consider the situation faced by
our interpreter upon hearing the utterance Jack has walked into the store. Figure 5.3 shows
this situation graphically.5
An example is shown in Figure 5.3. The Perfect perspective that results from the
has walked to the store results in a specic activation to the result stage of the controller
x-schema resulting in a specic binding to the walk situation. Here the hearer is sanctioned
the inference that Jack is at the store and possibly tired. More importantly, using the
Perfect aspect the speaker sets the context for future discourse that the world conditions
and agent state at the time of description are the consequent state of having walked to the
store. The architecture to model such inferences is described in Chapter 6. Note that this
use of the Perfect is only one of several possible uses. To properly model other cases requires
the ability to model tense-aspect interactions, a subject not dealt with in this work. Some
idea of the issues involved can be found in Section 5.6.1.
Contrast this with the case of is walking to the store. A few steps of the simulation
output are shown in Example 7.
1 = started
2 = activate
3 = ready
4 = ongoing
5 = done
6 = VISUAL_OK
7 = STEP_READY
8 = VISUAL_NOTOK
5
For exposition purposes, the test-foot branch is not shown.
9 = BYPASS?
10 = Energy
11 = Dist(Store)
12 = At(Store)
Simulation:
Marking |  1  2  3  4  5  6  7  8  9 10 11 12 | Transitions to
=============================================================
M0      |  0  1  0  0  0  1  0  0  0 14 10  0 | ( M1 )
M1      |  0  0  1  0  0  1  0  0  0 14 10  0 | ( M2 )
M2      |  1  0  0  0  0  1  0  0  0 14 10  0 | ( M3 )
M3      |  0  0  0  1  0  1  0  0  1 14 10  0 | ( M4 )
M4      |  0  0  0  1  0  1  1  0  0 14 10  0 | ( M5 )
M5      |  0  0  1  1  0  1  0  0  0 13  9  0 | ( M6 )
M6      |  1  0  0  1  0  1  0  0  0 13  9  0 | ( M7 )
M7      |  0  0  0  2  0  1  0  0  1 13  9  0 | ( M8 )
M8      |  0  0  0  2  0  1  1  0  0 13  9  0 | ( M9 )
M9      |  0  0  1  2  0  1  0  0  0 12  8  0 | ( M10 )
M10     |  1  0  0  2  0  1  0  0  0 12  8  0 | ( M11 )
=============================================================
        |  1  2  3  4  5  6  7  8  9 10 11 12 |
[Figure: trace of the relevant portion of the Marking vector over the places ready, vis_ok, step, done]
primitives such as the inertial world primitive Inr (Dowty 1979) (set of predictable world
futures) to establish truth conditions that satisfy this test.
Figure 5.8 and Figure 5.7 graphically depict the relevant portion of the Marking
vector for the situations described by the two sentences. In our action-based x-schema
model, the difference comes from the constraint that in one case the finish transition can
fire, or equivalently the done state is reachable, if and only if the goal (reaching the store)
obtains. It is important to note that goal attainment can be asynchronously asserted by
a scheduled or unscheduled perceptual process (or can be time based (I saw Jack walking
to the store yesterday)), and the x-schema reacts appropriately. In the case of walking,
no such constraint exists and the result obtains after every two steps (taken from Dowty's
definition (Dowty 1979)).
Technically, the resolution of the paradox relies on whether the reachability graph
of the active x-schema contains the done state as a sub-marking, a question that can be
answered by activation propagation over the x-schema as shown in Chapter 3. Example 9
shows the case where the done state is readily obtained for the walking scenario, while
the simulation results in Example 8 show that the done state is not reachable from the
initial marking for the walking to the store scenario.
[Figure 5.8 trace over the places ready, vis_ok, step, at store, done]
Figure 5.8: Interpretation of walking to the store as a trace of the relevant portion of the
Marking vector
1 = started
2 = activate
3 = ready
4 = ongoing
5 = done
6 = VISUAL_OK
7 = STEP_READY
8 = VISUAL_NOTOK
9 = BYPASS?
10 = Energy
11 = Dist(Store)
12 = At(Store)
Reachability Simulation:
Example 8 shows the reachable states for is walking to the store. The key here
is that the done state is never marked until the destination is reached (time 40) in the
simulation6 (depends on distance, rate, etc.). More importantly, while the destination is
still to be reached, the process may be interrupted, stopped or resumed. For instance,
the narrative John was walking toward the store when he met his friend signifies process
interruption with an indeterminate future action progress. We believe that an x-schema
based dynamic model is essential to capture the right inferences of such an utterance.7
Contrast the simulation in Example 8 with that in Example 9, which shows the system
interpreting is walking. Here after an initial setup (to coordinate, initialize perceptual
tests, etc.), the done status can fire after every 2 steps (as defined). Thus the done state
is reachable after every two steps.
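The contrast between the two readings can be phrased as a reachability question over markings. The sketch below is a deliberately reduced stand-in for the x-schema simulation, not the implemented system: the function `done_reachable` is hypothetical, the marking is collapsed to three counters (steps taken, remaining distance, remaining energy), and each step is assumed to consume one unit of energy and one unit of distance, as the Energy and Dist(Store) columns suggest in the trace of Example 7.

```python
from collections import deque


def done_reachable(energy, dist=None, steps_per_walk=2):
    """Search the markings reachable by stepping; report whether finish can fire.

    dist=None models plain "walking" (finish is enabled after steps_per_walk steps).
    A numeric dist models "walking to the store" (finish is goal-enabled: dist == 0).
    """
    start = (0, dist, energy)
    seen, queue = {start}, deque([start])
    while queue:
        steps, d, e = queue.popleft()
        goal_met = (steps >= steps_per_walk) if d is None else (d == 0)
        if goal_met:
            return True
        if e > 0:  # the step transition needs an energy token
            nxt = (steps + 1, None if d is None else d - 1, e - 1)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

With the initial values of the trace (energy 14, distance 10) the goal-enabled finish is reachable, but with energy 5 it is not, whereas plain walking reaches done after two steps in both cases. This mirrors why "is walking" licenses "walked" while "is walking to the store" does not license "walked to the store".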
6
Note that the line in the middle of Example 8 signies that the simulation results for the intervening
period are not shown.
7 An interesting side-effect of the simulation is that the ongoing place is able to count steps (something
Example 9 Walking
Under the definition of walking where two steps constitute a walk (from Dowty
1979), we can see how our simulation shows that the done state is reachable after every two
steps, thus allowing the agent to infer that "is walking" implies "walked".
x-schema place to column mapping
1 = started
2 = activate
3 = ready
4 = ongoing
5 = done
6 = VISUAL_OK
7 = STEP_READY
8 = VISUAL_NOTOK
9 = BYPASS?
10 = Energy
Reachability Simulation:
From the discussion above, we conclude this section with the following points
about our representation and its avoidance of the imperfective paradox.
1. The imperfective paradox is a result of using a time based representation, where the
only primitives are world evolutions over temporal intervals or points. This gives
actions and agent-initiated (directed) world evolutions and other external events and
occurrences equal status, making it impossible to model the attendant issues of intention,
resources, and control. In this case, we are in concord with (Steedman 1995)
in noting that an action-based representation is an essential first step to avoid the
imperfective paradox.
2. The ability of our model to dynamically model the effects of resources and voluntary
or involuntary suspensions seems to be essential to capture the issues involved. For
instance, reachability is adversely affected in walking to the store not just in the
case when the store is not reached, but also as a result of an involuntary suspension
(became too tired or fell on the way) or a voluntary one (the required items had
been obtained in some other way, retracting the goal of being at the store). These
kinds of dynamic action-features are naturally modeled in a highly-responsive, active,
asynchronous action controller, aka x-schemas.
Importantly, our model has explanatory power which allows us to conclude that
an activity with a specific goal or telic constraint (ref. our modified definition of telicity
in Section 5.5.2) may not finish because of lack of energy or resources, or because of
suspension or stoppage due to an external or internally generated interrupt (voluntary or
involuntary suspension). An activity without such constraints may finish when a specific
control sequence is completed. This explanatory power can be brought to bear on the
acceptability of certain but sentences.8 For instance I was walking to the store but became
tired is acceptable but (sic.) I was walking to the store but the Mars Rover landed perfectly
is unacceptable. One of the well known uses of the lexical item "but" is to provide
information that contradicts the expected future trajectory of an action. So by asserting
resource unavailability (energy), tiredness contradicts the expected future trajectory of the
walking action and is acceptable. Mars Rover landings (in most contexts) do not, and the
8
Thanks to George Lakoff for pointing this out.
second sentence is thus unacceptable. Note that a rich action representation which models
resources, preconditions, goals, and interrupts is key to making judgments of acceptability.
In summary, our dynamic action model grounded in sensory-motor primitives
avoids/solves the Imperfective Paradox implicitly without resort to any unanalyzed primitives
such as Inertia Worlds. The important thing to note is that unlike previous efforts,
we needed no special language-specific meta-rules (such as type coercion networks). Our
solution is part of the normal execution of the model itself and a direct result of language
being grounded in action.
[Figure: controller x-schema with the surviving labels iterate, resume, terminal illness, health worsens, cease to exist]
rates ( )(ref. Chapter 3). While Figure 5.9 shows how we can model Comrie's reading
of the verb die without introducing new primitive Aktionsart classes, we can safely predict
that the work described in this thesis will not foreclose discussion on the meaning of death.
9
"die" should really be viewed as taking the interrupt-abort trajectory of a top level Stay Alive schema
in the same way that the concepts "trip" and "fall" refer to the interrupt-abort trajectory of the Walk
x-schema.
In our model, certain activities (those that are inherently iterative) preferentially
enable the iterate transition, rather than the finish transition. Thus in Figure 5.10, this
corresponds to executing the rub several times before finishing (how many may depend on
other resources such as energy, or the goal being satisfied).
The standard account of the meaning of this sentence is that coughing is a punctual
verb, and the only way it can appear in an Imperfective perspective (since the Imperfective
perspective requires a Process verb) is if the process is iterated. This is a result of the
inherent punctuality of cough.
The argument stated above does not work. For instance, if I change the time scale
of reasoning by showing a slow motion movie of a cough and saying Look he is coughing, I
could be pointing to the process of a single cough.
Instead of the kind of argument made above, we propose that the punctuality of a
verb is a derived entity that depends on the interaction between a Time Scale of reasoning
and the time scale of an activity. In most contexts, since the time scale of a cough is much
smaller than the reference time scale, a cough counts as an event and needs to be iterated
to form a process. However, upon changing time scales of interest, a cough can become a
process.
Thus while certain verbs encode the underlying action as punctual or durative,
specific context including a reference time-scale can explore the actual process or treat the
activity as punctual. Our experience of most activities seems to fall into the minutes time
scale, and so the default interpretation would classify activities on the seconds time scale
(like cough) as punctual.
The Multiple Scale Hypothesis can also elegantly explain when a durative verb
can be recast as a punctual verb. We saw an example of this modication in the earlier
section with the play tennis situation where a shift in the level of the temporal resolution
resulted in the shift in the Aspectual Class of the situation.
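The dependence of aspectual class on the ratio of the two time scales can be sketched numerically. This is an illustration only: the function `classify`, the cutoff `factor` standing in for "much greater than", and the concrete durations (a one-second cough, a ninety-minute tennis game) are all assumptions, since the text gives no numeric thresholds.

```python
def classify(intrinsic_s: float, reference_s: float, factor: float = 100.0) -> str:
    """Classify an activity by comparing its intrinsic time scale with the reference scale.

    `factor` is an assumed cutoff for "much greater than"; it is not from the text.
    """
    if reference_s >= factor * intrinsic_s:
        return "event"             # sR >> si: punctual at this resolution
    if intrinsic_s >= factor * reference_s:
        return "unchanging state"  # si >> sR: no visible change during reasoning
    return "process"               # sR on the order of si


MINUTE, HOUR, YEAR = 60.0, 3600.0, 365 * 24 * 3600.0
tennis = 90 * MINUTE  # assumed duration of one game
cough = 1.0           # assumed duration of one cough, in seconds

readings = {
    "tennis, hours": classify(tennis, 3 * HOUR),
    "tennis, years": classify(tennis, 5 * YEAR),
    "cough, minutes": classify(cough, 5 * MINUTE),
    "cough, slow motion": classify(cough, 1.0),
}
```

Under these assumed numbers the same tennis activity comes out as a process against a reference scale of hours and as an event (to be iterated) against a scale of years, and the cough becomes a process once the reference scale shrinks to its own duration.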
the two transformations involved. Being a graphical model, we can specify both information
losing and non-information losing net transformations structurally.
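[Figure 5.11: three nets, top to bottom. Top: Begin, Ongoing, End. ABSTRACTION (information losing) leads to the middle net, with arcs labeled in and out. NET TRANSFORMATION (no information loss) leads to the bottom net: Begin, End, Event.]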
The lack of subnet monitoring involves the abstraction transformation (we don't
monitor the detailed subschema execution anymore). Obviously this results in information
loss compared to the earlier situation (depicted on top of Figure 5.11). The second transformation
is possible because the marking at the ongoing node can never be visible, since it
instantaneously appears at the output. Hence no information is lost in the transformation
from the middle to the bottom of Figure 5.11.
This situation is depicted in the middle of Figure 5.11. When applied to the
controller, the result is shown in Figure 5.12.
Figure 5.12 shows one possible abstraction from the controller where the process
is not monitored, only starts and finishes are. In this case, through an information-losing
net transformation (Figure 5.11), we get a simplified controller that corresponds to
the perfective perspective present in many languages (Langacker 1987).
Once again, our process-based model has explanatory power. For instance,
the model predicts that a perfective perspective may allow iteration but not interruption
or cessation (since transitions are not interruptible). So in our model, the sentence John
swung at the ball but stopped would be marked, but John was swinging at the ball but
stopped would be fine. Similarly the iterated perspective John swung at the ball three times
is acceptable.
[Figure: the controller under the Imperfectivize and Perfectivize transformations; surviving labels: iterate, resume]
Note an interesting transformation occurs in going from the imperfective to the
perfective viewpoint. In the imperfective case, what started out as focus on a state (marking
the ongoing controller state) after the network transformations in Figure 5.11 ends up as
an event (transition) with a before state and an after state. This account is consistent
with and provides a computational model for the intuitions of (Michaelis 1992).
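The structural step that turns a focused state into an event can be sketched as a graph rewrite. The code below illustrates only the second, non-information-losing transformation: it removes an unobserved place by fusing the transition that feeds it with the transition that drains it. The function `fuse_through` and the dictionary encoding of a net are hypothetical, and the sketch assumes exactly one producer and one consumer for the removed place.

```python
def fuse_through(transitions, place, fused_name):
    """Remove `place` by fusing its single producer with its single consumer.

    `transitions` maps a transition name to (input places, output places), both sets.
    """
    producers = [n for n, (_, outs) in transitions.items() if place in outs]
    consumers = [n for n, (ins, _) in transitions.items() if place in ins]
    if len(producers) != 1 or len(consumers) != 1:
        raise ValueError("sketch handles exactly one producer and one consumer")
    p_in, p_out = transitions[producers[0]]
    c_in, c_out = transitions[consumers[0]]
    fused = (p_in | (c_in - {place}), (p_out - {place}) | c_out)
    rest = {n: t for n, t in transitions.items() if n not in (producers[0], consumers[0])}
    rest[fused_name] = fused
    return rest


# Imperfective view: Begin -> [start] -> Ongoing -> [finish] -> End
imperfective = {
    "start": ({"Begin"}, {"Ongoing"}),
    "finish": ({"Ongoing"}, {"End"}),
}
perfective = fuse_through(imperfective, "Ongoing", "Event")
```

After the rewrite the only remaining transition is `Event`, with `Begin` as its before state and `End` as its after state, so there is no longer an ongoing place at which an interrupt could take hold.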
have lost my keys) and the simple past (as in I lost my keys) comes from the fact that in the
Perfect case one can't follow the sentence with an explicit denial of the consequences (as in
I have found them again). The simple past makes no such claims. An interesting example
suggesting that the Perfect is not merely temporal comes from the example I have bathed10.
In normal contexts this implies that the consequences of taking a bath (I am clean, don't
immediately need another bath) hold. Note that the relevant time intervals where such
a statement can be made are entirely context dependent, since for me the interval would
span a day, while for a medieval monk, the interval might well be a year.
The perfect of result is only one of the uses of the Perfect. Other uses pertain to
the relationship of the current situation to some past situation. Consider cases like We have
lived here for ten years. Here, what seems to be asserted is that a situation that started in
the past continues to hold currently. The pluperfect (or past perfect) as in John had lost
his keys shifts the reference point back to the past, while the sentence still asserts that the
consequences of the described event (the keys were lost) held at the past reference time. In
both cases above the model described here has to be modied to include a representation
of tense. Initial steps in this direction look promising and are described in Section 5.6.2.
However, the general subject of tense is not dealt with in any depth in the current model.
The meaning of Perfect in some languages is complicated further since the specic
kind of temporal relation asserted varies. For instance, in English, use of the Perfect
construction can signal that the described event has occurred at least once in the past.
Such uses of the Perfect are called existential perfect (Comrie 1976)11. An example of
this is John has been to India, where the sentence asserts that John was in India at least
once (possibly more than once). Contrast this with John has gone to India which uses the
perfect of result described earlier. Only in the latter case do the consequences of going
to India (being there, vacationing) hold at the time of description. However, even in the
case of the existential perfect, contradicting the consequences seems to be disallowed. A
classic case of this phenomenon is McCawley's example (McCawley 1971) *Einstein has
been to Princeton which seems marked (unless the context is very special, like in a play
about Einstein). Here, we argue that the consequent state of the subject (Einstein) being in
Princeton is contradicted by our knowledge that Einstein is dead. This direct contradiction
10
I am assuming a definition of bath that includes taking a shower (bath = soaking in tub ∨ showering).
11
Many other languages (such as Chinese, Hindi) have different constructions for the existential perfect
and the perfect of result and make grammatical distinctions between the two types, while English doesn't.
renders the use of the Perfect infelicitous unless supported by a special context. Note that
our analysis would make no such prediction about Princeton has been visited by Einstein,
since Princeton as a subject still exists. In addition, in the case of the existential perfect,
one can specify a time-interval as in I have been to Germany after the Berlin wall came
down, or an exact number of times as in I have been to Africa three times. In all these
cases, exactly which consequences are relevant and cannot be contradicted by the use of
existential perfect seems highly context dependent and the work described here offers no
general theory to solve that problem.
The Perfect aspect is also used to signal "temporal closeness" of an event to the
time of description. For instance, in English, I can say I have been to the dentist this
morning while it is still morning. Exactly what interval sanctions the use of the Perfect is
language dependent. For example in English one doesn't say *I have been to the dentist this
morning, in the afternoon (since the morning is over) while Spanish allows the equivalent
expression. Note that while temporal closeness may allow the use of the Perfect, it is clearly
not a requirement as evidenced by the I have bathed example.
In both the existential perfect and the temporal closeness use of the Perfect, we
suspect that the inherent time-scales of activities, their periodicity, and the existence of
clear and enduring consequences are important determinants of when the Perfect can be
used and its meaning. However, working out a theory of these uses of Perfect awaits further
research.
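[Figure 5.13: eight panels in four rows and two columns, each showing the controller graph with the pointers E, R, and S projected onto the time line; some panels are annotated with "consequences". Panel sentences, left then right in each row:
Row 1: John had lost his keys. / John has lost his keys.
Row 2: John was walking to the store / John is walking to the store
Row 3: John had started walking to the store / John was starting to walk to the store
Row 4: John will be walking to the store / John will have walked to the store]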
The cases shown in Figure 5.13 pertain to the use of the Perfect and Progressive
aspects in conjunction with different tenses. The representation of tense is done using the
three pointers R, S, and E while the representation of aspect uses the model developed in
this chapter. Consider the case of the past perfect as in John had lost his keys shown in the
top-left corner of Figure 5.13. In all cases the dark downward arrow points to the speech
time S. Here the event refers to the entire composite event of losing keys. In all cases, E
refers to the projection onto the time-line of the entire controller graph. In this case,
the event is entirely in the past and E labels the composite event. The reference point
R, on the other hand, refers to some time where the consequences of the completed event
(keys are lost) still hold. This point is also in the past, indicated by R in the figure. The
projection of the reference pointer R onto the time line is shown by the shaded rectangle
(R spans the shaded interval) in the various situations depicted in Figure 5.13. The result
node is shown shaded to indicate that the inferences and bindings made from this stage
of the process hold for the time indicated by R. In contrast, the use of the present perfect,
shown in the top-right corner of the figure, is different in that the consequences of
the lose key situation hold at speech time. Hence R spans the speech time S. The rest is
the same as before.
The progressive construction refers to the ongoing node of the controller. Thus
the reference pointer R is placed somewhere on the projection onto the time line of the
ongoing process. The past-progressive shifts the event into the past, thus E is in the past,
and R refers to the ongoing phase of the past event. This is the situation shown in the middle
left figure, which models the tense-aspect specification for the sentence John was walking to
the store. In contrast, the sentence John is walking to the store has the speech and reference
time referring to the ongoing state of the Walk(to store) process. The aspectual inferences
are exactly the same, except the reference and speech times are different in the two cases.
The third row in Figure 5.13 shows how our proposal can model some more complicated
interactions between tense and aspect. For instance, different nodes in the controller
graph can compose with other tense-aspect combinations. For example, past perfect can
combine with the "starting" process, allowing the system to model John had started walking
to the store, shown in the left column of the third row. Note that in this case, the start
process has been completed and the consequences hold (during the reference interval, John
is actively walking toward the store, hasn't yet reached it, etc.). The reference point (R) is
bound to that state of the underlying x-schema. As in the earlier cases, the event and reference
point are in the past compared to the speech time (the past tense). In contrast, the situation
on the right column of the third row in Figure 5.13 models the sentence John is
starting to walk toward the store, which places the reference point on the ongoing node of
the start sub-schema (not shown) covering the time span shown shaded in the figure. Note
that other tense-aspect combinations such as John was ready to walk, or John was finishing his
homework can be similarly modeled.
Finally, the future progressive shown in the bottom-left of Figure 5.13 shows a
future E and an R that refers to an ongoing process. The future perfect, on the other hand,
refers to the consequences of a completed event in the future. This is the situation shown
in the bottom right corner of Figure 5.13.
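The way aspect and tense compose in these cases can be summarized as two independent choices: aspect picks the controller node that the reference pointer R is bound to, and tense places R relative to the speech time S. The sketch below encodes only that reading of the six basic cases discussed above. The names `Reading`, `ASPECT_NODE`, `TENSE_R_VS_S`, and `interpret` are hypothetical, and the coarse before/at/after labels are a simplification of the interval projections shown in the figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    node: str    # controller node that the reference pointer R is bound to
    r_vs_s: str  # position of R relative to the speech time S


# Aspect selects the controller node (as described for Perfect and Progressive).
ASPECT_NODE = {"perfect": "result", "progressive": "ongoing"}
# Tense places R relative to S (a coarse simplification of the time-line projection).
TENSE_R_VS_S = {"past": "before", "present": "at", "future": "after"}


def interpret(tense: str, aspect: str) -> Reading:
    """Compose a tense and an aspect into a binding for R."""
    return Reading(ASPECT_NODE[aspect], TENSE_R_VS_S[tense])
```

For instance, `interpret("past", "perfect")` corresponds to John had lost his keys (R on the result node, before S), and `interpret("present", "progressive")` to John is walking to the store (R on the ongoing node, at S). Nested cases such as reported speech are outside what this flat table can express.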
Thus the basic cases of Perfect and the phasal aspects seem to compose nicely
with tense in the manner indicated in Figure 5.13. However, the simple case of using three
pointers to model tense leaves much unspecified, since in general discourse the different
temporal relations are often nested, as in He said that he had arrived in India three weeks
earlier, where there is a current speech time (S), the speech time of the speaker in the
narrative (S'), the speech event of his saying (E), the event of his arrival (E'), etc. In an
ongoing project, Dan Gildea12 is attempting to use a recursive structure called tense trees
(Hwang & Schubert 1994) (based on an extension to Reichenbach's system) in conjunction
with the aspect model described here to capture some of the more complex embeddings in
a computational model.
5.7 Conclusion
The main focus of this chapter has been to provide evidence for the proposition
that the semantics of verbal aspect is grounded in primitives of sensory-motor control.
To this end, we outlined a novel computational representation and simulation, inspired by
well known perceptuo-motor process features, that captures interesting distinctions made
by aspectual expressions while avoiding some paradoxes in standard accounts. Recent work
has shown that the representation can deal with particle constructions (such as eat up, back
off) and temporal adverbials (Narayanan 1997; Chang 1997).
The active, dynamic, highly-responsive nature of x-schemas enables them to model
real-time, defeasible inferences. This novel feature of our model distinguishes it from previous
attempts to model aspect (Moens & Steedman 1988; Steedman 1995), and allows for a
natural solution to the attendant issues of context-sensitivity and inference. We note that
(Steedman 1995) proposes the use of dynamic logic to represent the semantics of tense and
aspect. In this context, we are exploring the connection between x-schemas and a dynamic
version of situation calculus. Other related work shows an equivalence between the multiplicative
fragment of linear logic (Girard 1987) and x-schemas. Chapter 10 details this
connection further.
Aperiodic Processes
The basic idea is that if we are interested in a certain time scale si, we consider as
Processes the activities that happen on that time scale ( p_{s_i}() ). Activities on much
smaller time scales are assumed to be instantaneous and qualify as Events. Activities
on much larger scales are assumed to be unchanging over the episode of reasoning and
can be referred to as Unchanging States.
Let p() be an Aperiodic process whose time scale (start to finish) is si, and let sR be a
reference time scale of reasoning. The following equations set up the natural hierarchy
for aperiodic processes.
Chapter 6
A Compositional Theory of
X-schemas
So far we have looked at single verb schemas and their interaction with the controller
x-schema. In this chapter, we are concerned with extending the representation
to be able to model entire domain theories. To this end, we have extended the representation
to allow for inter-x-schema activation and inhibition. We have found the reified
controller discussed in Chapter 5 useful for this purpose as well, since it drastically
decreases the combinatorial coupling possibilities and allows for an elegant
theory of x-schema composition.
The central idea behind our compositional theory is that transitions in the controller
graph set, get and modify values of the Agent's state (the Agent state f-struct).
This allows other x-schemas to be activated, inhibited, or their execution modified in some
manner. The model is able to exploit the dynamic, active, and highly-responsive nature
of x-schemas. The rationale for the highly compiled representation is obvious since quick
responsiveness, tight-coupling between action and reaction, built-in obstacle avoidance and
failure recovery strategies, etc. are required for survival while moving around. Our claim
is that this degree of reactivity is crucial for language processing, and that we can use the
same active sensory-motor representation for language understanding. The next chapter
describes our computational model for metaphoric interpretation, which crucially depends
on the quick inferences provided by x-schema execution.
CHAPTER 6. A COMPOSITIONAL THEORY OF X-SCHEMAS 139
The controller graph allows us to distinguish between cases of sequential and concurrent
x-schema triggering or inhibition. Additionally, we are able to model the case
where the execution of an x-schema is able to interrupt, terminate, or otherwise modify
the execution trajectory of another x-schema. Interestingly, it appears our design could
eventually lead us to a new, massively-parallel, fine-grained asynchronous production system
implementation.1 Here an individual x-schema (which we have used so far for representation
of actions and concepts labeled by verb phrases) now becomes a fine-grained production
with internal state (corresponding to the marking of the controller graph), which gets triggered
if the Agent State f-struct matches its triggering conditions. Whenever the executing
x-schema makes a control transition, it potentially modifies state, leading to asynchronous
and parallel triggering of other x-schemas. We believe such a system design supports a
broad notion of action, in that the same active representation can be used for motor control
and inference. As was explained in Chapter 3, our notion of state is completely distributed
over the entire network, so the working memory of an x-schema based production system is
distributed over the entire set of productions. However, detailed implementation, analysis
and conflict resolution in a general purpose x-schema based production system is definitely
future work and will not be pursued further here.
An important and novel aspect of our representation is that the same system is
able to respond to either direct sensory-motor input or other ways of setting the agent state
(such as linguistic devices). This allows for the same mechanism to perform simulative
reasoning and generate inferences from linguistic input as well as be used for high-level
control and reactive planning. We believe that this is an important aspect of embodiment
allowing the same mechanisms to reason as well as react. As we show in Section 6.3, the
graphical nature of our representation (x-schemas are bi-partite, cyclic graphs) allows us to
formally state and reason about inter-schema relations declaratively while using their real-
time execution capability for inference. This is a key property of our representation (that
it can be viewed as procedural or declarative) that makes it quite different from traditional
production systems. We believe this property to be essential for representations that are to
be used for both planning and reasoning about plan descriptions.
To motivate and illustrate our representation, we will use an example of embodied
motion described in Section 6.1. This example is sufficiently interesting and representative
to allow us to walk through the central concepts and mechanisms of our compositional
theory. We will then formally define the representational primitives of our theory and
compare it to related models that have appeared in the Robotics and Cognitive Science
literature.
1
As in many of these instances, the connection between x-schema composition through f-struct
modification and production systems is something that Jerry Feldman recognized a priori.
Example 11 During a Walk, and while taking a Step, if you encounter an unanticipated
bump, or if the ground is slippery, you become unsteady, which leads you to Trip. This
may lead to a fall unless you are able to simultaneously expend energy and Stabilize,
in which case you may resume the interrupted step. If you are unable to Stabilize, and
thus Fall, you will be supine and hurt. In order to start walking again you will have to
Get up and be standing again.
This example will be used in detail throughout the chapter to illustrate the issues
involved. To encode the example above, we need to be able to model each of the italicized
inter-schema (x-schemas are in uppercase) relations which are mediated by the state
variables in boldface above. In general, we need to be able to link individual verb x-
schemas together to yield a compiled x-schema representation of the embodied domain. We
first exhibit the mechanism for inter-schema relations by going through our model of this
example in detail. We will then be in a position to generalize and formally define the various
inter-schema interactions involved.
For simplicity, the example shown in Figures 6.1 through 6.11 uses only binary-valued world-state
variables and integer-valued resource variables (like energy). In the next chapter, we will see
how the use of typed tokens carries variable binding information about specific objects and
entities and allows us to distinguish different walking agents, etc. In Figure 6.1, the presence
of a token indicates that the relevant condition is true; a number inside a place signifies
an integer value, namely the number of tokens at that place. Note that in the example
shown, there are four schemas, namely walk (with a step subschema), fall, stabilize,
and getup. Note also that the trip x-schema is modeled as a simple transition.
[Figure 6.1: image not reproduced. Recoverable labels: WALK CONTROLLER X-SCHEMA and STEP
sub-schema (PREPARE, Ready, START, Ongoing, FIN, Done, INTERRUPT, RESUME, DISABLE, CANCEL),
GETUP, GOAL-ENABLE, OR; AGENT STATE variables VISUAL_OK(GR), BUMP, SLIPPERY(GR), HURT, SUPINE,
AT(DEST), GOTO(DEST), FORCE = HIGH; a link from the perceptual TEST_FOOTING x-schema and a link
enabling the SLIP and SLIDE x-schemas.]
Figure 6.1: The initial state and x-schemas modeling the domain theory shown in the
example.
Figure 6.1 shows the wiring structure and initial state. As explained in the last
section, specific resources, world-state and agent force-dynamic and emotional state may
trigger, inhibit, modify, or be produced as a result of a transition in the controller
x-schema. For instance, the trip transition results in loss of control of one's location (signified
by the consumption of a token from the control(loc) state variable), and occurs as a result
of some instability. Such instability may be produced by the presence of a bump causing
an ongoing step to be interrupted. Note that the agent has some reserves of energy in this
case, 2 units, signified by the integer 2 at the appropriate state variable. Initially, the
agent is assumed to be in control of his location, indicated by the presence of a token at the
control(loc) state variable, standing (token at standing), and stable (token at stable).
Figure 6.1 also shows the expansion of the ongoing place of the walk controller
to the associated subnet as was discussed in Chapter 3. Remember that the notation of
having an ongoing place is realized by hierarchical transitions representing the invocation
of a sub-schema.
The agent has a goal to be at a location DEST, and when that location is
reached, there will be a token at the state variable AT(DEST). The absence of such
a token, i.e. not being at a destination, is sufficient to enable methods of getting to that
destination. This is an instance of goal-based enabling, which basically allows for goals to
enable x-schemas if they are in the Agent's state. The link from the finish transition of a
walk automatically sets the goal state of being at the destination, and the walk x-schema
(and other x-schemas whose sole enabler was the AT(DEST) variable) are disabled.
Figure 6.2 shows the walk x-schema enabled as a result of the goal not being
satisfied. We adopt the convention that enabled transitions and their input (output) arcs are
shown in heavy boldface.
In the presence of a goal to GOTO(destination), if the walk x-schema is
not already enabled and if the agent is not already AT(destination), then the walk
x-schema becomes enabled. Once the destination is reached, i.e. a token is present at the
AT(DEST) node in the Agent's State, all x-schemas that became enabled due to the
GOTO(DEST) goal become disabled. Any residual token at an enable place can time out
through the disable transition. The relevant x-schema will not be enabled again until the
goal of reaching some destination is asserted and the agent is not at that destination.
In general, goals are a special type of state variable. Goal assertion is a method
to enable the relevant schemas. However, goals are also resources in that the successful
completion of the relevant schema consumes a token from the goal variable, thus retracting
the requirement to satisfy the goal. Thus, there is a resource link from the goal to the finish
transition of the relevant enabled x-schemas (omitted from the figure for the walk x-schema
for legibility).
[Figure 6.2: image not reproduced.]
Figure 6.2: Not being at the destination enables an unenabled walk x-schema.
Goal-enabling transitions are instantaneous and so transition enabling and token
transfer take place in a single step of the simulation. The bold lines show the enabled
transition, its input and output links and the enabling of the walk x-schema that continues
to be active until goal attainment.
An important aspect to note is that either goal retraction or goal attainment
(being at the destination) could be done by any specific enabled x-schema (or asynchronously
asserted by some higher level process such as specific linguistic input), and the system will
react appropriately by disabling all the associated x-schemas.
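The goal-based enabling just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the implemented model: the function `update_enabled`, the `METHODS` table and the tuple encoding of goals and facts are all hypothetical names introduced here.

```python
# Hypothetical sketch of goal-based enabling; all names are illustrative.

def update_enabled(goals, facts, enabled, methods):
    """Return the new (goals, enabled) sets after one instantaneous update.

    methods maps a goal, e.g. ("GOTO", "DEST"), to a pair
    (fact that satisfies the goal, names of x-schemas that can achieve it).
    """
    goals, enabled = set(goals), set(enabled)
    for goal, (satisfied_by, schemas) in methods.items():
        if satisfied_by in facts:
            goals.discard(goal)          # attainment consumes the goal token
        if goal in goals:
            enabled |= schemas           # goal asserted and not yet satisfied
        else:
            enabled -= schemas           # goal attained or retracted
    return goals, enabled


METHODS = {("GOTO", "DEST"): (("AT", "DEST"), {"walk"})}

# Goal asserted, agent not at the destination: walk becomes enabled.
goals1, enabled1 = update_enabled({("GOTO", "DEST")}, set(), set(), METHODS)
# Destination reached: the goal is consumed and walk is disabled.
goals2, enabled2 = update_enabled(goals1, {("AT", "DEST")}, enabled1, METHODS)
# Goal retracted by some other process before attainment: walk is disabled.
goals3, enabled3 = update_enabled(set(), set(), enabled1, METHODS)
```

The sketch treats attainment and retraction uniformly: in both cases the goal is no longer in the agent state, so every schema it enabled is disabled, whichever process caused the change.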
[Figure 6.3: image not reproduced.]
Figure 6.3: The walk schema is enabled, the perceptual test returns ok, and the walk is
ready.
Figure 6.3 shows the walk controller executing the preparatory steps for a walk.
The bold links specify either an input to a control transition that has the required number
and type of tokens, or an output link of an enabled transition which has received a token of
the appropriate type. So, in Figure 6.3, VISUAL_OK has returned an ok status, and thus
the link from that place to the prepare transition is shown in boldface. Also, since all the
preparatory steps are completed, the prepare transition is ready to execute the next step
function described in Chapter 2.
Once the preparatory stage has been completed, the walking starts. The start
transition signifies this change. Figure 6.4 shows the start of walking. Note that there is
still a token at the enable place (signifying that the walk x-schema is enabled) and the
start transition is enabled, indicating the start of the walking event. Recall that usually the
start transition is by itself an atomic event, but not always, since it may also have sub-schemas
indicating a starting subsequence. When the starting process is complete, the
token is transferred to the place labeled ongoing.
Once the walk is ongoing, individual steps are taken. The mechanism
is exactly as described in Chapter 3 and Chapter 5. The token in the
ongoing state of the walk remains until the destination is reached (or the walk stops for
some other reason). The token in the step sub-schema circulates and iterates every step.
This is the situation shown in Figure 6.5 and Figure 6.6.
At some point, suppose a bump is encountered. Then the presence of an ongoing Step
combined with the condition signifying the presence of a bump leads to the situation in
Figure 6.7.
The situation in Figure 6.7 pertains to the agent taking a step, shown by the
token in the ongoing controller node of the walk x-schema and of the subschema step.
Note that the presence of a bump enables the interrupt transition of the step, which
translates to interrupting the walk x-schema as well. The situation depicted shows
the presence of a bump during an ongoing step. Note that bump is a world condition
that has a bi-directional arrow to the transition that leads to interrupting a walk (and may
have other effects as well not shown in the example), implying that it is a pre-condition
to the transition. There may be many such world and agent conditions that may result
in interrupting the walk x-schema, including the ground being slippery, etc. Thus the
rounded-edge box implements the OR of its inputs. Also, we note that the transition to
interrupt the walk is instantaneous (others have a default duration of 1, unless otherwise
specified). Thus the transition enabling and token appearance are done in a single step of
the simulation.
Once the interrupt transition fires, the agent state is changed to reflect the
resulting instability. This is the situation depicted in Figure 6.8. The trip x-schema now
becomes enabled (it is shown as just a transition, with no internal structure to show
the different possibilities). The firing of this x-schema results in consuming a token from
the agent state corresponding to controlling one's location, indicating that the agent is no
longer in control of his location. Also, we note that the agent continues to be unstable after
tripping.
Tripping leaves the agent unstable, and acts as a goal to enable the stabilize
x-schema. Intentional schemas can be enabled by asserting the need to satisfy a goal. In this
case, removing stability results in the continuous enablement of the stabilize x-schema (of
course, if there are multiple ways to satisfy a goal, the corresponding x-schemas become
enabled as well).2
An important thing to note about our model of goal-enablement is that if some
other process retracts the goal, the corresponding x-schemas cease to be enabled, and may
time out based on the desired responsiveness and the details of the design.
2
In the current model, the default conflict resolution strategy is free choice (unbiased coin toss). At the
neural/connectionist level, the response is inherently graded and conflict resolution works by enabling
x-schemas to different degrees of activation and using winner-take-all networks to resolve multiple active
schemas. At the computational level, we hope to use the stochastic nature of the x-schema transitions to
implement conflict resolution strategies where the rate of transition firings is determined by sampling from an
exponential distribution. This way we can set stochastic priorities to transition firings. The implementation
details and analysis of such a scheme are not addressed in this thesis.
The enabled stabilize schema is unable to execute, since the agent does not have
sufficient energy or postural control. Hence the fall schema starts executing: the agent loses
control and, in the absence of any other interrupting source, becomes supine and hurt as
a result of the fall. This is the situation depicted in Figure 6.10.
As a result of the fall shown in Figure 6.10, the agent is not standing (¬standing)
anymore, which enables the getup schema. That schema cannot execute currently, since there are
insufficient energy resources. At this point some other energy producing source may be
triggered using the same goal-based enabling mechanisms discussed earlier, to be able to
execute the getup schema. But this is not modeled in the simple illustrative example.
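The walk-through of Example 11 can be replayed as a small token game. The following is a minimal sketch under stated assumptions, not the thesis implementation: the place names, the energy cost of 3 for stabilize and getup, the inhibitory arc on fall, and the fixed-priority choice among enabled transitions (standing in for the free-choice conflict resolution of the model) are all assumptions made for illustration.

```python
# Hypothetical place/transition sketch of Example 11; costs and names are assumed.

def is_enabled(marking, t):
    has_inputs = all(marking.get(p, 0) >= n for p, n in t["in"].items())
    not_inhibited = all(marking.get(p, 0) == 0 for p in t.get("inhibit", ()))
    return has_inputs and not_inhibited

def fire(marking, t):
    new = dict(marking)
    for p, n in t["in"].items():
        new[p] -= n                      # consume input tokens
    for p, n in t["out"].items():
        new[p] = new.get(p, 0) + n       # produce output tokens
    return new

TRANSITIONS = [
    # bump is read but not consumed (bi-directional arc)
    ("interrupt", {"in": {"ongoing_step": 1, "bump": 1, "stable": 1},
                   "out": {"bump": 1, "unstable": 1}}),
    # tripping consumes control of location; the agent stays unstable
    ("trip", {"in": {"unstable": 1, "control_loc": 1}, "out": {"unstable": 1}}),
    ("stabilize", {"in": {"unstable": 1, "energy": 3},
                   "out": {"stable": 1, "control_loc": 1}}),
    ("fall", {"in": {"unstable": 1, "standing": 1}, "inhibit": ["control_loc"],
              "out": {"supine": 1, "hurt": 1}}),
    ("getup", {"in": {"supine": 1, "energy": 3}, "out": {"standing": 1}}),
]

def run(marking, max_steps=20):
    trace = []
    for _ in range(max_steps):
        ready = [(name, t) for name, t in TRANSITIONS if is_enabled(marking, t)]
        if not ready:
            break
        name, t = ready[0]               # fixed priority stands in for conflict resolution
        marking = fire(marking, t)
        trace.append(name)
    return trace, marking

START = {"ongoing_step": 1, "bump": 1, "stable": 1, "standing": 1, "control_loc": 1}
trace_low, final_low = run({**START, "energy": 2})    # the 2 energy units of the example
trace_high, final_high = run({**START, "energy": 5})  # enough energy to stabilize
```

With 2 units of energy the run ends with the agent supine and hurt and with getup blocked for lack of energy, as in the text; with 5 units the stabilize transition fires instead of fall. Resuming the interrupted step is left out of the sketch.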
Activation can be partial in that some of the preconditions of the activated schema
are set as a result of executing the activating x-schema. If all the requisite preconditions
and resources of the activated x-schema are set, we say that executing the first x-schema
enables the second. Note that in contrast to other theories, we are able to distinguish
concurrent from sequential activation.
Inhibiting links prevent execution of the inhibited x-schema by setting an
inhibitory input place. Again, our model is able to distinguish between concurrent and
sequential inhibition, and to model mutual inhibition and aperiodicity.
Modifying relationships between x-schemas occur when the execution of the
modifying x-schema results in setting the Agent State in such a way that the currently active
modified x-schema undergoes a controller state transition. For instance, the execution of
the modifying x-schema could result in the interruption, termination, or resumption of the
modified x-schema.
Example 12 Some examples of inter x-schema links
Figure 6.12 shows some implemented inter x-schema links. The first three examples
are activation links, the following two examples are inhibitory links, and the final two examples
are modifying links between the x-schemas shown. In each case, the required resource
or enabling Agent State f-struct value is shown immediately below the link itself. For
example, we have seen how Tripping results in the agent losing control of his own location
(¬in_control(loc)), which leads to enabling a possible fall. This is the first example
below. All the other examples are self-explanatory. The interesting examples are the cases
of concurrent enabling and disabling and
dynamic resource-based enabling, as in the case where the production of a specific amount
of energy (Produce(energy, x)) enables the Walk x-schema (which requires the specified
amount to be able to start).
Definition 18 Activation
X activates Y (for all $(X, Y)$ with $X, Y \in SS$) if executing X marks some subset $p$ of either
pre-condition or resource places to the prepare transition $T(Y_{e+})$ of Y ($p \subseteq T(Y_{e+})$).
If the result state $P(X_r)$ of X marks $p$ ($p \subseteq P(X_r)$), then we say X sequentially
activates Y. If the process stage of X activates Y ($p \subseteq P(X_p)$), we say X concurrently
activates Y. X enables Y if $p = T(Y_{e+})$. Both activates and enables are non-transitive,
i.e. $activates(X, Y) \wedge activates(Y, Z) \not\Rightarrow activates(X, Z)$ (since Y may never
execute, or never complete execution). If $activates(X, Y) \wedge activates(Y, X)$, then we say X
and Y are Mutually enabling. For inherently iterative activities, $T(X_{e+}) \subseteq P(X_r)$.
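The distinctions drawn in the definition can be made concrete with sets of places. The sketch below is illustrative only: the class `XSchema`, the function `relations`, and the place names are hypothetical, and "enables" is read as "the marked places cover every input of the enabling transition", which is an assumption about the intended condition.

```python
# Hypothetical sketch: activation relations computed from sets of places.
from dataclasses import dataclass

@dataclass(frozen=True)
class XSchema:
    name: str
    enable_inputs: frozenset   # pre-condition/resource places of the enabling transition
    process_marks: frozenset   # places marked while the schema is ongoing
    result_marks: frozenset    # places marked when the schema completes

def relations(x, y):
    """Return the activation relations that hold from x to y."""
    found = set()
    if x.result_marks & y.enable_inputs:
        found.add("sequentially activates")
    if x.process_marks & y.enable_inputs:
        found.add("concurrently activates")
    if y.enable_inputs and (y.enable_inputs <= x.result_marks
                            or y.enable_inputs <= x.process_marks):
        found.add("enables")
    return found

trip = XSchema("trip", frozenset({"unstable"}), frozenset(),
               frozenset({"not_in_control"}))
fall = XSchema("fall", frozenset({"not_in_control"}), frozenset({"falling"}),
               frozenset({"supine", "hurt"}))
getup = XSchema("getup", frozenset({"supine", "energy"}), frozenset({"rising"}),
                frozenset({"standing"}))

trip_fall = relations(trip, fall)    # trip marks every input of fall
fall_getup = relations(fall, getup)  # partial: energy must come from elsewhere
trip_getup = relations(trip, getup)  # no direct link, so activation is not transitive
```

Here trip both activates and enables fall, fall only partially activates getup, and trip has no relation to getup even though a chain of activations connects them.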
Definition 19 Inhibition
Just as with activating relationships, the presence of a token in a specific inhibition
place can inhibit the triggering of an x-schema. Here we will be concerned mainly
with explicit inhibition through activation of some inhibitory place to an x-schema. Such
relationships will be called disabling relationships.
X disables Y (for all $(X, Y)$ with $X, Y \in SS$) if executing X marks some subset $p$ of
inhibitory places to the Enable transition $T(Y_{e-})$ of Y ($p \subseteq T(Y_{e-})$). If the result state
$P(X_r)$ of X marks $p$ ($p \subseteq P(X_r)$), then we say X sequentially disables Y. If the
process stage of X disables Y ($p \subseteq P(X_p)$), we say X concurrently disables Y.
If $disables(X, Y) \wedge disables(Y, X)$, then we say X and Y are Mutually disabling. For
inherently aperiodic activities (?), $T(X_{e-}) \subseteq P(X_r)$. Disables is non-transitive, i.e.
$disables(X, Y) \wedge disables(Y, Z) \not\Rightarrow disables(X, Z)$ (6.1)
Since Y cannot execute, X actually weakly activates Z.
[Figure 6.13: image not reproduced. Panels a) through d) each show the controllers of SCHEMA 1
and SCHEMA 2 (Enable, PREPARE, Ready, START, Ongoing, FIN, Done, INTERRUPT, RESUME, Suspend)
linked through AGENT STATE places labeled RES, PRE_COND and INH.]
Figure 6.13: A sample of the various types of activation relationships between x-schemas.
In the concurrent activation case, the link between schemas is an enable link. In this case,
once Schema 2 is enabled, the execution of Schemas 1 and 2 proceeds independently.
Note that in all cases only the controller state of the individual schemas is shown; the
reader is to assume that each of the schemas has internal structure not shown in the figure.
Definition 20 Modification
The following relations pertain to schema X modifying the execution of schema
Y. While in the implemented model only the result state of Schema X ($P(X_r)$) is
responsible for the modification, the links could obviously be from other stages as well (for
instance, starting X could interrupt an ongoing Y, etc.).
Theorem 21 Composition
X-schemas are closed under a finite number of compositions using the relations
defined earlier.
Proof (sketch): We use the methods of converting x-schemas to Petri nets in
Chapter 3 and the following well known result from Petri net theory (Peterson 1981).
[Figure 6.14: image not reproduced. Panels a) through d) each show the controllers of SCHEMA 1
and SCHEMA 2 (Enable, PREPARE, Ready, START, Ongoing, FIN, Done, INTERRUPT, RESUME, Suspend)
linked through AGENT STATE places labeled RES, PRE_COND and INH.]
Figure 6.14: A sample of the various types of inhibitory relationships between x-schemas.
In contrast to the case of activation, note that if Schema 2 is disabled during execution of
Schema 1 , it continues to be disabled until Schema 1 has completed execution or stopped
due to some other interrupt.
[Figure 6.15: image not reproduced. Panels a) through d) each show the controllers of SCHEMA 1
and SCHEMA 2 (Enable, PREPARE, Ready, START, Ongoing, FIN, Done, INTERRUPT, RESUME, Suspend)
linked through AGENT STATE places labeled RES, PRE_COND and INH.]
Figure 6.15: Relationships between executing x-schemas. The relationship asserted depends
both on the execution state (marking vector) of Schema 2 and on the structural linking
property. Only the sequential case is shown, where the modification occurs as a result of
executing the modifying schema to completion. Of course, for all the relationships shown,
one could model the concurrent modification case as well, where, once the modifying
schema executes, it enables or disables specific transitions of the modified schema by setting
the appropriate state variables.
Lemma 22 (Peterson 1981) Petri net languages are closed under a finite number of
applications, in any order, of the operations of union, intersection, reversal, concurrency, and
concatenation.
6.4 Conclusion
In this chapter we set up a computational framework for implementing embodied
theories. Our representation has two important and novel features. First, our compositional
theory retains the highly compiled, dynamic and active nature of x-schemas to implement
an active model of embodied and familiar domains. Second, we showed how the controller
x-schema abstraction, developed in Chapter 3 and used for interpreting expressions of
linguistic aspect, allows us to construct a general purpose theory of x-schema composition.
Both these features allow us to use a uniform mechanism for sensory-motor control and for
real-time simulative inference.
Of course, it remains to be seen why such a reactive, dynamic system representation
is crucial for metaphoric reasoning about international economic policies. But before getting
there, we need to explain the ontology and our representation of the abstract domain of
international economics since the implemented program processes simple stories from this
domain. The next chapter deals with this issue. Chapter 8 discusses the implemented
model and goes through the working of the program in detail. Both these chapters are
fairly technical and serve to document details of the implemented program. Readers who
are more interested in what the model can do and the cognitive relevance of the enterprise
described in this thesis can read the first three sections of Chapter 8 and directly jump to
Chapter 9, where we lay out the actual performance of the program on our database of
stories and discuss the wider relevance of our model.
Chapter 7
there is a long, well-known history of integrating such networks with decision nodes to produce
decision networks and influence diagrams. This allows our architecture to seamlessly
integrate with current computational models in Economic Policy design and analysis, and
an interesting future study would involve seriously considering this possibility.
We start by describing the theory of PINs and the basic algorithms that operate
on such structures to propagate evidence and return the most probable explanation for the
observed data. This section is mostly for completeness, and readers who already know
about Probabilistic Independence Networks can safely skip the section and go directly to
the section on the ontology of the target domain.
states, thus the values of a variable are mutually exclusive, but which state may be unknown
to us.
The most important aspect of PINs is their ability to exploit the independence
relationships between variables. This is a structural property of such networks and is thus
not dependent on the specific quantitative uncertainty combination method used. The key
structural property is called the property of d-separation. In this exposition I follow the
recent text by Jensen (Jensen 1996) closely in describing this property.
[Figure 7.1: image not reproduced. Panels: a) Serial Connections; b) Diverging Connections;
c) Converging Connections, drawn over the variables A, B, C and D.]
Figure 7.1: The three types of inter-variable connections. In a), if the variable B is
instantiated (when the state of B is known through observation as evidence, or otherwise
clamped), A and C become independent. In b) (the divergent case), if A is instantiated it
blocks communication between its children (B, C and D), and thus B, C and D become
independent if the state of A is known. In c), if nothing is known about D, except what
is known about D's parents A, B and C, then the parents are independent. However,
if there is some evidence that sets the state of D (or any of D's descendants), then the
parents are no longer independent.
Figure 7.1 illustrates the different kinds of structural relationships between connected
variables in a DAG. In general, the structural properties allow us to conclude the
following.
CHAPTER 7. A BELIEF NET MODEL OF INTERNATIONAL ECONOMICS 165
1. Evidence transmission is blocked in a serial chain when the state of the intermediate
variable is known. The variables on either side of the clamped variable become
independent when this happens.
2. Evidence transmission is blocked in a divergent connection when the parent variable's
state is known. The various sibling variables become independent when this happens.
3. Evidence transmission is blocked in a convergent connection until the state of the
connecting variable or that of any of its descendants is known. Until this situation
happens, the various parent variables are independent.
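The third rule (the converging connection) is the least intuitive, so a small worked example may help. The sketch below uses a hypothetical network, chosen only for illustration: A and B are independent fair coins and D is their logical OR. Exact enumeration shows that A and B are independent until D is observed.

```python
# Hypothetical converging connection A -> D <- B, evaluated by exact enumeration.
from fractions import Fraction
from itertools import product

def joint(a, b, d):
    # A and B are independent fair coins; D is deterministically (A or B).
    return Fraction(1, 4) if d == (a or b) else Fraction(0)

def prob(query, given=lambda a, b, d: True):
    """P(query | given), with both arguments predicates over (a, b, d)."""
    worlds = list(product((0, 1), repeat=3))
    den = sum(joint(*w) for w in worlds if given(*w))
    num = sum(joint(*w) for w in worlds if given(*w) and query(*w))
    return num / den

def a_is_1(a, b, d):
    return a == 1

p_a = prob(a_is_1)                                                # nothing known
p_a_given_b = prob(a_is_1, lambda a, b, d: b == 1)                # D unknown
p_a_given_d = prob(a_is_1, lambda a, b, d: d == 1)                # D known
p_a_given_d_b = prob(a_is_1, lambda a, b, d: d == 1 and b == 1)   # D and B known
```

With D unobserved, learning B leaves the belief in A unchanged. Once D is known, learning B changes the belief in A, so the parents are no longer independent.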
entered, then $P(A \mid B, e) = P(A \mid e)$. This allows us to use d-separation to read off conditional
independencies. This is used without further proof.
[Figure: image not reproduced. Recoverable labels: evidence $e^+$ and $e^-$, and nodes V, X, Y, Z.]
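The central idea in belief propagation is to split the conditioning evidence into
two parts, $e^+$ and $e^-$, and compute the influence of each separately. Using Bayes' rule we
get
$$
\begin{aligned}
P(x \mid e) &= P(x \mid e^+, e^-) \\
&= \frac{P(x, e^- \mid e^+)}{P(e^- \mid e^+)} \\
&= \frac{P(x \mid e^+)\, P(e^- \mid x, e^+)}{P(e^- \mid e^+)} \\
&= \frac{P(x \mid e^+)\, P(e^- \mid x)}{P(e^- \mid e^+)}
\end{aligned}
$$
The two factors in the numerator, $\pi(x) = P(x \mid e^+)$ and $\lambda(x) = P(e^- \mid x)$, are
also known as causal and diagnostic support, respectively (Pearl 1988). The
conditional belief then becomes
$$P(x \mid e) = \alpha\, \pi(x)\, \lambda(x) \qquad (7.4)$$
where $\alpha$ is the normalizing constant $\frac{1}{P(e^- \mid e^+)} = \frac{P(e^+)}{P(e)}$ (which need not be computed
explicitly). Note that $\alpha$ (and therefore $P(x \mid e)$) is only defined if $P(e) \neq 0$, as expected.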
The $\pi$'s and $\lambda$'s can be computed by passing messages between parent and child
nodes. This is because of the following recursive relationships:
$$\lambda(x) = \sum_y \lambda(y) P(y \mid x) \sum_z \lambda(z) P(z \mid x) \qquad (7.5)$$
$$\pi(x) = \sum_u \pi(u) P(x \mid u) \sum_v \lambda(v) P(v \mid u) \qquad (7.6)$$
$\lambda$'s on leaf nodes are initialized as follows: if a node X is instantiated (has known
value) $x$, then $\lambda(x) = 1$ and 0 for all other values. If no value is given for X, $\lambda(x) = 1$
for all $x$. The root node is initialized with $\pi$'s representing the prior distribution on its
variable. (Evidence on non-leaf nodes can be incorporated by creating an extra leaf child
node on the non-leaf.)
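As a numerical check of this message-passing scheme, the sketch below computes the causal support (pi), the diagnostic support (lambda) and the resulting belief for a node X in a small tree, and compares the result with brute-force enumeration. The tree shape (parent U with children X and V; X with children Y and Z), all probability tables and the evidence (Y = 1, V = 0) are assumptions invented for the example.

```python
# Hypothetical binary tree U -> {X, V}, X -> {Y, Z}; all numbers are made up.
from itertools import product

prior_u = [0.6, 0.4]                  # P(u)
p_x_u = [[0.7, 0.3], [0.2, 0.8]]      # p_x_u[u][x] = P(x | u)
p_v_u = [[0.9, 0.1], [0.5, 0.5]]      # p_v_u[u][v] = P(v | u)
p_y_x = [[0.8, 0.2], [0.3, 0.7]]      # p_y_x[x][y] = P(y | x)
p_z_x = [[0.6, 0.4], [0.1, 0.9]]      # p_z_x[x][z] = P(z | x)

lam_y = [0.0, 1.0]   # leaf Y instantiated to 1
lam_z = [1.0, 1.0]   # leaf Z not observed
lam_v = [1.0, 0.0]   # leaf V instantiated to 0

# Diagnostic support: product over the children Y and Z of X.
lam_x = [sum(lam_y[y] * p_y_x[x][y] for y in (0, 1)) *
         sum(lam_z[z] * p_z_x[x][z] for z in (0, 1)) for x in (0, 1)]

# Causal support: message from the parent U, which folds in the sibling V.
# It is left unnormalized; the constant is absorbed into alpha.
pi_x = [sum(prior_u[u] * p_x_u[u][x] *
            sum(lam_v[v] * p_v_u[u][v] for v in (0, 1)) for u in (0, 1))
        for x in (0, 1)]

unnormalized = [pi_x[x] * lam_x[x] for x in (0, 1)]
belief = [value / sum(unnormalized) for value in unnormalized]

def brute_force():
    """P(x | Y=1, V=0) by summing the full joint over U and Z."""
    scores = [0.0, 0.0]
    for u, x, z in product((0, 1), repeat=3):
        scores[x] += (prior_u[u] * p_x_u[u][x] * p_v_u[u][0]
                      * p_y_x[x][1] * p_z_x[x][z])
    total = sum(scores)
    return [s / total for s in scores]

check = brute_force()
```

The two results agree, which illustrates that the local products of messages reproduce the exact posterior on a tree without ever building the full joint table.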
Definition 26 Junction Trees (Jensen 1996) A junction tree JBN over the Bayesian
network BN is a tree whose nodes are clusters of variables from BN, which are cliques in
BN. Each link between any two nodes U and V in JBN is labeled with a separator,
which is the intersection of the adjacent cliques. Each clique and separator holds a
real-numbered table over the configurations of its variable set.
The junction tree property is said to hold if, for each pair of nodes U, V
in JBN, all paths between nodes U and V contain the intersection
$V \cap U$.
A junction tree is said to represent the Bayesian network BN if
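[Figure 7.3: image not reproduced. Recoverable labels: two networks over the variables A, B, C
and D; cliques ABC and ABD with separator AB for the first, and cliques ABC and BCD with
separator BC for the second.]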
1. Construct the moral graph: the undirected graph with a link between all variables in
$pa(A) \cup \{A\}$ for all A. This step transforms the DAGs in Figure 7.3 a)
into the graphs in Figure 7.3 b).
2. Triangulate the moral graph: add links until all cycles consisting of more than three links
have a chord. Both moralized graphs in Figure 7.3 are already triangulated.
3. The nodes of the junction tree are the cliques of the triangulated graph.
4. Connect the cliques of the triangulated graph with links such that a junction tree is
constructed. The junction trees for the two graphs are shown in Figure 7.3 c).
5. Initialize the cliques and separators with tables consisting of only 1's. Then, for
each variable A, find a clique containing $pa(A) \cup \{A\}$, and multiply $P(A \mid pa(A))$ onto
its table. The resulting junction tree represents BN.
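Steps 1 and 5 of the construction can be sketched in a few lines. The four-node DAG below (A is a parent of B and C, which are both parents of D) and its hand-listed cliques are assumptions for illustration only; triangulation and clique finding (steps 2 to 4) are not implemented here, since this moral graph happens to be triangulated already.

```python
from itertools import combinations


def moralize(parents):
    """Step 1: link every pair of variables inside each family pa(A) + {A}."""
    edges = set()
    for child, pa in parents.items():
        family = list(pa) + [child]
        for a, b in combinations(family, 2):
            edges.add(frozenset((a, b)))
    return edges


def assign_families(parents, cliques):
    """Step 5 (first half): pick, for each variable, a clique containing its family."""
    home = {}
    for child, pa in parents.items():
        family = set(pa) | {child}
        home[child] = next(c for c in cliques if family <= set(c))
    return home


# Hypothetical DAG: A -> B, A -> C, B -> D, C -> D.
dag = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
moral_edges = moralize(dag)

# Cliques of the (already triangulated) moral graph, listed by hand.
cliques = [("A", "B", "C"), ("B", "C", "D")]
home = assign_families(dag, cliques)
```

Moralization adds the link B-C because B and C share the child D; the conditional table P(D | B, C) is then multiplied onto the clique BCD, and the remaining tables onto ABC.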
$P(w^{*} \mid e) = \max_{w} P(w \mid e)$ (7.7)
With a slight modification, the JLO propagation algorithm described earlier can
be used to compute the MPE for the evidence at hand through local message passing on
the junction tree. The basic modification is that in the absorption phase, the $\sum$ is replaced
by a max operation. The new equations are
$t^{*}_{S} = \max_{W \setminus S} t_{W}$
$t^{*}_{V} = t_{V}\, \frac{t^{*}_{S}}{t_{S}}$
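A minimal sketch of a single max-absorption step between two adjacent cliques is given below. The dictionary-based table layout, the function name `max_absorb`, and the example numbers are assumptions made for illustration; they do not reproduce the IDEAL implementation.

```python
def max_absorb(t_w, w_vars, t_v, v_vars, t_s, s_vars):
    """Clique V absorbs from clique W through separator S, using max instead of sum.

    Tables are dicts keyed by tuples of values, ordered as in the *_vars lists.
    Returns the updated tables (t_v_star, t_s_star).
    """
    s_in_w = [w_vars.index(name) for name in s_vars]
    s_in_v = [v_vars.index(name) for name in s_vars]

    # t*_S: maximize t_W over the variables of W that are not in S.
    t_s_star = {}
    for config, value in t_w.items():
        key = tuple(config[i] for i in s_in_w)
        t_s_star[key] = max(t_s_star.get(key, 0.0), value)

    # t*_V = t_V * t*_S / t_S (a zero separator entry is treated as ratio 0).
    t_v_star = {}
    for config, value in t_v.items():
        key = tuple(config[i] for i in s_in_v)
        ratio = t_s_star[key] / t_s[key] if t_s[key] != 0 else 0.0
        t_v_star[config] = value * ratio
    return t_v_star, t_s_star


# Hypothetical chain A -> B -> C with cliques W = (A, B), V = (B, C), separator S = (B).
t_w = {(0, 0): 0.42, (0, 1): 0.18, (1, 0): 0.08, (1, 1): 0.32}   # P(a) P(b | a)
t_v = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}       # P(c | b)
t_s = {(0,): 1.0, (1,): 1.0}                                     # initialized to 1's

t_v_star, t_s_star = max_absorb(t_w, ["A", "B"], t_v, ["B", "C"], t_s, ["B"])
```

After the step, the largest entry of `t_v_star` equals the probability of the single most probable joint configuration of A, B and C in this toy chain, which is what the MPE computation needs.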
tions of the world described by the belief networks. The basic algorithm begins by choosing
values for the root variables (the prior slice) based on the prior values. We continue to
choose values for child variables based on the conditional probability distribution, until we hit
an evidence node. We count the number of times both the variable of interest and the
evidence node took their instantiated values and divide this by the number of times that
the evidence node took its instantiated value. In the limit, this number approaches the true
probability of the node of interest given the evidence.
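The sampling scheme can be sketched as follows, assuming the usual estimate in which only samples that agree with the evidence are counted in the denominator. The two-node network (a policy variable with a goal variable as its child), its value names, and its probabilities are hypothetical.

```python
import random


def sample_once(rng):
    """Forward-sample a hypothetical two-node net: Policy -> Goal."""
    policy = "liberalization" if rng.random() < 0.3 else "protectionist"
    p_free_trade = 0.9 if policy == "liberalization" else 0.1
    goal = "free-trade" if rng.random() < p_free_trade else "restrict-trade"
    return policy, goal


def estimate_policy_given_goal(n_samples, seed=0):
    """Estimate P(Policy = liberalization | Goal = free-trade) by simulation."""
    rng = random.Random(seed)
    both = 0        # samples matching the evidence AND the value of interest
    evidence = 0    # samples matching the evidence
    for _ in range(n_samples):
        policy, goal = sample_once(rng)
        if goal == "free-trade":
            evidence += 1
            if policy == "liberalization":
                both += 1
    return both / evidence
```

For these made-up numbers the exact answer is 0.27 / 0.34, and the estimate approaches it as the number of samples grows.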
The MPE and Stochastic Simulation algorithms were used to update the temporal
belief net corresponding to the reader's knowledge of international economics. The MPE
and BELIEF UPDATE algorithms used in this thesis were adapted from the public
domain Belief Network package IDEAL (Srinivas & Breese 1990).
With this background, we are ready to see how the knowledge about international
economic discourse is encoded in a Belief network framework. But before describing the
encoding, we briefly need to know in a bit more detail what we are encoding, namely
the domain of international economics.
natural resources.
In an open economy, trade plays an important role in contributing to the overall
GDP. An open economy can be achieved through the process of internal liberalization.
Internal liberalization is the removal of any tariffs, import controls, or other trade restrictions.
Open economies are also commonly referred to as Trade economies for the reason cited
above. Open economies may result from external demand driving up internal production
or may be a result of increased supply.
The terms closed and open economies describe a state of affairs. Various development
strategies or policies may influence these states. In general, economic policies may
be inward oriented or outward oriented (Bradford 1991). Inward oriented policies favor the
domestic market whereas outward oriented policies favor trade and exports.
The extreme version of inward orientation postulates delinking or isolationism
leading to a completely closed economy. The extreme version of outward orientation results
in subsidizing export industries leading to an export push. In an export push economy, the
state may interfere to set domestic prices and exchange rates to provide more incentives for
exports rather than imports or domestic consumption, in effect subsidizing exports.
Import substitution refers to the situation when a state interferes to discriminate
against imports to favor domestic production. Import substitution policies may be
implemented using multiple exchange rates, tariffs, and other control measures. The bias against
imports may vary from complete protectionism to more selective policies.
One intermediate form of outward orientation is internal liberalization that leads to
an open economy. In an open economy the "incentives to import are as great as the incentives
to export" (Bradford 1991). In such an economy the import and export exchange rates are
likely to be the same.
At any given time, a country would fall into one of the aforementioned clusters.
By adopting a specific policy, a country may transition from one cluster to another.
utterance asserts a policy change (say from a protectionist to a liberal economy), then the
corresponding implications (goals change from restricting trade to free-trade, etc.) should
be available for the next utterance. Furthermore, as we will see in Chapter 9, we may
often have to propagate the influence of specific assertions back as well as forward in time
to construct an overall explanation of the discourse fragment in question. For all these
reasons, we chose to use a temporally extended Belief network (Dean & Wellman 1991)
with one network copy representing the result of processing each input utterance (event
description). Figure 7.4 shows the implemented temporal belief network.
The structure of the target domain for three temporal slices of the Belief network
is shown in the top part of Figure 7.4. We assume that the Belief network nodes are discrete
multi-valued variables. There is one copy of the target domain net for each relevant time
step (each input utterance). The dashed link that connects the t - 1 copy to the t-th
copy of the target domain belief net tends to keep the value of the variable unchanged in
the absence of other intervening factors and is called the persistence link.
Of course, in any real discourse processing, earlier slices would be reclaimed and
only a small set of copies (a window size) would be kept. The actual number would be
determined by working memory capacity limitations. However, our discourse fragments
encompass at most 3 event descriptions (implying at most 5 temporal slices, including one
representing the prior (background) knowledge and one for projecting into the future from
the discourse fragment).
The Bayesian network that encodes the target domain has variables for the different
policies and transitions discussed in the last section. For instance, the policy variable
has the five options discussed earlier (Ec. Policy = (autarky, isolation, import substitution,
liberalization, export push)) and the change variable Do_change has the 6 transitions
discussed earlier. There are also variables that represent the Actor/Agent of the policy
in question (values include US Gov., Indian Gov., Business, Economists, and International
Banks (IMF and WB)), as well as variables that correspond to the outcome (success
or failure) of a given policy, whether there is any difficulty in plan implementation, and
whether the policy leads to free-trade and deregulation or trade restrictions.
In Figure 7.4, the shaded nodes for the target domain belief network at time
T = 0 represent prior values for the relevant economic policy variables. For instance, the
prior expectation is that the actor in question is the US Government (USGov. in Figure 7.4)
[Figure 7.4 node labels, shown for the slices T = 0, T = 1, T = 2: Policy, Do_change, difficulty, Policy Status, Outcome, Goal, Progress, Deg. Complete. Values shown in the figure: Do_change None = .7; difficulty False = .7.]
Figure 7.4: A temporally extended Bayesian network that represents knowledge about
international policy discourse outlined in the last section. As mentioned earlier, there is
one network slice for each input utterance. Since the network has multiple temporal slices,
it is a temporally extended network. The figure shows a network with three temporal slices.
The slice labeled T = 0 is the network prior. Links shown in dotted lines represent
persistence links, i.e. the chance that the variable of interest (say the ACTOR variable) takes
on a specific value (say US Gov) at time t (say t = 1) given that it takes on a specific
value at time t - 1 (t = 0). Nodes in the network represent discrete and multi-valued
variables. For instance, the variable entitled Policy in the figure takes on one of the 4
values [autarky, protectionist, Liberalization, and Export push] described in the text. Links
between variables in a single temporal slice represent the causal dependence of the state of
one variable on another. For instance, the variable Goal is dependent on the type of policy
followed (for example free-trade and deregulation for liberalization policy).
and that no change in policy is asserted (Do_change = None) in Figure 7.4. This
may be overridden by an input f-struct asserting "Actor = Indian Government", in which
case the Actor node is clamped to the value IG.
The persistence links encode the conditional probability of a variable's value at
time t, given its value at t - 1. One interesting point is that the aspect graph transitions
are implicitly coded in the target domain as the conditional probability of being in a specific
control status at time t given the control status at time t - 1. For instance, let us look
at one variable, namely the probability that the status of a schema is ongoing at time t,
given the status at time t - 1.
All other values, such as P(ongoing(t) | done(t - 1)), are negligibly small (≈ 0).
These values are default values and are often overridden by specific assertions, as we will
soon see in detail with the case of "stumbling". Also, the inferential machinery described
in the last chapter can go backward and forward in time to answer questions such as "If the
negotiations were ongoing at utterance time t, what is the probability they were ongoing
at time t - 1, or that they are done at utterance time t + 1?"
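The forward and backward questions quoted above can be illustrated with a single status variable and one persistence link. The three status values, the transition table, and the prior below are hypothetical numbers chosen for the sketch (the only constraint taken from the text is that a transition from done back to ongoing has probability close to 0).

```python
STATES = ["ready", "ongoing", "done"]

# Hypothetical persistence link: PERSIST[prev][cur] = P(status(t) = cur | status(t-1) = prev).
PERSIST = {
    "ready":   {"ready": 0.6, "ongoing": 0.4, "done": 0.0},
    "ongoing": {"ready": 0.0, "ongoing": 0.7, "done": 0.3},
    "done":    {"ready": 0.0, "ongoing": 0.0, "done": 1.0},
}

# Hypothetical belief about the status at time t - 1 before any evidence at time t.
PRIOR = {"ready": 0.5, "ongoing": 0.4, "done": 0.1}


def forward(dist):
    """Predict the status distribution one slice ahead."""
    return {cur: sum(dist[prev] * PERSIST[prev][cur] for prev in STATES) for cur in STATES}


def backward(prior, observed_now):
    """Posterior over the status at t - 1, given the status clamped at t (Bayes' rule)."""
    joint = {prev: prior[prev] * PERSIST[prev][observed_now] for prev in STATES}
    total = sum(joint.values())
    return {prev: p / total for prev, p in joint.items()}


# "If the negotiations were ongoing at time t ..."
was_ongoing = backward(PRIOR, "ongoing")["ongoing"]                          # ... at t - 1?
will_be_done = forward({"ready": 0.0, "ongoing": 1.0, "done": 0.0})["done"]  # ... at t + 1?
```

In the full network the same two computations are carried out by the belief update algorithms over all variables and slices at once; the sketch isolates one temporal link only.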
Figure 7.5 shows the impact of asserting evidence on the Belief network. The first
thing to notice is that evidence can be asserted by clamping any node (shown in black) at any
of the time steps. This will be crucial in Chapter 8 since a single embodied lexical item (say
stumble) is able to assert that the previous state corresponded to an ongoing walk, while
the current state is an interrupted walk. As we will see in Chapter 8, this gets projected
onto two slices of the target network: an assertion of an ongoing status at the
previous copy, and an assertion of an interrupted status at the current copy. Figure 7.5
shows this situation.
In order to completely specify the network shown in Figure 7.4, we need to specify
[Figure 7.5 node labels, shown for the slices T = 0, T = 1, T = 2: Policy, do_change, difficulty, Policy Status, Outcome, Goal, Progress, Deg. Complete.]
Figure 7.5: Asserting evidence on the target domain network and subsequent updates. The first
point to note is that a single network intervention (from linguistic input in our case) can set
evidence (shown as dark nodes in the network) in more than one time slice. In the example
shown, evidence has been set for the presence of a difficulty at time T = 1, the Actor has
been instantiated to IG (Indian Government), and the Policy Status is set as ongoing.
the prior probabilities at time 0 for root nodes in the Bayesian network, and the various
conditional probabilities of both the persistence and inter-state correlation links. This
is shown in Chapter 11, Section 11.3. Once we specify the various network parameters,
the BELIEF UPDATE and MPE algorithms can compute the impact of any new
intervention in the network, such as new evidence, on all variables in the current
state as well as in previous and future states. The use of this method to retrieve explanations
for narratives is discussed in the next chapter.
Chapter 8
A Computational Model of
Metaphoric Reasoning
The central hypothesis pursued here is that the meaning of motion and manipulation
words is grounded in fine-grained, dynamic representations of the action described.
Systematic metaphors project features of these representations onto abstract domains such
as economics, enabling linguistic devices to use embodied causal terms to describe features
of abstract actions and processes. In this chapter and in Chapter 9, we attempt to provide
evidence for the following proposition.
Proposition 2. The role of Embodied Metaphors
Embodied metaphors project features of spatial motion and manipulation onto abstract plans
and processes. This allows event descriptions to exploit the highly compiled and real-time
aspects of x-schema representations to express complex, uncertain, and evaluative
knowledge about abstract domains such as international economic policies.
In our theory, embodied metaphors such as the Event Structure Metaphor (Lakoff
1994) use the dense and familiar causal structure of spatial motions to permit reflex (fast,
unconscious) inferences. In Chapter 6, we saw how our x-schema based representation was
able to encode theories of embodied and highly familiar domains in a manner that retains
the dynamic and highly responsive real-time nature of x-schemas and their ability to model
reflex inferences. In this Chapter and in Chapter 9 we will detail how projection of x-schema
based inferences can add valuable information about abstract plans and events in real-time.
A second type of map projects events, actions, and processes from embodied to
abstract domains. In keeping with our representation, we will call such maps Schema maps
or SMAPS. An important function of SMAP projection is to invariantly map the
aspect of the embodied domain event onto the target domain. This is done by mapping
the controller status of the motion event to create evidence for the status variable
of the target domain network at the appropriate time slice. A common example of an
SMAP is the well known event structure (Lakoff 1994) mapping Motion => Action. This
map projects self-propelled spatial motion onto the domain of plans as action/plan/policy
implementation. Crucially, the aspectual phase of motion (ready, ongoing, done, suspend,
etc.) is mapped as the status of the abstract action described. Recall from Chapter 7 how
the use of a temporally extended network allowed us to store aspectual state information
about the abstract event through the status variable, and the possible transitions of the
controller as conditional probabilities of the temporal links (links between time slices)
connecting the status variable. SMAPS use this feature to project events and their
aspect from familiar to abstract domains.
As we described in Chapter 3, a central component of x-schemas is that they
are parameterized routines with run-time parameter and dynamic variable binding. We
found specific x-schema parameters to be routinely projected to abstract domains. In our
model, these projections are accomplished by x-schema parameter maps (PMAPS). Common
examples of such projections include the maps Rate of motion => Rate of progress,
Distance traveled => Degree of Completion, and Size of step => Amount of Progress.
The first PMAP above projects rates of motion onto the abstract domain as the rate of
progress made, the second one projects distance traveled onto the abstract domain as degree
of completion of a plan, and the third projects size of a step to be the amount of progress
made. All these maps are frequently used in discourse about economics (for example, crawling is
projected as a low rate of progress, giant steps is projected as a good amount of progress
in plan implementation, setting out refers to the start of a plan, etc.).
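The three parameter maps named above can be pictured as a simple lookup from source-domain parameters to target-domain variables. The sketch below is only an illustration: the identifier spellings, the qualitative values ("low", "large"), and the assumption that values are copied unchanged across domains are all hypothetical.

```python
# Hypothetical encoding of the three PMAPS named in the text.
PMAPS = {
    "rate_of_motion": "rate_of_progress",
    "distance_traveled": "degree_of_completion",
    "size_of_step": "amount_of_progress",
}


def project_parameters(source_bindings):
    """Map x-schema parameter bindings onto target-domain evidence.

    Parameters without a PMAP are simply not projected.
    """
    return {
        PMAPS[name]: value
        for name, value in source_bindings.items()
        if name in PMAPS
    }


# "crawling": a walk x-schema executing with a low rate (assumed binding).
crawling = project_parameters({"rate_of_motion": "low", "gait": "crawl"})
# "giant steps": a step x-schema with a large step size (assumed binding).
giant_steps = project_parameters({"size_of_step": "large"})
```

Here `crawling` yields evidence for a low rate of progress, while the unmapped `gait` parameter stays in the source domain.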
[Figure 8.1 labels. Metaphor maps (OMAP, SMAP): MOVE IS ACT, FALL => FAIL, ACTORS ARE MOVERS, OBSTACLES ARE DIFFICULTIES, DOCTOR IS POLICY MAKER. Discourse binding Belief net: DOMAIN, UT-TYPE, SP-EVAL(IMP), SP-EVAL(POL.MKR). Embodied domain f-struct: PAIN, DISEASE, HEALTHY, INJURED, WALK, BUMP, CONTROL, MOVER, ENERGY, FALL, DIST., BARRIER, REM OBS, WALL, ROADBLOCK. X-schemas: CURE(DISEASE), STEP(RATE), FALL, REMOVE(obstacle?), QUACK, STABILIZE, MOVE(Obstacle?), PRESCRIBE(REMEDY), TURN(DIR). Link labels: Feature-value instantiation, Context, Speaker-intent, Context-sensitive Activation, Triggering, Inference.]
Figure 8.1: Metaphors capture systematic correlations between features of different
domains. Shown in the figure is a small fragment of the x-schema-based source domains, a
fragment of the Bayesian network modeling the target domain, and different maps projecting
from source features to target features by setting evidence for specific nodes. Note the
special variables for the discourse binding. In the figure, the bottom half corresponds to
the x-schema inference model described in Chapter 6. The top part of the figure refers
to the temporally extended network representing the system's knowledge of the domain of
international economic policies (ref. Chapter 7). The middle part of the figure is new
(shown in boldface) and contains the various types of metaphoric maps that project familiar
source domain events and actions onto abstract events and plans. To the right of the figure
is another new temporally extended belief net entitled Discourse Binding Belief Net.
Nodes in this net code for various communicative intent and evaluatory information often
derivable from the choice of embodied terms.
Figure 8.1 shows the basic computational architecture of the implemented system.
As shown in the figure, the system has four main components.
1. The top part of Figure 8.1 shows the temporally extended Belief network representation
of the reader's knowledge of international economic policies. The different
temporal slices have default persistence links between them which can be overridden
by new evidence. The Belief network at time t0 corresponds to the prior or
background knowledge of the reader.
2. The bottom part of Figure 8.1 shows a set of x-schemas and the corresponding embodied
domain f-struct, whose feature values both trigger x-schema execution and
are set or modified as a result of inference from x-schema execution.
3. The middle part of Figure 8.1 (shown in bold) shows the different kinds of hybrid
metaphor maps which, when activated based on the discourse context, map embodied
feature-values as evidence for specific feature-values in the economic policy Belief
net. Such static maps may map objects and roles (called OMAPS in Figure 8.1),
activation and execution status of x-schemas (labeled SMAPS), or parameter-values
(labeled PMAPS) from one domain onto another. The maps are context-sensitive
in that they only project values if the domain of discourse is known to be about the
target (economic policies in this case).1
4. On the right of Figure 8.1 is shown a set of multi-valued features labeled Discourse
bindings. These nodes are also modeled as probabilistic nodes in a temporally extended
Belief net. One variable pertains to the domain of discourse, while the others
exploit an interesting feature of embodied terms: the felicity with which they encode
speaker evaluation and intent, which are integral to discourse comprehension and
absent from any previous model of metaphor that we are aware of.
We have seen in detail how the embodied domain inferences work through x-schema
execution and simulation. We have also seen how target domain knowledge is represented as
a Belief net and the various algorithms for belief update and revision that we use in the work
described here. In this Chapter, we describe how the various pieces fit together through
1 In the discussion to follow we will use the convention of depicting OMAPS (corresponding to
role-filler bindings) as rounded rectangles, PMAPS (mapping x-schema parameter values) as rectangles, and
SMAPS (mapping the aspectual component of events across domains) as hexagons.
projection structures (metaphor maps) that are able to mediate between the abstract and
embodied domains. We will then be in a position to describe in detail how the overall
model is used for on-line interpretation.
The target domain representation models the non-economist reader's knowledge
of the domain of Economic Policies. The structure of the target domain for two temporal
slices of the Belief network is shown in the top part of Figure 8.1. There is one copy of the
target domain net for each input utterance. We will assume that the narrative has the same
temporal order as the event occurrence order. The representation of tense and narrative
time is not addressed in this thesis. The link entitled persistence that connects the t - 1
copy to the t-th copy of the target domain Belief net by default leaves the values of the
policy variables unchanged unless a) new evidence is set to change a value as a result of
metaphoric inference at the current time step, or b) a policy change was asserted directly
or through metaphoric inference at the last time step.
In Figure 8.1, the shaded nodes for the target domain belief network at time t - 1
represent prior values for the relevant economic policy variables. For instance, the prior
expectation (before any linguistic input) that the actor in question is the US Government
(USG in the figure) and that the economy is experiencing moderate growth (EC.STATE =
mod.growth) is encoded by setting these nodes to these value distributions in the slice
corresponding to t = 0. As we will see, once linguistic input comes in, some or all of
these values may be modified, resulting in a different posterior distribution for each of the
relevant variables.
The persistence links encode the conditional probability of a variable's value at
time t, given its value at t - 1. These values are default values and are often overridden
by specific assertions, as we will soon see in detail with the case of "stumbling". Also, the
inferential machinery described in the last chapter can go backward and forward in time to
answer questions such as "If the negotiations were ongoing at utterance time t, what is
the probability they were ongoing at time t - 1, or that they are done at utterance time
t + 1?"
As an example of linguistic input overriding the prior value of a variable, consider
the case where an input asserts "Actor = Indian Government" at time t = 0. As a result of
processing the input, the model is able to clamp the Actor variable to the value IG for the
slice of the network t = 1. In the belief update phase (using the algorithms described in
Chapter 7), the influence of this evidence at time t = 1 propagates backward to the slice
at time t = 0, and results in setting the Actor variable to the value IG, overriding the
network prior value of USG explained above. While linguistic input often results in changes
to priors (after all, that is what new information routinely does), it is important to note that
our system is able to model cases where prior information can modify the interpretation of
a specific utterance. Chapter 9 has interesting examples of this phenomenon, where prior
information plays a role in which x-schema runs, and thus which inferences are generated
and projected as updates on the target domain network.
Comprehending a story corresponds to finding the set of trajectories (high probability
feature-values for missing features implied by the input) that satisfy the constraints
of the input and are consistent with the domain knowledge (encoded both as inter-variable
links and temporal links in the target net). This may involve filling in missing values for
the target domain features, as well as inferring values for unmentioned target features
implied by the story. The most probable trajectory can then be retrieved as the most likely
explanation of the story. Features with highly selective posterior distributions are likely to
be present in the recall of the story. A complete trajectory infers values for features
comprising the reader's state, taking as evidence nodes at specific times that are clamped due to
x-schema inference or linguistic input (or both), and re-estimating values for these features
at other times. We use the belief update and revision algorithms described in Chapter 7 for
this purpose. In Figure 8.1, we see that the two domains of relevance, namely the target
domain of Economic Policies and the embodied domains of health and spatial motions, have
their own domain theories. For instance, in the domain of spatial motion, we may expect
that the presence of a specific type of obstacle while executing a step may cause the mover
to stumble. Thus the concept of stumbling is encoded by activating the x-schema for
walk with status = ongoing at time t - 1, and asserting a bump at time t.
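To make the notion of a most probable trajectory concrete, the following sketch runs a Viterbi-style search over a single status variable across four time slices, with evidence clamped at two of them as in the stumbling example (ongoing, then suspended). It is a simplified stand-in for the MPE computation on the full network; the status values, prior, and transition probabilities are hypothetical.

```python
def most_probable_trajectory(states, prior, trans, evidence, n_slices):
    """Return (probability, path) of the best status sequence consistent with evidence.

    evidence maps a slice index to the status value clamped at that slice.
    """
    def allowed(t, s):
        return t not in evidence or evidence[t] == s

    # best[s] = (probability, path) of the best partial trajectory ending in s.
    best = {s: (prior[s] if allowed(0, s) else 0.0, [s]) for s in states}
    for t in range(1, n_slices):
        nxt = {}
        for s in states:
            candidates = [(p * trans[r][s], path + [s]) for r, (p, path) in best.items()]
            p, path = max(candidates, key=lambda c: c[0])
            nxt[s] = (p if allowed(t, s) else 0.0, path)
        best = nxt
    return max(best.values(), key=lambda c: c[0])


STATES = ["ready", "ongoing", "suspend", "done"]
PRIOR = {"ready": 0.7, "ongoing": 0.2, "suspend": 0.05, "done": 0.05}   # made-up prior slice
TRANS = {                                                                # made-up persistence link
    "ready":   {"ready": 0.5, "ongoing": 0.5, "suspend": 0.0, "done": 0.0},
    "ongoing": {"ready": 0.0, "ongoing": 0.6, "suspend": 0.1, "done": 0.3},
    "suspend": {"ready": 0.0, "ongoing": 0.5, "suspend": 0.4, "done": 0.1},
    "done":    {"ready": 0.0, "ongoing": 0.0, "suspend": 0.0, "done": 1.0},
}

# "Stumbling": status clamped to ongoing at slice 1 and to suspend at slice 2.
prob, path = most_probable_trajectory(
    STATES, PRIOR, TRANS, {1: "ongoing", 2: "suspend"}, n_slices=4
)
```

With these numbers the search fills in the unclamped slices as `ready` before the evidence and `ongoing` after it, i.e. the policy is expected to resume after the interruption.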
In all further discussion, we assume the presence of a systematic correspondence
across conceptual domains that is characterized by a fixed system of conceptual metaphors
(Lakoff & Johnson 1980; Lakoff 1994) which govern how abstract concepts and domains
are understood in terms of the more concrete and "experiential" domains. In our model,
such metaphors are first class objects that are nodes capturing inter-domain constraints
and correlations. As the figure shows, the different types of embodied metaphor maps are
hybrid structures linking concrete domain features and values and target domain features
and values. On one side of the map lie the x-schema based perceptuo-motor representations,
which interact with the embodied f-structure, performing rapid, reflex reasoning
by getting, setting and modifying the embodied f-struct values while executing a mental
simulation of the event in question. On the other side is the Target Domain Belief Net,
which captures the naive reader's knowledge of the domain of international economic policies.
The various maps interact with this structure, setting evidence based on inferential
products from x-schemas, as well as direct input from the parser.
Recall that our model allows three different types of maps. An example of a
schema map (or SMAP) is the Event Structure projection Moving => Acting (shown as
the hexagonal node labeled Moving IS Acting in Figure 8.1). A novel feature of SMAPS
is that they are able to naturally map aspectual inferences from embodied to abstract
domains. Other types of maps modeled include ontological maps or OMAPS (object-to-object
or role-to-role maps) such as the map Actors => Movers, as well as a new type of
map that projects specific x-schema parameters between the concrete and abstract domains
(such as distances traveled in the spatial motion domain onto degree of completion of an
economic policy). Such maps are called parameter maps or PMAPS in our model. We note
that earlier models of metaphor have either focused on event-to-event mappings (including
sub-categorization frames) (Martin 1990; Carbonell 1982) or on object-to-object maps (Sun
1994). Our model allows both types of maps to be modeled. In addition, the capabilities of
projecting aspectual inferences through SMAPS and projecting sensory-motor parameters
to abstract events through PMAPS are unique to our model. Each of the maps described
above has its own execution logic, which is detailed in Section 8.2.
The box on the right in Figure 8.1 labeled discourse binding belief net contains
the variable that controls projections from concrete to abstract domains based on the
domain of discourse. This is represented by the Belief net variable labeled domain. In general,
domains are arranged in a type hierarchy, and identification of the domain is part of the
parsing process. We will mostly assume prior knowledge that the domain of discourse is one
of four possible values (namely international trade policy, economic state description, or the
concrete domains of health and spatial motion). While we allow the parser to be ambiguous
about which value is actually instantiated, we have not implemented any type inference over
domain hierarchies, and for our implementation we assume a single multi-valued variable
domain to represent the result of the domain identification process.
Interestingly, as the program was being developed and trained on the database
of stories, it became quite apparent that one central reason to use embodied terms is to
compactly code communicative intent and evaluatory information (opinions, propaganda,
judgments). The two variables entitled Ut-Type (with four possible values: description,
opinion, propaganda, and assertion) and Speaker Evaluation of the specific policy, or
actors in a specified event description (with values neg++, neg, neutral, pos, pos++),
are meant to capture some of this aspect of embodied term usage. There are links labeled
Speaker intent which directly set evidence for the Ut-Type and Speaker-Eval variables
based on specific values of the embodied domain f-struct. Thus x-schema executions are
able to directly influence and set evidence for various values of these variables. Also, there
is a Belief net link from Speaker-Eval to Ut-Type that captures the dependency of the
utterance type on the extent to which the speaker's evaluation of the situation is present
in the utterance. For instance, the fact that the speaker evaluates a situation as highly
negative (positive) influences the utterance type variable in that the utterance is more
likely to be an opinion or propaganda than just a description. Obviously, the coding of
intent in metaphoric usage is a much deeper issue, and results both from the training set
and the test set indicate that further investigations in this area are likely to be highly
productive. Chapter 9 outlines some interesting results that can be obtained even from
this limited implementation. Chapter 10 outlines possible applications of our approach for
speech act recognition.
Figure 8.2 shows the portion of the implemented network in action while interpreting
the input utterance Indian Government stumbling in implementing Liberalization
Policy. The bold entries correspond to direct intervention from the input as well as from
x-schema inferences. Note that the OMAP Actors ARE Movers is active with value IG,
resulting in a token of color <IG, t> in the mover place. Note also that we assume that
the domain variable has been instantiated to the value Economic Policy, thus setting the
context for activating various maps.
We saw in detail in Chapter 6 how the lexical item stumble codes for the presence
of a bump that interrupts an ongoing walk x-schema. We saw how the x-schema based
simulative inferences then result in bindings that specify an ongoing step being interrupted,
leading to the agent tripping over the bump. Subsequently, the agent becomes unstable,
and may be able to stabilize, depending on the available energy and other resources, or
[Figure 8.2 labels. Target Belief net nodes (two slices): INFLATION, EC. STATE, POLICY, STATUS, ACTION, ACTOR, OUTCOME, DEG. COMPLETE, DIFFICULTY. Instantiated values: <ONGOING>, <SUSPEND>, Lib., <IG>. Discourse binding nodes: DOMAIN, UT-TYPE (Desc), SP-EVAL(POL), SP-EVAL(IMPLEMENTER), SP-EVAL(POL.MKR). Metaphor maps: MOVE IS ACT, FALL => FAIL, ACTORS ARE MOVERS, OBSTACLES ARE DIFFICULTIES, DOCTOR IS POLICY MAKER. X-schema: WALK(RATE, DEST). Link labels: Triggering, Inference.]
Figure 8.2: The system is shown interpreting the sentence Indian Government stumbling in
implementing Liberalization Policy. The entries in bold on the Belief net are evidence that
has been directly asserted as part of the input (for instance, the domain node is instantiated
to the value "Economic Policy") as a result of the parsed input. Active metaphors and inferential
projection result in some other nodes being instantiated. For example, the suspension
of walking due to the presence of a bump sets evidence for the status of the policy to be one of
suspension, and the presence of a bump activates the PMAP Obstacles => Difficulty,
thereby asserting the presence of a difficulty in the planning process.
[Figure 8.3 labels: Metaphor node with an AND of the source concept Csi and the domain input, a disabling signal, and the target concept Ctj.]
Figure 8.3: Representing a metaphor. When active, the metaphor can potentially project
the value of a source concept to the corresponding value of the target domain concept.
Transfer can be blocked if the target domain knowledge is incompatible with the projected
value.
Figure 8.3 shows the general structure of a metaphor node. It constrains the projectability
of a source domain concept Csi onto a target domain concept Ctj. In the case
shown in Figure 8.3, both source and target domain concepts are binary, but they can be
multi-valued. Multiple values are represented as colors on tokens on the source side and as
multi-valued random variables in the target Belief net. Thus, in the absence of specific
disabling signals, conventional metaphors project feature-values from source to target concepts
by setting the relevant nodes in the target domain Belief net to specific values. Projection
does not change a value that is already instantiated in the target domain. As was noted in the
example shown in Figure 8.2, the maps are able to assert evidence to different temporal copies of
Function                                   Description
CHECK_CONTEXT(t)                           Return the specific instantiated value (if any) of the
                                           target node target at time t, or nil (if none).
GET_EVIDENCE(target, t)                    Return TARGET if the domain node is instantiated to
                                           Economics or if there is an active connected map.
SET_EVIDENCE(target, x, t)                 Set the node target (time-slice t) to value x.
MARK_SOURCE(<x, t>)                        Mark the source input place with a token of type <x, t>.
CREATE_TOKEN(<bindname, timestamp>)        Create a token of the type specified by the tuple
                                           <bindname, timestamp>.
GET_TOKEN(<t>)                             Return the token value from the source node of the map
                                           at time t.
GET_STATUS_TOKEN(source-schema, t)         Get the source x-schema's controller state.
GET_STATUS(map)                            Get the specific map's (any kind) status. Returns
                                           active or nil.
SET_STATUS(map)                            Set the status of the map to active or nil.
EXECUTE_OMAP_FUNCTIONS(omap, t)            Bind Target Roles/Objects to Source Roles/Objects.
EXECUTE_SMAP_FUNCTIONS(smap, t)            Project schema names and aspect from source to target.
EXECUTE_PMAP_FUNCTIONS(pmap, t)            Project parameter-values from source to target.
[Figure diagram (OMAP) not recoverable from the extracted text. Legible labels: OMAP
ACTOR(x) = MOVER(x); ACTOR = IG; DOMAIN = Ec. Policy; get_evidence(target, t);
get_evidence(context, t); create_token(<IG, t>); mark_source(<IG, t>); token <IG, t>; OR;
OTHER MAPS.]
All maps become active only in the context where the domain of discourse is
Economics (note that OMAPS have an input from the model variable (Belief Net node)
domain, which is checked for the instantiated value). If the target node is instantiated
(from the input utterance), the binding is propagated to the source node by generating a
unique token that corresponds to the specific binding in question.
have residual activity for the next time step. This allows for projection even in cases where the parser has
been unable to establish the target context.
Metaphor produces sets evidence to the Actor node, or consumes and produces a token
of color x at every step of the simulation.
Of course, to avoid generating redundant tokens every time this function is called,
the source node (f-struct place) is checked for the presence of a token of the specified color.
If such a token exists, no new one is generated. The color check is required since the
system allows more than one type of agent to be simultaneously referred to in the domain
of discourse; thus the color carries variable binding information in the x-schema.
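The redundancy check described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the thesis implementation: the names Token, Place, and execute_omap are hypothetical, and the Belief net is reduced to a plain dictionary of instantiated values.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Token:
    color: str  # binding carried by the token, e.g. "IG"
    time: int   # time slice at which the binding was asserted


@dataclass
class Place:
    name: str
    tokens: set = field(default_factory=set)

    def has_color(self, color: str) -> bool:
        """True if a token with this binding is already present."""
        return any(tok.color == color for tok in self.tokens)


def execute_omap(evidence: dict, target_node: str, source_place: Place, t: int) -> bool:
    """Propagate the binding of target_node to source_place.

    Returns True only if a new token was generated.
    """
    # Assumption: the map is active only when the domain node is
    # instantiated to "Economic Policy".
    if evidence.get("Domain") != "Economic Policy":
        return False
    binding = evidence.get(target_node)
    if binding is None:
        return False  # target node not instantiated, nothing to propagate
    if source_place.has_color(binding):
        return False  # a token of this color exists, so no new one is made
    source_place.tokens.add(Token(binding, t))
    return True


evidence = {"Domain": "Economic Policy", "Actor": "IG"}
mover = Place("Mover")
first_call = execute_omap(evidence, "Actor", mover, 1)
second_call = execute_omap(evidence, "Actor", mover, 2)
```

Calling the function twice with the same binding leaves a single token in the Mover place, while a different binding (a second agent) would add a token of a new color.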
We note that an interesting reason for such hybrid parameter maps is that they are
able to change the granularity of distinctions, so an integer number of tokens on the source
side may be mapped to a much lower resolution target value. Typical examples are cases
where numerical measures such as rates, distances, or energy levels are mapped onto
the Economic Policy domain as different probabilities of low, med, and high (or a 5-point
scale with low- and high+). For instance, the PMAP mapping distance from source
(ref. Figure 8.1) to degree of completion maps distances (values from 1 to 7, modeled as
number of tokens) to degree of completion (a three-valued Belief net node); this was
sufficient to make the appropriate distinctions for the examples we looked at.
[Figure residue: a PMAP node, gated by get_evidence(context, t), with the following table of
target probabilities.]
low   .7  .6  .4  .2  .1  .1  .1
med   .2  .3  .5  .7  .7  .5  .1
high  .1  .1  .1  .1  .2  .4  .8
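The granularity change performed by such a PMAP can be illustrated with a small lookup. The sketch below is a hypothetical Python rendering, not the thesis code. It assumes that the seven columns of the table above correspond to source distances of 1 to 7 tokens; the names PMAP_TABLE, project_distance, and most_likely are introduced only for illustration.

```python
# Rows of the table above. Assumption of this sketch: column k holds the
# distribution for a source distance of k + 1 tokens.
PMAP_TABLE = {
    "low":  [0.7, 0.6, 0.4, 0.2, 0.1, 0.1, 0.1],
    "med":  [0.2, 0.3, 0.5, 0.7, 0.7, 0.5, 0.1],
    "high": [0.1, 0.1, 0.1, 0.1, 0.2, 0.4, 0.8],
}


def project_distance(num_tokens: int) -> dict:
    """Map a token count (1 to 7) to a distribution over degree of completion."""
    if not 1 <= num_tokens <= 7:
        raise ValueError("distance must be between 1 and 7 tokens")
    column = num_tokens - 1
    return {level: probs[column] for level, probs in PMAP_TABLE.items()}


def most_likely(distribution: dict) -> str:
    """Return the level with the highest probability."""
    return max(distribution, key=distribution.get)


near_source = project_distance(1)
far_from_source = project_distance(7)
```

Under the column assumption, a short distance from the source yields mostly low completion and a long distance yields mostly high completion, which is the coarse three-valued distinction the text describes.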
interesting thing to note is that this is a projective mapping where the control status of the
source x-schema is invariantly mapped to the status value for the action. The rich inferences
that result from changes in agent state following control transitions in perceptual and
motor actions are reflected in the activation of many metaphor maps, and thereby allow new
evidence and bindings to be entered about the target domain regarding goals, outcomes,
resources, and difficulties in plan implementation.
The example in Figure 8.7 allows an interpreter to infer that moving forward in the
context of an economic policy corresponds to making progress in the policy implementation.
Of course, the aspect of the underlying schema is projected as well. Thus, in the example
above, the difference between is dealing with the obstacle and has dealt with the obstacle is
mapped invariantly onto the target domain: the former is mapped as monitoring some
difficulty and taking corrective action, whereas the latter is mapped as setting evidence of
difficulty = false.
The combination of SMAPS and PMAPS is able to map parameterized
events. For instance, the combination of the PMAP forward IS progress and the SMAP
above yields move(forward) => make(progress), move(back) => regress, etc. In combination
with OMAPS, SMAPS are able to map entire verb sub-categorization frames and relational
structures from one domain to another. For example, in the case shown in Figure 8.1 we
are able to map energetic movers (the agents of moving) to wilful actors (the agents of
acting). Similarly, in the example described in the last section, by combining the SMAP
fall => fail with the OMAPS Actor => Agent, States => Locations, and Recessions =>
Holes, the system is able to map the relational structure fall-into(Brazil, Recession)
to fall-into(Br:Agent, recession:hole).
[Figure 8.7 diagram not recoverable from the extracted text. Legible labels: X-SCHEMA
SIMULATION of WALK with controller tokens <ONGOING, t - 1> and <SUSPEND, t>; SMAP MOVING IS
ACTING with inputs get_evidence(context, t - 1) and get_evidence(context, t); outputs
set_evidence(implement, t - 1) and set_evidence(status = suspend, t); Belief net nodes
Status(t - 1), Status(t), and Implement(t); Persistence link.]
Figure 8.7: Mapping events across domains over multiple time steps. The figure shows
the EXECUTE_SMAP_FUNCTIONS function being invoked twice for the SMAP Moving =>
Acting: once with the time argument at the previous time step t - 1, and once at
the current time step t. The ongoing aspect of motion at t - 1 and the
suspend aspect at time t are projected as a policy implementation that is ongoing at t - 1
and suspended at time t, respectively. The broken lines between the status and policy nodes
(to the left of the figure) represent the temporal persistence links between network copies
(their influence is overridden through direct evidence being asserted at both time steps).
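The two invocations described in the caption can be sketched in a few lines. This is a simplified, hypothetical Python rendering (not the thesis code): the controller history, the context check, and the temporally extended Belief net are all reduced to dictionaries keyed by time slice, and the node names Action and Status are chosen for illustration.

```python
def execute_smap_functions(controller_status: dict, context_active: dict,
                           target_net: dict, t: int) -> bool:
    """Project the source controller status at time t onto the target net.

    Returns True if evidence was set for time slice t.
    """
    if not context_active.get(t, False):
        return False  # map is not active in this context
    status = controller_status.get(t)
    if status is None:
        return False  # the x-schema has no controller token at time t
    time_slice = target_net.setdefault(t, {})
    time_slice["Action"] = "implement"  # Moving => Acting
    time_slice["Status"] = status       # aspect is projected unchanged
    return True


# Walk is ongoing at t - 1 = 0 and suspended at t = 1.
walk_controller = {0: "ongoing", 1: "suspend"}
context = {0: True, 1: True}
target_net = {}
for time in (0, 1):
    execute_smap_functions(walk_controller, context, target_net, time)
```

Because evidence is asserted directly at both time slices, a persistence link between the two copies of the status node would have no visible effect in this toy version, which matches the remark in the caption.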
4
In future I/O displays, we only display prior values if they are relevant to the processing of the input
utterance. Otherwise, the reader is to assume that the economic variables have the same prior values as
shown in Chapter 11, Section 11.3.
Table 8.1: I/O Behavior for the input Indian Gov. Stumbling in implementing Liberalization
plan. The values are shown for the relevant priors, input, output, and for active metaphor
maps.
FEATURE  V(t0)  V(t1)  V(t1+)
PRIOR
Actor = USG  t(.7)
Domain = Ec. Policy  t(.7)
INPUT FEATURES
Event = stumble  t
Domain = Ec. Policy  t
Policy = Liberalization  t
Actor = IG  t
Aspect = Present-Prog  t
Ut-type = Description  t
ACTIVE MAPS
Mo(Mover => Actor)  A  A
Ms(Walk => implement(policy))  A  A
Ms(Walk(ongoing) => Status = ongoing)  A
Ms(Walk(suspend) => Status = suspend)  A  A
Mp(Obstacle => Difficulty)  A  A
TARGET (OUTPUT) FEATURES
Event = stumble  t
Domain = Ec. Policy  t(.8)  t  t(.8)
Ec. Policy = Liberalization  t(.8)  t  t(.8)
Aspect = Present-Prog  t
Actor = IG  t(.8)  t  t(.8)
Status(Pol) = ongoing  t  f  t(.3)
Status(Pol) = suspended  f  t  t(.7)
Difficulty  t(.7)  t  t(.7)
Outcome = succeed  t(.6)  t(.4)  t(.5)
Goal = free-trade ^ deregulation  t  t  t
A note about the display method. In cases where I use a table to display the results
of the program on a specific example, both input and output f-structs are shown in tabular
form. In most tables, I have changed multi-valued variables to the value of the instantiated
(or high probability) state to label the appropriate row of the table, so that cell entries
only convey the probability of that specific state (as true t or false f). The individual
columns of the table represent the time slices. Cells in the tables indicate the value of
the relevant feature at that time slice. When appropriate, the actual degree of belief for
the specific value at the specific time slice is shown in brackets as well. Unless otherwise
specified, target values whose posterior probability falls below the threshold value are not
shown (the threshold was .6 unless otherwise specified). Cells that represent the cases where
metaphors are active are labeled with A. Cells with no entries represent inestimable or
irrelevant values. The specific target domain feature-values that became active as a result of
metaphoric inference and associated propagation in the temporal Belief net are shown in
bold. In cases where the input contained the value of a feature that had to be re-estimated
for other times, only the re-estimated values are shown in boldface. An example of such
a feature is Domain = Ec. Policy, where the input f-struct supplied this information at
time t = 1, and it had to be re-estimated at time t = 0, etc. The conventions above are
followed in Chapter 9 as well.
The pre-parsed input is shown as setting the values of relevant features at time
t = 1. These values appear in Table 8.1 under the heading INPUT FEATURES. Since
the input is available at time t = 1, there are entries only for that time slot. Notice
that the input essentially instantiates specific target features (such as Actor = IG) and
also contains aspectual and source event information. This information is used to set the
controller status and the relevant x-schema parameters, possibly triggering execution.
The result of processing the input in Table 8.1 is a set of new bindings asserted
in the target domain, resulting in an updated posterior for other variables. As part of
the processing of the input, various x-schemas may execute and various metaphor maps
may become active, transferring bindings and projecting x-schema inferential products. For
example, Table 8.1 shows the OMAP Mo(Mover => Actor) and the SMAP Ms(Walk =>
implement(policy)), along with two aspectual maps: Ms(Walk(ongoing) => Status = ongoing),
which maps the ongoing status of the walking event at time t = 0 onto the target by setting
evidence for an ongoing policy implementation at time t = 0, and Ms(Walk(suspend) =>
Status = suspend), which projects the suspension of the walk process at time
t = 1 onto the economic domain as a suspended policy implementation. Also active is the
PMAP Mp(Obstacle => Difficulty), which asserts a plan difficulty. All these projections
are available in real-time as x-schema inferences.
Once x-schema inferential products are projected, the BELIEF_UPDATE
algorithm propagates the effect of the new evidence throughout the temporally extended
network. Bold entries under the heading TARGET (OUTPUT) FEATURES in Table 8.1
above correspond to cases where the value of the relevant variable was a result of projecting
x-schema inferences. Of particular interest is the context setting inference which projects
the embodied knowledge that stumble occurs as a result of an obstacle while executing a
step (causing an interruption to forward motion) to the target as plan difficulty (causing a
temporary suspension). Another interesting binding occurs as a result of the source domain
knowledge that stumble may lead to a fall, which is mapped onto the target as an enhanced
likelihood of plan failure. Thus we note that while stumble is not directly mapped in our
system as a meaningful concept in the domain of Economics, through inferential
projection from maps such as Falling MAPS TO Plan Failure and Obstacle MAPS TO Plan
Difficulty the system is able to assert a target context where an ongoing plan is experiencing
difficulty, increasing the chance of failure as the outcome. This kind of real-time defeasible
inferential projection is a novel feature of our model and, we believe, a crucial requirement
for any model of metaphor interpretation.
Of course, many possible x-schema bindings, especially those that don't activate
any conventional metaphor, are invalid and thus have no impact on the agent's epistemic
state (for example, the source inference stumble => losing balance). Thus the inferences
that are actually made are context-sensitive and depend on the target domain and the
associated set of metaphoric maps.
The resultant target network state shown in Table 8.1 is now a prior for processing
the next input at stage t = 2. Background knowledge is encoded as the network state at
t = 0. Potentially, target inferences can go forward and backward in time in the estimation
of the most probable explanation of the input story.
(current evidence is at time t).5 The TOP_LEVEL_CONTROL function is waiting for
one of three kinds of input. One is a request to process a new input f-struct; the other two
inputs pertain to making updates on the target domain networks and returning the highest
posterior trajectory (from algorithms described in Chapter 7).
INTERPRET_INPUT(I)
input: I is the input f-struct
Begin
    INCORPORATE_INPUT(I, t)
        Enter new evidence in target nets and bind to x-schemas
    EXECUTE_BOUND_SCHEMAS(t)
        Apply the Next Step Firing function from Chapter 3
    BELIEF_UPDATE(TARGET_NET, t)
        Update the target network
    BELIEF_UPDATE(DISCOURSE_NET, t)
        Update the discourse network
    Both the above functions are from Chapter 7
End
First, the target domain Belief net TARGET_NET is instantiated, which includes
setting evidence for specific nodes. For example, the input f-struct that results from
a partial parse of the input sentence.
Most lexical entries are fairly straightforward, in that they simply set evidence for
specific nodes in the Belief net. For instance, the input above is parsed as
set_evidence(TARGET_NET, Domain= Ec. Policy, t = 1)
set_evidence(TARGET_NET, Ec. Policy=Liberalization, t = 1)
set_evidence(TARGET_NET, Actor=IG, t = 1)
Events are a bit more complicated since they are more dynamic: they bind to
x-schemas or set the Agent state f-struct, and may set evidence at multiple
time steps.
(defevent (stumble, t)
  (bind_to_schema ('#n(ongoing(walk)), t-1))
  (create_fs_marking ('#n(bump), t)))
(defevent (slide, t)
  (bind_to_schema ('#n(ongoing(walk)), t-1))
  (create_fs_marking ('#n(slippery(gr)), t)))
The function bind_to_schema marks the enable place of the schema in question
(here walk) and sets the controller to a specific state (such as ongoing). The function
create_fs_marking marks specific places of the Agent state f-struct with tokens. The '#n
macro translates names to the corresponding objects. The set_evidence calls above result in
setting the target domain Belief Net variable Domain to the state Economic Policy, the target
domain Belief Net variable Policy to the value Liberalization, and the target domain
Belief Net variable Actor to the value IG.
The event functions defevent(x) (in this case x = stumble) encode the specific
method by which a linguistic item interacts with the x-schemas. For instance, the linguistic
input corresponding to stumble, when asserted at time t, binds to the walk x-schema at
time t - 1, and creates a token corresponding to a bump at time t.
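The behavior of these event entries can be sketched in Python. This is an illustrative analogue under stated assumptions, not a port of the thesis code: SourceState is a hypothetical stand-in for the x-schema controllers and the Agent state f-struct, and defevent is rendered here as a registering decorator.

```python
from collections import defaultdict


class SourceState:
    """Toy stand-in for x-schema controllers and the Agent state f-struct."""

    def __init__(self):
        self.controllers = {}             # (schema, time) -> controller status
        self.markings = defaultdict(set)  # time -> names of marked places

    def bind_to_schema(self, schema: str, status: str, t: int) -> None:
        # Enable the schema and set its controller status at time t.
        self.controllers[(schema, t)] = status

    def create_fs_marking(self, place: str, t: int) -> None:
        # Put a token in the named place of the Agent state at time t.
        self.markings[t].add(place)


EVENTS = {}  # lexical item -> event function


def defevent(name: str):
    """Register an event function under the given lexical item."""
    def register(fn):
        EVENTS[name] = fn
        return fn
    return register


@defevent("stumble")
def stumble(state: SourceState, t: int) -> None:
    state.bind_to_schema("walk", "ongoing", t - 1)  # prior context
    state.create_fs_marking("bump", t)              # current event


@defevent("slide")
def slide(state: SourceState, t: int) -> None:
    state.bind_to_schema("walk", "ongoing", t - 1)
    state.create_fs_marking("slippery(gr)", t)


state = SourceState()
EVENTS["stumble"](state, 1)
```

Asserting stumble at t = 1 thus leaves an ongoing walk at t = 0 and a bump at t = 1, which is the two-time-step effect the text attributes to the lexical entry.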
At this point, UPDATE_DISCOURSE_NET(I, 1) is called, resulting in the
set_evidence(DISCOURSE_NET, Ut-Type=Description, t = 1)
Now all the direct Belief net interventions from the linguistic input are complete,
and UPDATE_MMAPS is called, where the metaphor maps are updated as a result of
the new evidence entered. Here, mostly object maps become active and transfer bindings
from the appropriate target domain node to create a token in the source domain node. The
token is a tuple <bindname, timestamp>, where bindname is the appropriate binding
and timestamp is the input utterance time (0-4). These two steps are part of the
INCORPORATE_INPUT algorithm shown below. A trace of this step of the algorithm
for the input in question follows.
ttyp3 tiramisu:~/thesis/code/interpreter> (load "istruct-stumble")
(Input ((Context EconomicPolicy)(Type Liberalization)(Agent IG)
(Event stumble)(Aspect Progressive)))
*****INTERPRET_INPUT(<#I223>)**
*******INCORPORATE_INPUT(<#I223>)***
*********UPDATE_BN(<#I223>, <TARGET_NET02>, t = 1 )***
***parsing <#I223>***
set_evidence(TARGET_NET, Domain=Economic Policy, t = 1)
******* done ****
set_evidence(TARGET_NET, Policy = Liberalization, t =1)
****done *****
set_evidence(TARGET_NET, Actor = IG, t= 1)
****done *****
*********done UPDATE_BN(<#I223>,<#TARGET_NET22>, t = 1))***
*********UPDATE_BN(<#I223>,
set_evidence(DISCOURSE_NET, Ut-Type=Description, t = 1)
*********done UPDATE_BN(<#I223>, <#TARGET_NET22>, t = 1)***
At this point the metaphor maps are updated. Mostly OMAPS get activated,
since the target domain roles are mapped onto source domain roles. In this case the new
evidence asserted activates the Actors => Movers OMAP.
*********UPDATE_MMAPS(<#I223>, t = 1)***
******Checking enabled Maps********
(M_o3) Active with target node ``Actor= IG''
****** Executing OMAP Maps *****
Marking embodied node P44(P44.Name = Mover),
Token T112 #(<IG>, t1) generated
**No more active maps
*********done UPDATE_MMAPS(<#I223>..)***
The result is a token of type <IG, t1> being deposited in the f-struct place
corresponding to the Mover of the spatial motion. We are now ready to run the UPDATE_SCHEMAS
function.
*********UPDATE_SCHEMAS(<#I223>)***
**** parsing-event(Event stumble) ***
Binding to schema <#S12> (S12.Name = ``Walk'')
Bound E121 ``Enable input place''P44(P44.Name = T112)
Marking place P123(P123.Name =``Bump'')
Token T135 #<nil, t1> generated
***done
****parsing (Aspect Progressive)*****
Attempting to mark <# S12> controller <ongoing>
Mark <# S12>.controller <ongoing>
Token T234 #<S12.A148>, t0> generated
Marking <# S12>.controller <ongoing> with T234
****done
*********done UPDATE_SCHEMAS(<#I223>)***
*******INCORPORATE_INPUT(<#I223>)n***
At this point, we are done incorporating the new example. We have entered new
evidence to the Target Belief networks, and have successfully enabled the walk schema
with an agent that corresponds to the Indian Government (through the Mover => Actor
OMAP). The interesting thing to note is that the token T234 that gets deposited in the ongoing
node of the walk schema is timestamped t0, signifying prior context for the
interpretation. This comes from the lexical entry for the stumble event explained earlier. Thus
the situation at the end of this phase is that the walk schema is active and ongoing and
a bump has been asserted (Figure 6.8).
Now we are ready to execute the bound schema walk in the presence of a bump.
This is exactly the situation shown in Chapter 6 (Figure 6.8).
The EXECUTE_BOUND_SCHEMAS function is shown below.
The key thing to notice here is the binary-valued function SMAP_project?,
which is called after every step of the x-schema simulation. This function returns true if
any new SMAP is activated at time t, in which case the inner loop is exited. Schema executions
for times ti < t are not sufficient to get out of the loop (this is required since some lexical
items set the context as well as specify what the current event is). For example, the lexical
item stumble simultaneously sets the context to be an ongoing walk which encounters
a bump while taking a step. This was outlined in detail in Chapter 6, in Figure 6.8 -
Figure 6.11. Upon exiting the loop, the x-schema state is preserved, but since we have
asserted something new about the target, inference propagation stops till the next input.
Thus the loop is exited as soon as there is some SMAP that can be activated from x-schema
executions. The problem of there not being any SMAP that can be activated was never
encountered, partly because embodied words usually refer directly to x-schemas that have
projections onto the target (such as fall, move, or remove-obstacle). Sometimes, however,
execution of x-schemas may asynchronously trigger long inferential chains leading to many
possible target domain inferences. The function SMAP_project? is meant to constrain
such long chains, since the loop is exited after one SMAP execution. A rationale for this
choice is outlined in Section 8.5.
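The exit condition of this loop can be sketched as follows. The Python below is a hypothetical simplification rather than the thesis routine: each simulation step is assumed to report the SMAPs it activated as (name, time) pairs, the map names and scripted steps are invented for illustration, and max_steps is a safety bound added for this sketch only.

```python
from typing import Callable, List, Tuple

Activation = Tuple[str, int]  # (SMAP name, time slice at which it became active)


def smap_project(new_activations: List[Activation], t: int) -> bool:
    """Stand-in for SMAP_project?: true if some SMAP fired at the current time t."""
    return any(time == t for _, time in new_activations)


def execute_bound_schemas(perform_one_step: Callable[[], List[Activation]],
                          t: int, max_steps: int = 50) -> List[Activation]:
    """Step the simulation until an SMAP is activated at time t."""
    seen: List[Activation] = []
    for _ in range(max_steps):  # bound is an assumption of this sketch
        new = perform_one_step()
        seen.extend(new)
        if smap_project(new, t):
            break  # activations at ti < t never reach this branch
    return seen


# Scripted steps, loosely modeled on the stumble example.
script = iter([
    [("Move=>Implement", 0)],    # context step at ti < t: loop continues
    [],                          # no SMAP fires on this step
    [("Fall=>Fail", 1)],         # an SMAP fires at t = 1: loop exits
    [("Stabilize=>Resume", 1)],  # never reached
])
calls = []


def scripted_step() -> List[Activation]:
    activations = next(script)
    calls.append(activations)
    return activations


seen = execute_bound_schemas(scripted_step, t=1)
```

The fourth scripted step is never executed, which illustrates how a single SMAP activation at the current time curtails further source domain inference.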
Going on with our program trace, we are still in the INTERPRET_INPUT
procedure, and are in the EXECUTE_BOUND_SCHEMAS routine.
**** EXECUTE_BOUND_SCHEMAS***
*******START LOOP
**** PERFORM_ONE_STEP ******
***** Checking Bound Schemas***
(<#S12> (S12.Name = ``Walk''))
Token E121 ``Enable input place''P44
Token T234 #<S12.A148>, t0> at <# S12>.controller <ongoing>
No more bound schemas
**** EXECUTE_SMAP_FUNCTIONS(ti < t) ******
((SMAP MOVE_399 active
(MOVE_399.name = MOVE=>IMPLEMENT POLICY)
(MOVE_399.status = ONGOING)
(TIME = 0))
****Executing SMAPS********
set_evidence (TARGET_NET, IMPLEMENT POLICY, t = 0)
set_evidence (TARGET_NET, POLICY STATUS = ONGOING, t = O)
*** done SMAPS (t =0)
At this point we have found one schema that was active at time t = 0, and hence
satisfies ti < t, and we execute the SMAP, which asserts an ongoing plan at time
t = 0 by setting evidence on the t = 0 copy of TARGET_NET. But we are still inside
the loop, and hence we execute bound schemas to advance the simulation to t = 1, the
current time.
**** EXECUTE_BOUND_SCHEMAS***
*******START LOOP
**** PERFORM_ONE_STEP ******
Executing S12 (S12.Name = ``Walk'')
***** Checking Enabled Transitions**
Token T112 #(<IG>, t1) (Actor)
Token T135 #<nil, t1> at P123(P123.Name =``BUMP'')
Token T234 #<S12.A148>, t0> at <# S12>.controller <ongoing>
(S323 (S323.Name = ``Interrupt''))
*****Executing Next-Step Function *****
Firing SS323(S323.Name = ``Interrupt''))
Token T234 #<S12.A148>, t1> at <# S12>.controller <suspend>
Token T187 #<IG>, t1> generated
Token T187 #<IG>, t1> at P63(P63.name = ``UNSTEADY'')
The above trace corresponds exactly to the step function taking the system from
the state in Figure 6.8 to the state in Figure 6.9. At this point, we are ready to
EXECUTE_PMAP_FUNCTIONS.
While the PMAP projection asserts evidence of a difficulty in the target domain
Belief net at time t = 1, still no SMAP projection has occurred at time t = 1; hence the
SMAP_project? function returns nil, and we repeat the loop.
**** PERFORM_ONE_STEP ******
***** Checking Bound Schemas***
(<#T53> (T52.Name = ``Trip''))
Token T112 #(<IG>, t1) (Agent)
Token T004 #<nil, t1> at P86(P86.name = ``CONTROL(LOC)'')
Token T187 #<<IG, t1> at P63(P63.name = ``UNSTEADY'')
Token T008 #<nil, t1> at P63(P63.name = ``STABLE'')
***** Checking Enabled Transitions**
(<#T53> (T52.Name = ``Trip''))
*****Executing Next-Step Function *****
Consuming Token T004 #<nil, t1> at P86(P86.name = ``CONTROL(LOC)'')
Consuming Token T187 #<<IG, t1> at P63(P63.name = ``UNSTEADY'')
Consuming Token T008 #<nil, t1> at P99(P99.name = ``STABLE'')
Adding Token T187 #<<IG, t1> at P63(P63.name = ``UNSTEADY'')
Adding Token T303 #<IG, t1> at F331.A24(F331.A24.name = ``Fall'
(status = ready))
******** Goal-based Enabling detected ***********
****Instantaneous transition fired******
**** New bound schema
((S17 (S17.Name = ``Stabilize''))
***************
Resource = Energy (# 2)
Inh = stable
******************
>>>>>>>>>SMAP_PROJECT? Returned t
(SMAP FFAIL_121 active)(status = ready)
(SMAP STCO_133 active)(status = enabled)
(FFAIL_121.name = FALL=>OUTCOME=FAIL) Active(status = ready)
****************Exit Loop***************
****** DONE PERFORM_ONE_STEP
Continuing with the x-schema simulations, we are now in the situation depicted
in Figure 6.9 - Figure 6.10. The trip x-schema fires, and the mover has lost control of his
location, and become unstable, which enables the x-schema stabilize; the energy
and resources it requires are insufficient for a transition to the ready stage. At this point
we have two possible x-schemas (fall and stabilize) enabled (fall ready to go, while stabilize
is just enabled), and the one corresponding SMAP enabled, namely Fall => Fail.6
We have now completed the source domain inferences, and have found a
map from falling to failing, or ready-to-fall to ready-to-fail. The next step is
to EXECUTE_SMAP_FUNCTIONS and set new evidence in the target Belief Net.
*******EXECUTE_SMAP_FUNCTIONS
******* Timestamp t=1
*** Checking active SMAPS*********
((SMAP FFAIL_121 active
(FFAIL_121.name = FALL=>OUTCOME=FAIL)
(FFAIL_121.status = READY))
****************************
****Executing SMAPS********
set_evidence (TARGET_NET, OUTCOME = FAIL(0.6), t = 1)
********** done******************
*******DONE EXECUTE_SMAP_FUNCTIONS*********
*******DONE EXECUTE_BOUND_SCHEMAS*******
********BELIEF_UPDATE(TARGET_NET, 1) ****
*******checking t =0 evidence******
*******checking t =1 evidence******
******done BELIEF_UPDATE(TARGET_NET,1) ****
**************BELIEF_UPDATE(DISCOURSE_NET, 1) ****
*******checking t =0 evidence******
*******checking t =1 evidence******
************** done BELIEF_UPDATE(DISCOURSE_NET, 1) ****
ttyp3 tiramisu:~/thesis/code/interpreter>
At this point we are ready to perform belief updates on TARGET_NET and
DISCOURSE_NET using the update algorithms described in the last chapter. The
resultant belief state is thresholded (the current value is .6), and the output nodes and values
above threshold are shown in Table 8.1.
(return-threshold TARGET_NET, 1, .6)
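The thresholding step can be illustrated with a short sketch. The Python function below is a hypothetical analogue of return-threshold, not the thesis code: the posterior is reduced to a dictionary per time slice, and the sample probabilities are copied from the V(t1+) column of Table 8.1 purely as illustrative data.

```python
def return_threshold(posteriors: dict, t: int, threshold: float = 0.6) -> dict:
    """Return the feature-values at time slice t whose posterior probability
    is at or above the threshold."""
    return {
        feature: prob
        for feature, prob in posteriors.get(t, {}).items()
        if prob >= threshold
    }


# Sample data only; the time key 1 is an assumption of this sketch.
posteriors = {
    1: {
        "Domain = Ec. Policy": 0.8,
        "Ec. Policy = Liberalization": 0.8,
        "Actor = IG": 0.8,
        "Status(Pol) = ongoing": 0.3,
        "Status(Pol) = suspended": 0.7,
        "Difficulty": 0.7,
        "Outcome = succeed": 0.5,
    }
}

displayed = return_threshold(posteriors, 1, 0.6)
```

With a threshold of .6, the ongoing status and the success outcome drop out of the result, leaving the suspended status and the asserted difficulty among the retained values.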
6
Note there is another SMAP, Stabilize => Status(Pol) = resume, but to keep the exposition simple
we omit this possibility. Note also that in general many x-schemas in the source domain, such as trip,
will not activate any SMAP and hence will only have an impact on the processing if they lead to an
SMAP activation, as in this case with Fall => Fail.
metaphorical use of words that share a subsumptive relationship in the target (called "core"
relations in (Martin 1990)) results in mapping to words with similar subsumptive relations
in the source domain. In addition, we believe that metaphor mappings are sensitive to
rather subtle semantic distinctions between concepts. Our results (ref. Chapter 9)
suggest the need for detailed semantic representation of source and target domains to properly
investigate the role of metaphoric mappings in interpretation.
In influential early work on metaphor, (Carbonell 1982) looked at how metaphors
are able to invariantly map plans and strategies across domains. The work presented here is
consistent with Carbonell's observations that metaphors transfer planning knowledge across
domains. However, in contrast to Carbonell's work, which was concerned with modeling
long metaphoric inferential chains where entire plans and problem solving strategies are
transferred across domains, the data we looked at suggests fairly short source domain
inferential chains where key dynamic information (such as changes in rates, resources, dynamic
thwarting or enabling of goals) is transferred. Of course, one could potentially create long
source domain inferential chains by continually using a related set of metaphors from a single
source domain, but this kind of sustained and consistent mapping was rare in the types of
narratives we looked at. In connection with Carbonell's observation that certain distinctions
are always invariantly projected across domains, our results (ref. Chapter 9) suggest that
aspectual distinctions fall into this category and that aspect is an inherent characteristic
of events that is invariantly projected across domains.
Our system also accords with observations (Indurkhya 1992; Barnden et al. 1994)
that metaphoric projection often occurs from densely structured (familiar) domains to
sparsely structured (unfamiliar) domains. In our model, this simply translates to the fact
that familiar domains have many more x-schemas and inter-schema connections (more
dynamic knowledge) that can be exploited by metaphors.
While our metaphor reasoning system is quite similar to other knowledge-based
approaches, there are a few additional features of our model that basically stem from
paying greater attention to fine-grained dynamic characteristics of events and their role in
metaphoric projection.
1. First, our representation of actions and events with durations is more fine-grained
than other systems we are aware of. Specifically, we believe our system to be novel
in being able to model temporal and aspectual inferences across domains. As
we will see in Chapter 9, both these features are routinely exploited by newspaper
narratives of abstract events and policies.
2. Our use of a temporally extended Belief network to represent target domain
knowledge allows us to uniformly combine direct linguistic input and prior knowledge with
results of metaphoric projections in a single normative framework. Chapter 9 contains
examples where the system is able to generate different interpretations for the same
discourse segment when the interpreter has different priors or specific biases.
3. Projections about entities in the target can be made over multiple time steps (as in the
case of stumble). Metaphoric projections are able to assert information
about past states as well as bias future expectations in the temporally extended Belief
network. Chapter 9 shows how this enables us to interpret expressions such as back
on track.
4. We believe that important information about communicative intent and speaker
evaluation of different situations or policies is often metaphorically communicated, and
Chapter 9 shows some examples of our system's capability of making these inferences.
We feel this avenue can be productively explored using the mechanisms in our system.
5. While most approaches require extra resources to process novel expressions (requiring
some form of metaphor extension), our approach is able to explain why some novel
expressions can be processed without additional cost through inferential projection
from the source domain (as in the case of the "stumble" example).
following observation consistent with Gricean maxims of relevance and with the observed
data.
Make as many source domain automatic inferences as required to infer a new
event about the domain of discourse. In our system, this observation translates
into a rather simple rule. Continue simulating the concrete domain x-schemas
until an SMAP is activated at the current time step, i.e. a new event projection
is made onto the target domain.
While the actual validity of the above claim is yet to be rigorously tested, we have
found this to be an intuitively satisfying assumption to work with. In our system one role
of metaphor maps is to actually constrain source domain inferences. In other words, new
SMAPS curtail propagation. It easily follows from our design that in cases where much
of the source is mapped directly onto the target there is no advantage to going back to the
source domain for inference, and reifying concepts in the target domain would be a good
idea. However, in cases with a rich source domain and relatively sparse target domain (as
is the case with embodied metaphor maps) the ratio $\frac{SMAP}{EmbodiedConcepts} \ll 1$ and thus
it makes sense to make source domain inferences. So from our hypothesis, it follows that
the sparser the target domain knowledge, the more likely are long source domain inferential
chains.
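The stopping rule stated above can be illustrated with a minimal Python sketch. It is not the implemented system: the function name simulate_until_projection, the list standing in for successive x-schema simulation steps, and the dictionary standing in for the stored SMAPs are all assumptions introduced here for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def simulate_until_projection(
    source_steps: Iterable[str],
    smaps: Dict[str, str],
) -> Tuple[List[str], Optional[str]]:
    """Run source-domain inference steps until one activates an SMAP.

    source_steps: hypothetical stand-in for successive x-schema simulation steps.
    smaps: hypothetical table from source-domain events to target-domain events.
    Returns the source steps actually simulated and the projected target event
    (None if no SMAP was activated).
    """
    trace: List[str] = []
    for event in source_steps:
        trace.append(event)
        if event in smaps:
            # An SMAP fired: project onto the target and stop simulating.
            return trace, smaps[event]
    return trace, None


# Illustrative (made-up) inference chain and a single map Fall => Fail.
steps = ["stumble", "lose_balance", "fall", "get_up"]
trace, projected = simulate_until_projection(steps, {"fall": "fail"})
print(trace, projected)
```

With fewer stored maps the loop consumes more source-domain steps before it stops, which mirrors the claim that sparser target knowledge goes with longer source-domain inferential chains.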
NIPS workshop on the topic). But exact mappings await further research.
A possible alternate implementation approach that is being explored may offer a
better chance for the reduction to a connectionist level. The idea is to use an x-schema
representation for the target domain as well. There is currently an implementation of
the system that replaces the target domain belief net with an x-schema based inference
system similar to the one described in Chapter 6. In this system, target domain theories
are modeled as x-schemas as are metaphor maps. The difference between the two domain
theories is that the target domain is sparse while the source domain is densely connected.
The advantages are a uniform processing mechanism and translatability to a connectionist
implementation (ref. Chapter 4). The big disadvantage is that we don't yet know how to
compute the global impact of target domain f-struct updates (which is the beauty of the
Belief update algorithms).
Chapter 9
examples, we hope to give the reader an idea of the scope and range of information that
relies on the dynamic, compiled, and highly-responsive nature of x-schemas to provide cru-
cial inferences. Our results show the capability of our model to make these inferences in
real-time. While the results pertain to how information from embodied terms can be useful
in interpreting descriptions of events in the domain of international economics, we hope
that the generalization to abstract plans, goals, and intentions in any domain is obvious to
the reader.
Results are broken into two main sections. The first section deals with results
of running our program on the development set, which was the set of stories that drove
development of the program. Once we were satisfied that our program could handle the type
of event descriptions present in the development set, we decided to test the program on the test
set. Results on this test set can be found in the next section.
perceptual and motor control parameters, but with PMAP projections, each of these
perceptual and motor features can also become important descriptive features of events
in abstract domains. For instance, in Chapter 7, we already saw the use of the PMAP
which transferred distance from the source of travel (Dist(Src)) onto the domain of plans
and policies as indicating the degree of completion of a plan (Doc). In our example, we
used a seven point distance scale and translated it to a probability distribution over a
Bayes net variable with three values. In general, these projections often scalarize target
feature-values with decreased target domain resolution.
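As a rough illustration of how a seven point source scale might be reduced to a distribution over a three-valued target variable, consider the following sketch. The triangular weighting, the anchor points 1, 4 and 7, and the value names low, medium and high are assumptions made for this example; the text does not give the actual mapping used by the system.

```python
from typing import Dict

# Hypothetical anchor points on the seven point scale for each target value.
ANCHORS = {"low": 1, "medium": 4, "high": 7}


def project_distance(dist: int) -> Dict[str, float]:
    """Map a 1..7 distance-from-source reading onto a 3-valued distribution."""
    if not 1 <= dist <= 7:
        raise ValueError("distance must lie on the seven point scale 1..7")
    # Triangular weight: 1 at the anchor, falling to 0 three scale points away.
    weights = {v: max(0.0, 1.0 - abs(dist - a) / 3.0) for v, a in ANCHORS.items()}
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}


doc = project_distance(6)  # traveler is far from the source
print(doc)
```

Neighbouring scale points share probability mass between two target values, which is one way to picture the loss of resolution in the target domain.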
Other examples of motion parameters being projected onto the abstract domain
of policies include the implemented PMAPS which project speeds and rates onto progress
made in plan implementation. For instance, proceeding at the desired speed setting in
the domain of spatial motion is projected as making progress on-schedule in the target
domain. Similarly high speed of motion is projected as making good progress (better than
expected) in plan implementation, and moving slowly is projected as making little progress
(less than expected) in plan implementation. Moving backward is similarly projected as
undoing progress, and different paths to a destination are projected as different plans
with the same goal.
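The parameter projections just listed amount to a small lookup from source-domain motion features to target-domain progress values. The sketch below writes them out as a table; the feature names, the value label "undone", and the function project_motion are illustrative assumptions, not the system's actual data structures.

```python
from typing import Dict

# (source feature, source value) -> (target feature, target value),
# transcribed from the projections described in the text.
PMAPS = {
    ("rate", "setting"): ("progress", "on-schedule"),
    ("rate", "high"): ("progress", "good"),   # better than expected
    ("rate", "slow"): ("progress", "low"),    # less than expected
    ("direction", "backward"): ("progress", "undone"),  # hypothetical label
}


def project_motion(features: Dict[str, str]) -> Dict[str, str]:
    """Project whichever motion parameters have a stored PMAP onto the target."""
    target: Dict[str, str] = {}
    for key, value in features.items():
        mapped = PMAPS.get((key, value))
        if mapped is not None:
            target[mapped[0]] = mapped[1]
    return target


print(project_motion({"rate": "slow", "manner": "crawl"}))
```

Features without a stored map (here, manner) are simply not projected.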
In our examples, we were able to use PMAPs to map the following kinds of expres-
sions and modifiers that corresponded to the x-schema parameters mentioned above. Size
parameters were routinely found in expressions like giant steps (Story 21) (ref. Table 9.1),
large step, small steps (Story 21), great leap forward (including the Chinese Economic Re-
form), and the interesting inches and feet not miles comment (Story 5). Rate of motion
was also a frequently used feature with expressions like slow progress (Story 5), slowed
down (Story 4), sprint (Story 7), jog (Story 7), and long, painful slide (story 6). Other
high-frequency lexicalized items coding for rate and manner included crawl, leap, trod, plod,
slog (Story 21), lurch (story 10), slither (story 7), etc. Examples of distance related
parameters include expressions like almost there, long way to go, halfway there (Story 21),
and a little further (which were handled by the PMAPs dist(dest) ⇒ AmountLeft and
dist(src) ⇒ DOC). Force magnitudes and durations are also routinely projected as in grip
(Story 2), tear down (Story 4), hold back (Story 3). Specific counter-forces may cause injury
or harm to the traveler including robbing (Story 4). Force amounts are projected through
PMAPs linking them to the amount of influence or intensity, determination or duration
CHAPTER 9. RESULTS OF THE METAPHOR REASONING SYSTEM 225
of a specic plan or action. The area of force-dynamics and its projection onto abstract
domains is an ongoing area of research and current efforts and extensions are detailed in
the next section.
As far as we know ours is the only computational implementation capable of mod-
eling the transfer of such sensory-motor parameters from embodied to abstract domains.
Table 9.1 shows the program I/O for the frequently occurring phrase, Giant Steps (Story
21).
In the Giant Steps example, the input appears at time t = 1 , and the partial
interpretation (assumed) results in setting the feature values Domain = EconomicPolicy ,
Event = step and the step parameter step(size) = large . Note that all these values are
shown for time t = 1 .
The input marks the step x-schema (and the corresponding walk x-schema) to
be ongoing and the step-size parameter to be large (step(size = large)). The walk x-
schema then executes in the manner described in Chapter 6 with controller status
ongoing. This activates the SMAP described in Chapter 8, Move(forw) ⇒ Impl(pol),
with Status = ongoing. The corresponding SMAP projects an ongoing step to set evidence
for an ongoing policy implementation. The size parameter also activates the PMAP
Mp(step.size = large ⇒ progress = good) (better than expected), which is thus able to
assert that good progress is being made in the ongoing policy implementation. Thus the
f-struct in Table 9.1 shows the active metaphor maps described above.
The projections made by the metaphor maps onto the domain of economic poli-
cies are thus able to set evidence that at time t = 1, an ongoing policy is making good
progress. During the subsequent BELIEF UPDATE, the persistence links propagate
this information (absent other information to the contrary) backwards and forwards in the
Bayesian network, setting the context at the last time step and the future time step to be
about economic policies (hence the values t for this variable at times t = 0 and t = 2 in
Table 9.1). Persistence links carry the inference forward using the conditional probability
link $P(X_{t+1} \mid X_t)$, and backward using Bayes rule. Recall that applying Bayes rule
allows us to compute the posterior value of the variable of interest at time t − 1, given its
value at time t (as shown in Chapter 7). In all cases, we set the prior that progress was
good at time t = 0 to be .4 ($P(Progress = good_0) = 0.4$), and the conditional link
$P(Progress = good_1 \mid Progress = good_0) = 0.8$. With these values and Bayes rule,
we get the values of the target variables (shown in boldface for t = 0 and t = 2) in
Table 9.1. As a result of the operations described above, the system is able to infer that the
input giant steps, when uttered in the context of economic policies, indicates better than
expected progress in the implementation of an ongoing policy.
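The backward and forward steps can be checked by hand for a two-valued Progress variable. The sketch below uses the prior 0.4 and the persistence link 0.8 given above, but it also needs P(Progress = good at t+1 | Progress not good at t), which the text does not state; the value 0.2 used here is purely an illustrative assumption, so the printed numbers should not be read as the entries of Table 9.1.

```python
def forward(p_prev: float, p_stay: float, p_switch: float) -> float:
    """P(X_{t+1} = good), given P(X_t = good) = p_prev and the persistence link."""
    return p_stay * p_prev + p_switch * (1.0 - p_prev)


def backward(prior: float, p_stay: float, p_switch: float) -> float:
    """P(X_{t-1} = good | X_t = good) obtained with Bayes rule."""
    joint_good = p_stay * prior            # P(X_t = good, X_{t-1} = good)
    joint_bad = p_switch * (1.0 - prior)   # P(X_t = good, X_{t-1} != good)
    return joint_good / (joint_good + joint_bad)


PRIOR_GOOD_0 = 0.4   # from the text
P_STAY = 0.8         # P(good_1 | good_0), from the text
P_SWITCH = 0.2       # P(good_1 | not good_0): assumed, not given in the text

# Evidence: progress is good at t = 1 (set by the metaphoric projection).
posterior_t0 = backward(PRIOR_GOOD_0, P_STAY, P_SWITCH)
predicted_t2 = forward(1.0, P_STAY, P_SWITCH)
print(posterior_t0, predicted_t2)
```

Observing good progress at t = 1 raises the belief at t = 0 above its prior and carries forward to t = 2 through the same link, which is the qualitative behavior the persistence links are described as providing.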
include the focus on the consequent state that is signaled by the use of the perfect aspect
(Chapter 5) such as have robbed (story 4), has been lurching forward (story 10), has walked
(story 11), has sidestepped (story 14), and has been now taken ill (story 11). We could also nicely
model several other high frequency aspectual expressions such as start to pullout (story 12),
on the verge of (story 10), still trying to climb out (story 15), and metaphoric expressions
of aspect such as set out (story 2) (ref. Appendix B for the full I/O behavior), remain
stuck (story 12), on-track (ref. Table 9.2) and the interesting phrase back-on-track (Story
22) (Table 9.3). Another interesting case was the phrase stumble to the brink (Story 17),
where the controller graph distinction between the state of a schema being ready to ex-
ecute versus actually starting to execute was crucial to making the right inferences. In summary,
almost every event description had an aspectual component, and so we believe attention to
the details of the semantics of Verbal Aspect is essential even to interpret the simplest of
event phrases and distinctions. We believe our model is unique in integrating the semantics
of aspect with metaphoric interpretation.
While we have seen many examples of how our model interprets aspectual ex-
pressions, consider the interesting metaphoric case of on-track (from Liberalization Policy
on-track). Table 9.2 shows the results of the metaphor reasoning system on this input.
Here, the event on-track is coded as a traveler moving forward (toward destination)
at the specified (desired) rate (rate = setting). A familiar SMAP projects moving
onto the domain of Economic Policies as implementing the policy, with the status as
ongoing (since motion is ongoing in the source). More interestingly, note the PMAP
Mp(rate = setting ⇒ Progress = on-schedule); together, these maps allow for the interpre-
tation of "on-track" as an ongoing policy being implemented on-schedule.
For an even more striking argument for the x-schema based architecture, consider
the semantics of Policy back on track. Table 9.3 shows the result of the system processing
this input. The interesting fact is that here one has to assert two previous states, i.e. if
the input was at time t, the traveler was not moving forward at the desired rate (not
on-track) at the previous relevant stage (t − 1), but was moving forward at the desired rate
(on-track) at the stage previous to that one (t − 2). Projecting this onto the target, using
the maps described earlier, the system is able to infer that if the input is at time t, then at
a time immediately prior (t − 1), the policy was not on-schedule, but at a time prior
to this (t − 2) the policy status was ongoing and the policy was on-schedule. We
Table 9.2: Processing the Example on track. Relevant target variables with Posterior
Probabilities above Threshold (.6) are shown.
FEATURE V(t0) V(t1) V(t1+)
INPUT FEATURES
Domain = Economic Policy t t t
Event = on-track t
ACTIVE MAPS
Ms(move(forw) ⇒ Implement-policy) A A
Mp(rate = setting ⇒ Progress = on-schedule) A A
TARGET (OUTPUT) FEATURES
Action = Implement(Policy) t(.8) t t(.8)
Policy Status = Ongoing t(.7) t t(.8)
Outcome = success t(.7) t t(.9)
Progress = on-schedule t(.7) t t(.9)
believe that the dynamic semantics provided by x-schemas was absolutely crucial to be
able to successfully model this example.2 As a final example of how Aspectual inferences
are absolutely crucial to interpret even such simple narratives, consider the phrase on the
verge of falling back into recession (Story 10), which nicely exercised the temporal aspect of
back discussed above, as well as the controller distinction between enabled, ready, and start
required for on the verge of.
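The temporal content of back described for the back on track case can be written down directly: an event asserted at time t also asserts its negation at t − 1 and the event again at t − 2. The following sketch does only that and then applies the on-track projection onto the policy domain; the names back and project_on_track and the feature keys are hypothetical.

```python
from typing import Dict


def back(t: int) -> Dict[int, bool]:
    """Truth of the embedded event at t-2, t-1 and t implied by 'back' at time t."""
    return {t - 2: True, t - 1: False, t: True}


def project_on_track(source: Dict[int, bool]) -> Dict[int, Dict[str, bool]]:
    """Project on-track (moving forward at the desired rate) onto the policy domain."""
    return {
        time: {"status_ongoing": holds, "progress_on_schedule": holds}
        for time, holds in source.items()
    }


timeline = project_on_track(back(2))
for time in sorted(timeline):
    print(time, timeline[time])
```

Three time slices are the minimum needed to hold these assertions, in line with the note accompanying Table 9.3.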
Table 9.3: Results of processing Back On Track. Note that there are at least three time
slices needed to model this utterance.
FEATURE V(t0) V(t1) V(t1+) V(t2)
INPUT FEATURES
Domain = Economic Policy t t t t
Event = back(on-track) t
ACTIVE MAPS
Ms(move(forw) ⇒ Implement(policy)) A A
Mp(rate = setting ⇒ Progress = on-schedule) A A A A
TARGET (OUTPUT) FEATURES
Action = implement(policy) t f(.9) t t(.8)
Policy Status = ongoing t f(.9) t t(.8)
Outcome = success t(.8) f(.9) t t(.9)
Progress = on-schedule t(.8) f(.9) t t(.9)
strategy development and problem solving may well be common across different domains,
but they are not reflex, real-time processes, whereas we found that statements about chang-
ing goals, resources and intentions often are available in real-time, especially when they are
asserted using embodied terms.
In Chapter 6, we saw how our x-schema executions can be enabled, disabled,
terminated, or otherwise modified by the assertion and retraction of goals and energy levels.
Moreover in Chapter 3, we showed how sensory-motor representations must inherently be
highly responsive to changing goal and resource needs for survival.
With embodied metaphor maps, narratives are able to exploit this feature of
sensory-motor representations to assert changing goals and resources. Consider the stum-
ble example (from Story 1) painfully detailed in Chapter 8. What stumble codes for is
the presence of a difficulty interrupting a plan, leading perhaps to the thwarting of a pol-
icy goal. Similarly, while we saw items coding for rate of progress or regress in the plan,
embodied verbs and event descriptions also compactly code the energy levels as a resource.
For instance slog (BNC story 5), anemic (Story 10) (ref. Table 9.6), sluggish (Story 19) or
stagger to their feet, battered and bloodied (Story 16) crucially communicate the inherent
lack of stimulating conditions, resources, level of interest, attention or motivation. Similarly
tearing barriers (Story 4) or lightening burdens (Story 5) are able to assert conditions where
an impediment to goal achievement has now been removed. Compare this to the expres-
sion go around or sidestep where the difficulty of potential failure is still present. Similarly
slippery slopes, slipperiest stones (Story 19), slide into recessions (Story 21), get projected
through SMAPS as the possible thwarting of goals due to unanticipated circumstances.
Falling is crucially important in this regard. In all the cases where a country was described
as falling into recession, we never saw a case in which the country's administration was
directly blamed as being able to control the downturn, a fact directly project-able from the
fact that falling is not controllable (an obvious and easy inference about fall). This is shown
in the example below.
For the output (thresholded at P > 0.6) refer to Figure 9.1. Note especially the projection
from the uncontrollable and unintentional nature of falling onto the domain of Economic
states as the recession being caused by external factors beyond direct control or influence
of the specific policy being pursued (¬control(loc) ⇒ ¬control(Ec.State)). Of course,
this kind of information is often useful to assign responsibility or blame to various agents
in the described scenario. For instance, with fall into recession, the speaker is inherently
indicating that he does not hold any specific administration or policy to blame (probably
a normal business cycle recession or some external causative factor). Contrast this to the
choice of walk into recession, where the speaker is quite likely assigning some responsibility
to a failed policy. In fact, quite often intentional aspects of embodied terms get transferred
onto more abstract domains as causative factors. While the explicit causal connection is
not modeled, even our prototype is able to model some issues of controllability as shown in
Figure 9.1. 3
Thus, while we certainly do not claim that our model is able to account for the rich
range and extent to which narratives are about goal changes and dynamic resource levels,
we note that in many cases, use of embodied terms and relying on metaphoric projection
3
In cases where the results are displayed as graphs, the x-axis represents the time step of interest (0 is
the prior, 1 - 3 are the result of processing the input at that time step, and 4 is the predicted future
state of the variable in question). In the case of a graph, the shading of the relevant cell indicates the degree to
which the variable is believed to be true: the darker the shading, the higher the degree of belief.
[Figure 9.1 graph; row labels: DOMAIN = EC.STATE, SCH: FALL, RECESSION, CONTROL(REC.)]
Figure 9.1: Simple inferences are transferred from embodied to abstract domains for the
input "Brazil fell into recession". Of specific interest is the spatial inference that falling
into a hole results in being there; this results in setting evidence for the target domain node
that Brazil is in recession at the end of the period specified by the input. Also interesting is
the inference that the recession was probably unanticipated and uncontrollable, an inference
from falling. No such inference is intended or available from processing Germany has walked
into recession (Story 11).
of the quick, compiled simulative reasoning products of x-schema execution may be the
best way to specify these complex and abstract notions of goals, resources, and controlling
and causative factors. In any case, we fail to see how a semantics that is not inherently
dynamic could account for this range of phenomena.
Table 9.4: The Metaphor Reasoning System's response to the input reorient policy. Em-
bodied terms often convey information about monitoring and control of complex plans.
FEATURE V(t0) V(t1) V(t1+)
INPUT FEATURES
Domain = Economic Policy t t t
Event = reorient t
ACTIVE MAPS
Ms(check(dist to goal) ⇒ monitor(DOC)) A
Ms(check(rate) ⇒ monitor(progress)) A
Ms(adjust(motion) ⇒ adjust(policy)) A
Mp((rate = slow) ⇒ (progress = low)) A
TARGET (OUTPUT) FEATURES
Monitor(DOC) t(.8) t t(.8)
Monitor(progress) t(.8) t t(.8)
Adjust(policy) t
Status(new policy) = READY t t(.3)
Status(new policy) = START f t(.7)
is ready (a result of reorienting) and will most likely be implemented soon (shown as a
predicted start at the next time step (the next temporal slice of the target network)).
readiness to remove such a future difficulty. The complete I/O for this story can be found
in Appendix B.4
Table 9.5: System's response to Story 3 (ref. Appendix B) showing the impact of
bold in the phrase boldly set out. For full I/O behavior on this story, please see .
Feature V(t0) V(t1) V(t2) V(t3) V(t3+)
Mp: obstacle ⇒ difficulty A A
Ms5: deal-with-obstacle ⇒ remove-diff A A A
Remove-diff(status) = READY t t t t
Goal = FT ^ Dereg t t t
Difficulty t t(.7)
Example 17
1. Government loosened strangle-hold on business.
2. Government deregulated business.
Both sentences communicate the same fact in the domain of economics, namely
the situation corresponding to business deregulation. Our implemented system is able to
conclude that only in the first case does the speaker intend to communicate the threatening
nature of Government intervention, its highly systemic and deleterious consequences, as well
as his evaluation of the inherent cruelty of high regulation, something very aptly encoded
4
Another case (the last example with reorient was yet another) where linguistic devices are able to exploit
the controller distinctions between ready and start, a fine-grained control distinction that is useful for
motor control but is proving quite indispensable for language.
in Table 9.6. The concrete domain meaning of anemic in this example was set to be a low
energy recovery (energy = low).5 The SMAP Ms(Recover ⇒ Change state(neg pos))
maps the process of recovery onto the domain of economics as the change of state from a
negative to a positive value. Similarly the PMAPs Mp(energy = low ⇒ ¬stable) and
Mp(energy = low ⇒ growth rate = low) map the non-energetic nature of the recovery
to assert an unstable economy with a low growth rate. With these maps and the input
shown, we get the target domain f-struct shown in Table 9.6. In the example, we note
that the instability of the economy keeps the predicted future values of the Economic state
variable unchanged from the current one, showing unpredictability of the current situation.6
The system was also tested with the following novel example.
[Figure 9.2 graph; node labels: multiple plans, choose(plan)]
Figure 9.2: Novel expressions may use conventional mappings allowing the interpreting
agent to correctly infer the intended meaning in contexts where it has never been encoun-
tered before. Here the concrete domain meaning of crossroads (multiple paths) maps to the
abstract domain of economic policy through the event structure metaphor.
(multiple possible paths) and the event structure metaphor mapping conceptual features
from the domain of motions to abstract actions allows for a reasonable interpretation.
The relevant results of processing this example are shown in Figure 9.2. Crucially,
given the stored maps (Paths ⇒ Plans and Choose(path) ⇒ Choose(Plan)) and the concrete
domain meaning of crossroads (multiple paths to choose from), the system is able to conclude:
Thus the interpreting agent correctly infers the intended meaning of a source do-
main concept in contexts where it has never been encountered before. Here the concrete
domain meaning of crossroads (multiple paths) maps to the abstract domain of economic
policy through the event structure metaphor. Note that the source domain features remain
activated until a metaphoric inference can be made. The target markings remain persis-
tently active.
Other examples of novel expressions correctly interpreted by our program include
roadblocks (barrier to progress was correctly interpreted as difficulty through Obstacles ⇒
Difficulty) (story 3), anemic recovery (Story 10) (ref. Table 9.6), lurching forward
(story 10), long, painful slide (story 7), treading on toes (story 7), and the beautiful
stumble over rocky relationship (Story 17).
[Figure 9.3 graph; row labels: DOMAIN = EC.STATE, DOMAIN = EC.POLICY, ECONOMY => PERSON, SIZE -> GDP, SIZE -> EC.POWER, EC.-STATE = POS, SCH = FALL(SICK), SICK, COUNTRY = GERMANY, COUNTRY = FRANCE, GOAL = EC.STATE(POS); columns: T=0, T=1, T=2, T=2++]
Figure 9.3: Processing the input European Giant falls sick. Note that the OMAP Eco-
nomic Actors ⇒ People, coupled with the PMAP that projects size onto GDP in the
Economic context, results in asserting evidence in the target that identifies Germany as
the country with the largest GDP and thus as the referent of the subject of the input
sentence. Note also that France has some posterior probability of being the referent as well.
[Figure 9.4 diagram; labels: Stocks, down, V:Q, less, healthy, H:Q, more]
Figure 9.4: Multiple source domains can be serially used as long as they are coherent in
the target domain. Here the first input Stocks down activates the Less IS Down metaphor,
while the second input Healthy again activates the More IS Healthy metaphor.
Example 21 World Bank prescribed Structural Adjustment Program (SAP) bleeding In-
dian Economy. (ref. # Report of Int. Womens Conf. Beijing, 1995) (Story 2)
Prescription, and the SMAP Cure ⇒ Policy making the World Bank appear as a
doctor and the Indian Economy as a patient, and the SAP program as a treatment. The
result is one of mistaken therapy shown in Figure 9.5. A well-intentioned cure is seen as
failing (hence the outcome variable is failure). The expectation is that the good doctor
(World Bank) would remedy the situation upon receiving this information and discontinue
the policy (this is not modeled).
[Figure 9.5 graph; row labels: DOMAIN = EC.POLICY, POLICY-TYPE = SAP, POL. MAKER = WB, MAKER -> DOCTOR, EC.-STATE = NEG., ECONOMY -> PATIENT, OUTCOME = FAIL, SP-EVAL(WB) = NEUTRAL, SP-EVAL(SAP) = MISTAKE, EXIST(IEC) = FALSE; columns: T=0, T=1, T=2, T=2++]
Figure 9.5: The results of interpreting the input corresponding to the sentence "World Bank
SAP bleeding Indian Economy". Here, the prior belief of the interpreter activates the cure
x-schema, and the target domain inference is one of mistaken therapy, where the cure
doesn't work. Also note that in the domain of health and well being, the bleeding activated
slow dying. Note that the future continuation of the policy is uncertain, as is the fate of the
Indian economy.
Policy. This favors viewing World Bank as the harmful doctor rather than as a competent
doctor. This triggers the bad-medicine x-schema as opposed to the cure x-schema. So in
this case, World Bank is the harmful doctor, Indian Economy is the bleeding patient, and
the SAP prescription is a harsh and harmful prescription of the World Bank. The results
of this setting can be found in Figure 9.6. Compare this to the earlier case of evaluating
the policy as a genuine mistake and the World Bank as benign.
[Figure 9.6 graph; row labels: DOMAIN = EC.POLICY, POLICY-TYPE = SAP, POL. MAKER = WB, MAKER -> HARMER, EC.-STATE = NEG., ECONOMY -> PATIENT, SCH = BAD-MED(WB,SAP,IEC), OUTCOME = SUCCESS, SP-EVAL(WB) = NEG++, SP-EVAL(SAP) = CRUEL, EXIST(IEC) = FALSE; columns: T=0, T=1, T=2, T=2++]
Figure 9.6: Processing the input "World Bank SAP bleeding Indian Economy" with a
different prior. Here, the prior belief of the interpreter activates the bad-medicine schema,
and the target domain inference is one of harmful neglect, with bad medicine. Also
note that in the domain of health and well being, the bleeding enabled dying, which
translates to the increased possibility of extremely deleterious consequences (non-existence) to
the Indian Economy. Also note that the Policy Maker is viewed as a cruel blood-letting
harmful doctor and the policy is viewed as invasive.
In the third case, we set the Sp-eval(WB) variable to be pos++ with respect
to World Bank (signifying prior knowledge that the speaker was positive about World
Bank intervention) and this makes the World Bank appear as a doctor and the Indian
Economy as a patient, and the SAP program as a treatment . The result is one where
partial blood letting is part of the curing process leading to a partial cure shown in Figure 9.7.
Notice the absence of any malicious intent being ascribed to the World Bank in this case,
and the evaluation of the policy as required and also the existence of the Indian Economy
as never threatened by the policy.
[Figure 9.7 graph; row labels: DOMAIN = EC.POLICY, POLICY-TYPE = SAP, POL. MAKER = WB, MAKER -> DOCTOR, EC.-STATE = NEG., ECONOMY -> PATIENT, OUTCOME = SUCCESS, SP-EVAL(WB) = POS++, SP-EVAL(SAP) = REQUIRED, EXIST(IEC) = FALSE; columns: T=0, T=1, T=2, T=2++]
Figure 9.7: The results of interpreting the input corresponding to the sentence "World Bank
SAP bleeding Indian Economy". Here, the prior belief of the interpreter activates the cure
x-schema, and the target domain inference is one of partial cure, where the cure does
work and is required. Note that in this case, the speaker expects the outcome of the curing
to be successful and the cure to be ongoing, with no indication of deleterious consequences for
the Indian Economy, unlike the other cases.
One crucial difference in the three cases is that in the first case, the outcome of
a mistaken policy is asserted as unsuccessful, in the second case the outcome of a cruel
and uncaring policy is asserted as successful, and in the third case an ongoing policy is
asserted as partially successful leading to future success.
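The dependence of the interpretation on the prior can be summarized as a selection among source-domain schemas. The table in the sketch below is read off Figures 9.5 to 9.7; the dictionary layout, the key names, and the function interpret are illustrative assumptions, and the implemented system arrives at these values through x-schema execution and belief update rather than by lookup.

```python
from typing import Dict

# Prior speaker evaluation of the World Bank -> schema invoked and key outputs,
# as listed in the row labels of Figures 9.5, 9.6 and 9.7.
CASES: Dict[str, Dict[str, str]] = {
    "neutral": {"schema": "cure", "maker": "doctor",
                "outcome": "fail", "sp_eval_sap": "mistake"},
    "neg++": {"schema": "bad-medicine", "maker": "harmer",
              "outcome": "success", "sp_eval_sap": "cruel"},
    "pos++": {"schema": "cure", "maker": "doctor",
              "outcome": "success", "sp_eval_sap": "required"},
}


def interpret(prior_eval_wb: str) -> Dict[str, str]:
    """Return the (tabulated) reading of the 'bleeding' input for a given prior."""
    return dict(CASES[prior_eval_wb])


for prior in CASES:
    print(prior, interpret(prior))
```

The same input sentence yields three different readings, and the entry that changes first is the schema invoked in the source domain.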
Thus, as the three cases above show, we are able to model how changes in prior
evaluation of a situation can be used to compute what the meaning of an utterance is.
Crucially, as in most of these cases, the difference seems to be in which source domain
schema gets invoked, and the resulting inferences. Of course, in all cases described above,
the Ut-type variable is changed from description to opinion or propaganda (cases a
and c), suggesting that this sentence is highly judgemental and belongs in an editorial or
syndicated column as opposed to the reported news.
It is somewhat interesting that even our simplistic model is able to detect these
rather subtle differences in speaker intent and communicative goals. We believe the choice
of the embodied x-schema is often a compact and efficient way to encode such information.
Conversely, the unconscious choice by a speaker of an embodied term can give the hearer
significant clues as to the prior belief and intent of the speaker, obviously something that
needs far more exploration.
the lexicon (such as the name of a new actor in the domain of international policies). All
such additions are documented in the synopsis under the title Additions, which appears
immediately below the story in question. For cases where no modification was necessary,
this entry is left blank. In the body of the brief synopsis, we attempt to briefly describe
the inferences obtained through metaphoric projection of the embodied terms. Whenever
there was a significant difficulty due to the syntactic or semantic structure of the narrative
in question, we describe any input simplifications made. Following the synopsis, just as
in the case of the development database, we show the complete I/O behavior for a few
representative stories (stories 1, 2, and 3). The complete I/O behavior for these stories
can be found in the Appendix (ref. Chapter 13).
Some test stories pointed to specific deficiencies in the implementation, while re-
maining compatible with the overall model. These mostly included new aspects of source
domains or new kinds of metaphors that are possible extensions to the system described
here. Section 9.2 describes some of these required extensions in greater detail.
1. Story:
Additions: Added that Zaire was a country and the Actor was the Government of
Zaire. Also target variables in power(gov).
Inferences: Full I/O behavior is shown in Table 13.1 (ref. Appendix C). The main
inferences here are aspectual: on the brink suggests that the Government is about
to fail (Fall ⇒ Fail, Status = READY) but has not yet failed. Our controller
abstraction and our hypothesis of aspectual inferences being invariantly projected were
once again supported by this example.
2. Story:
Protection is a mistaken therapy of prescribing palliatives to the economy in
response to painful change and the perception of injury.(CNN transcripts,
1995).
Additions:
We translated the enabling condition from the perception of injury to asserting injury
since our model has no way to account for beliefs about other people's beliefs etc. The
palliative medicine x-schema already existed, so the rest of the example was relatively
easy to process. Note that in response to palliative medicine, the true symptoms
remain while the pain may be alleviated for a short while. 11
Inferences: Figure 13.1 (Chapter 11) shows the output for this test example. Here
the inferences were mainly from the health and well being domain, where the x-
schema corresponding to the mistaken therapy and prescription of palliatives was
able to conclude that while the economy looked good temporarily, the underlying
symptoms continued to exist after the palliative was prescribed, and that the net
effect was detrimental in the long run, including the speaker's negative evaluation of
Protectionist policies.
3. Story:
10
Thanks to Collin Baker.
11
The input to the system was the f-struct (Input (Context EconomicPolicy)(Policy Protection)
(Event Palliative(Prot, EC)) (ENABLE(Pain, Event1)) (ENABLE(Perception(Injury)), Event1) (Ut-type
Description))
The U.S. economy continues to struggle along showing fresh signs of weakness
almost daily. (BNC Corpus)
Additions:
Inferences: Table 13.2 (Chapter 13) shows the full I/O behavior for this example.
The basic inferences include the aspectual one of ongoing weakness and struggle, and
that of current and anticipated difficulty.
4. Story:
HEADLINE: THE U.S. ECONOMY BEGINS TO CRAWL FROM THE
MIRE (National Review, 3/21/94).
Additions:
Inferences: Here the inferences were mainly aspectual, allowing the system to assert
a slow economic recovery whose status was ongoing, and one that had a low degree
of completion (DOC = low+) at the current time. Also, interestingly, "begin" allows
the system to make the inference that at the last time step (t = 0), the US Economy
was ready to emerge from recession, but the Economic State was still negative.
5. Story:
Economy moving along at the pace of a Clinton jog. (WSJ, May 2, 1997)
Additions: Yes, we were forced to assign a value to the pace of a Clinton jog (we used
rate = 2 on a seven-point scale).
Inferences: Once we were presumptuous enough to code the slow rate of a Clinton
jog, the rest of the inferential process was easy for the system, which asserted an economy
making very slow progress. Of course the inferences were completely humorless, a
deficiency we suspect will not be addressed in the foreseeable future.
6. Story:
HEADLINE: Indian Government "turns its back" on Protectionism (The
New York Times, July 11, 1993)
Additions:
Inferences: Turning your back involves facing the other way, resulting in a
reversal of the direction of motion. This change of direction corresponds to changing the
policy in question. This is done in our system by activating the familiar SMAP
Change direction ⇒ Change policy. In the target domain we have the knowledge
that Protectionism is an inward-oriented policy. Changing the policy from
protectionist to non-protectionist changes this variable from inward oriented
to outward oriented. This, in turn, changes the posterior distribution of the policy
variable at the current time step so that the outward-oriented values (Liberalization,
Export-push, etc.) receive more of the probability mass:
$P(\text{policy} = \text{Lib.}_0 \mid \text{outward orientation}_0) > P(\text{policy} = \text{Lib.}_0 \mid \text{inward orientation}_0)$.
This relies, of course, on the economic knowledge that Liberalization and Protectionism
are linked through inward and outward orientation. Hence the economy now becomes outward
oriented, and all outward-oriented policies become active.
7. Story:
HEADLINE: BAD MEDICINE. Nafta took some time swallowing and has
failed to cure the Economic ills of Mexico. (LA Times, May 1996)
Additions: We had to add NAFTA as an Economic Policy variable (Policy Maker US),
affecting Mexico.
Inferences: Interestingly, the headline named the appropriate x-schema directly! The
system inferred that the policy implementation took time and had difficulty in passing, and
that the current outcome of the policy is one of failure.
8. Story:
HEADLINE: Protectionism Is Our Most Important Product The United
States plunged down the slippery slope of free trade. (The New York Times,
July 11, 1993)
Additions:
Inferences: The system was able to infer that the US Government was implementing
a Liberalization Policy which in the speaker's evaluation was highly negative
(Sp-Eval(Pol) = neg++), and that the Economic State of the country was changing
rapidly to a highly negative state (from "plunging down", at SMAP move(down)).
Additions: The system currently has no way to interpret the modal "should", so the
input given was that Sp-Eval(US) was positive in the case of leap, and negative
in the case of crawl. This resulted in two inputs, one corresponding to "US leaps" with
Sp-Eval positive and the other to "crawl" with Sp-Eval negative.
Inferences Generated: The inferences suggested that, in the opinion of the Speaker (Walesa),
the US should progress substantially (progress = good) in implementing its ongoing
policy, rather than the East European policy (implement(policy), Status = ongoing,
progress(t0) = slow).
10. Story:
HEADLINE: New EU Members Cool Heels As Union Debates Voting
BYLINE: Howard LaFranchi, Staff writer of The Christian Science Monitor.
After negotiating to enter the EU, Austria, Finland, Norway, and Sweden
have arrived on the doorstep only to find that those who asked them in are
embroiled in a quarrel over the conditions they have to settle before they
can open the door.
Now to the guests' further dismay, the arguing has degenerated into a "crisis"
grave enough to leave the four standing on the doorstep for some time
to come. In fact, the whole issue would have been put off until then if the
EU were not forced to make a decision by the knocking at the door of the
first four.
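The posterior shift described for story 6 above (turning one's back on Protectionism) can be illustrated with a small sketch. It is not the implemented belief network: the conditional probability table, its numbers, and the function name `policy_posterior` are assumptions chosen only to show how moving belief from inward to outward orientation moves probability mass toward Liberalization.

```python
# Hypothetical conditional probability table P(policy | orientation).
# The numbers are illustrative assumptions, not values from the implemented system.
CPT = {
    "inward": {"Protectionism": 0.7, "Liberalization": 0.2, "Export-push": 0.1},
    "outward": {"Protectionism": 0.1, "Liberalization": 0.5, "Export-push": 0.4},
}


def policy_posterior(orientation_belief):
    """Marginalize the policy variable over a belief about orientation."""
    posterior = {}
    for orientation, weight in orientation_belief.items():
        for policy, prob in CPT[orientation].items():
            posterior[policy] = posterior.get(policy, 0.0) + weight * prob
    return posterior


# Belief before and after the "change direction => change policy" projection.
before = policy_posterior({"inward": 0.9, "outward": 0.1})
after = policy_posterior({"inward": 0.1, "outward": 0.9})

for policy in CPT["inward"]:
    print(f"{policy:15s} before={before[policy]:.2f} after={after[policy]:.2f}")
```

In this toy table the outward-oriented policies gain mass after the shift while Protectionism loses it, which is the qualitative behavior the story 6 inference relies on.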
more abstract domains such as economics and politics. This allows non-experts to comprehend
and reason about such abstract policies and actions in terms of more universal and
commonplace concepts. Our results also provide evidence for our hypothesis that the familiar
and essential domain of spatial motion is encoded as highly accessible compiled knowledge
that is required for action monitoring and failure recovery and is also used for fast, parallel,
real-time reflex inference in interpretation. While we believe that our results confirm that
we have taken a step in the right direction, there are several challenges to the system as it
stands. Some of these issues are discussed in Chapter 10.
Chapter 10
Conclusion
This thesis was an exercise in theoretical cognitive science, where we attempted
to demonstrate through computer modeling the plausibility of the claim that the semantics
of motion and manipulation phrases is grounded in fine-grained, active representations. We
tried to provide evidence for this hypothesis by showing that the dynamicity and
reactivity provided by such representations are often exploited by the use of motion terms
and expressions in discourse about abstract plans and processes.
Specifically, we took a small but interesting segment of ordinary discourse, namely
the use of familiar motion and manipulation terms in complex event and policy descriptions.
We built a computational model that demonstrates how the high degree of context-sensitivity
inherent in the representation is routinely utilized to specify control, monitoring,
resource and goal-based information about complex plans and processes that operate in
an uncertain and dynamic environment. Our model shows the capability of making these
discourse inferences in real-time, consistent with the fact that such information is available
as reflex, automatic inference in narrative understanding.
While these initial results are encouraging, full validation awaits demonstration
of the approach on a wider range of language problems. Particularly pressing are issues
pertaining to integration with parsing, modeling image-schemas and image-schema
transformations, and the ability to learn and extend metaphoric mappings. We are making
progress on some of these issues, and a summary of ongoing efforts is given below.
10.6 Epilogue
This thesis is a small step along a long path that attempts to construct theories
of language rooted in perception, action and in the computational properties of the brain.
While there is obviously a long way to go, and the terrain ahead is uncertain with many
stumbling blocks, I defer any questions about the direction to Alan Turing's intuitions on
the matter (thanks to Jerry Feldman for the quote).
Of (... many) possible fields the learning of languages would be the most
impressive, since it is the most human of these activities. This field seems however
to depend rather too much on sense organs and locomotion to be feasible.
-- Alan Turing, 1948.
Chapter 11
the industrial giant, to get Government back within its means and lighten
our punitive tax burden. (Reagan 1984 acceptance speech)
6. Japan began its long, painful slide towards recession.
7. While others sprint or jog to the information super-highway, Europe is slith-
ering. Governments must step out of the way, or intervene by removing
obstacles not impede progress by attempting to build it. (Economist, 1995)
8. There is a price to pay for changing course but it is less than the price of
plunging ahead carelessly.(Michael Mendelbaum /PBSMcNeil Lehrer Dec.
11, 96).
9. HEADLINE: Vienna stumbles in waltz to join European Union
AUSTRIA, once expected to waltz smoothly into the European Union, is el-
bowing its partners, treading on toes and pogo-dancing in a most un-Viennese
manner.
10. HEADLINE: U.S. ECONOMY MAY TIP BACK INTO RECESSION
The U.S. economy may be on the verge of falling back into recession after
more than a year of half-hearted recovery. It has been lurching forward but
the anemic recovery.. (press association newsfile, Dec. 17, 1992)
11. HEADLINE: news analysis: european economic giant falls sick
BYLINE: by xia zhimian
DATELINE: bonn, december 17 ITEM NO: 1217071
After 10 years of steady flourishing, the european economic giant has now
been taken ill. Some say that germany has walked into recession, and others
argue that it has not yet entered that unwelcome state.
12. ... major continental European economies start to pull out of recession,
academic economists claimed today. However Germany will remain stuck
in recession leading to a "two speed" recovery in Europe.
13. France has not yet signed the treaty (NPT), but at least they are taking
a cautious step in the right direction toward world peace, safety and san-
ity.(Houston Chronicle, 4/17/92)
14. Buoyed by hefty doses of foreign investment and driven by reform-minded
governments, Asia has sidestepped the crippling debt crisis that cost Latin
America and Africa at least a decade of lost development. (by Ramon
Isberto DATELINE: MANILA, Sept. 3,1993)
15. The defense buildup, along with the deep tax cuts in '81, helped dig the
fiscal hole that the United States is still trying to climb out of - LA Times
Editorial (Aug. 19, 1993).
CHAPTER 11. APPENDIX A: DATABASES AND PROGRAM PARAMETERS 261
16. As the United States and the UK stagger to their feet, continental Eu-
rope and Japan are still battered and bloodied by recession, forcing tens of
thousands of redundancies, layoffs and production ... (Copyright 1993
Information Access Company Copyright National Review Inc. 1993 National
Review)
17. HEADLINE: ONCE MORE U.S. STUMBLES TO THE BRINK WITH
NORTH KOREA (C. Krauthammer) HEADLINE: UNITED STATES, UNITED
NATIONS STUMBLE OVER ROCKY RELATIONSHIP
18. HEADLINE: "Reluctant Giant": Germany "Balks" At "Leading" Euro-
peans To Victory.
SECOND HEADLINE: An Economic "Colossus" Seems Politically "Impo-
tent"
Germany's shoulders, it turns out are not very broad.
19. Economic reform in China is like "Crossing a river by feeling for the stones"
(described by an official) Among the firm footings the economy has found,
the slipperiest stone was inflation. (nyt 02/08/93)
20. Flinching over Washington's Hand in state wallets (H) Government's habit
of striking its long arms into the state's wallets bound to squeeze state and
local budgets and only intensify the strains that have developed between
Washington and the local governments.(nyt02/08/93)
21. The following expressions were all found in a single Sunday issue of the New York
Times (02/08/93).
Shoulder (responsibilities, burden)
Toe the line
(Cautious, small, large)steps
Measured steps
Economy floundering
US turning its back (on Bosnia)
US reorients policy
Uphold
Upright image
Veered in different directions
Close ties
Blind spots
Aging party
Far closer to the goal
Laid the groundwork
Stable and as nurturing Government
Speed up the negotiations
Slow down the talks
Fleeting second
Stand-off
Moved toward (recovery, adjournment, recess)
UN would support steps to ...
Like-minded countries
Political will
Political muscle
Germany throws its weight
Cripple food production
Washington is determined to press ahead
..Proposing force to prevent STRANGULATION of Sarajevo
Difficulties lie ahead
Economy continues to slog along
slash, cut.
Grasp x-schema
grab, grip, pinch, flinch, squeeze
dig(hole) x-schema
Move x-schema
slow down
Change(dir)
turn(around)
==========Energy=============================
Replenish Stimulate Drain
============ Monitoring=======================
detect_obstacle(obstacle)
check(dist) check(rate)Monitor(dir)
Test_footing Test_ground Monitor_posture
=============Obstacle Avoidance =============
Deal_with_obstacle Approach_obstacle(rate, dist)
Apply_force
Smash
Go_around
Jump over
=========Force Dynamics=====================
Fight/resist(counterforce or injurer)
Shed/remove/lighten(burden)
Apply_force(dir), Apply_pressure(dir)
Remove_obstacle Impose_obstacle
Reduce_obstacle Injure Strangle (injure x-schema)
Strangle
Loosen Strangle-hold (injure x-schema)
The other set of schemas pertains to the Health and Well Being Domain. The
implemented set of entities is presented after the general schema.
Example 23
In the presence of symptoms, including weakness or pain of varying magnitude
and duration, the patient may take part in the following four independent scenarios.
1. Patient goes to a competent doctor who diagnoses the illness, and prescribes a cure.
Upon taking the prescribed medicine, the patient is cured and the symptoms disap-
pear, leaving the patient healthy as before.
2. Patient goes to an ineffective doctor who prescribes palliatives which may kill the
pain for a short while but leave the patient no healthier than before. This ineffective
doctor scenario may lead to the neglect scenario.
3. Patient goes to a quack who may prescribe medicines and therapies that are potentially
dangerous and may increase the illness rather than cure it. This situation will be
referred to as the quack scenario.
4. Patient does nothing or neglects the illness, which may lead to a worsened
condition.
PARAMETERS:
Duration = [low, normal, high (default = normal)]
Magnitude = [(1-7) default = 3]
Expense = [(1-5) default = 3]
Pain = [(1-7) default = 3]
DOMAIN OBJECTS/ROLES:
Doctor type = [Good Doctor, Surgeon, Incompetent Doctor, Quack, Harmer]
Patient type = [ideal, problem]
Illness type = [wound, systemic, cancer, psychological]
Medicine type = [mild, placebo, restorative, invasive]
Therapy type = [drugs, surgery]
DOMAIN SCHEMAS:
Cure, Diagnose Prescribe
Correct diagnosis (cure x-schema)
Efficient cure (cure x-schema)
Partial cure (cure x-schema)
Mistaken cure
Mistaken Diagnosis (mistaken cure x-schema)
Mistaken Therapy (mistaken cure x-schema)
Harm
Cause to Bleed (harm x-schema)
Neglect
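As a reading aid, the PARAMETERS listed above can be written down as a small validated record. This is only a sketch: the class name `IllnessParameters`, the field names, and the validation logic are assumptions for illustration, while the value ranges and defaults are copied from the list above.

```python
from dataclasses import dataclass

# Allowed values and ranges copied from the PARAMETERS list above.
DURATIONS = ("low", "normal", "high")
RANGES = {"magnitude": (1, 7), "expense": (1, 5), "pain": (1, 7)}


@dataclass
class IllnessParameters:
    """Hypothetical container for the health-schema parameters (not the original code)."""

    duration: str = "normal"
    magnitude: int = 3
    expense: int = 3
    pain: int = 3

    def __post_init__(self):
        # Reject values outside the ranges given in the appendix.
        if self.duration not in DURATIONS:
            raise ValueError(f"duration must be one of {DURATIONS}")
        for name, (low, high) in RANGES.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name} must lie in [{low}, {high}], got {value}")


default_case = IllnessParameters()
severe_case = IllnessParameters(duration="high", magnitude=6, pain=7)
print(default_case)
print(severe_case)
```

Unspecified fields fall back to the listed defaults, so a story that mentions only pain or duration still yields a complete parameter setting.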
The DISCOURSE NET priors are shown below. Note that in some cases (ref.
Chapter 9) we changed the priors to show how background evaluatory information could
affect the processing of the input.
Ut-Type = [description (.5), assertion (.3), opinion (.1), propaganda (.1)]
Domain = [Ec-Policy (.3), Ec-State (.3), Health (.2), Embodied (.2)]
Sp-Eval(WB) = [neg+ (.1), neg (.2), neutral (.3), pos (.3), pos+ (.1)]
Sp-Eval(IG) = [neg+ (.1), neg (.3), neutral (.2), pos (.3), pos+ (.1)]
Sp-Eval(France) = [neg+ (.1), neg (.3), neutral (.2), pos (.3), pos+ (.1)]
Sp-Eval(US_GOV) = [neg+ (.1), neg (.1), neutral (.3), pos (.3), pos+ (.2)]
Sp-Eval(US_BUS) = [neg+ (.01), neg (.1), neutral (.2), pos (.39), pos+ (.3)]
Sp-Eval(Other_BUS) = [neg+ (.01), neg (.1), neutral (.2), pos (.39), pos+ (.3)]
Sp-Eval(Other_Players) = [neg+ (.1), neg (.2), neutral (.3), pos (.2), pos+ (.2)]
Sp-Eval(Liberalization) = [neg+ (.01), neg (.1), neutral (.2),
pos (.39), pos+ (.3)]
Sp-Eval(Protectionist) = [neg+ (.3), neg (.39), neutral (.2),
pos (.1), pos+ (.01)]
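A quick consistency check on priors like these is to confirm that each distribution sums to one, and to re-check whenever a prior is changed for an experiment. The sketch below encodes a subset of the priors listed above; the dictionary layout and the helper names `is_normalized` and `override_prior` are assumptions for illustration, not part of the implemented system.

```python
# A subset of the DISCOURSE NET priors listed above, as plain dictionaries.
PRIORS = {
    "Ut-Type": {"description": 0.5, "assertion": 0.3, "opinion": 0.1, "propaganda": 0.1},
    "Domain": {"Ec-Policy": 0.3, "Ec-State": 0.3, "Health": 0.2, "Embodied": 0.2},
    "Sp-Eval(WB)": {"neg+": 0.1, "neg": 0.2, "neutral": 0.3, "pos": 0.3, "pos+": 0.1},
    "Sp-Eval(Protectionist)": {"neg+": 0.3, "neg": 0.39, "neutral": 0.2, "pos": 0.1, "pos+": 0.01},
}


def is_normalized(dist, tol=1e-9):
    """Return True if the probabilities in dist sum to 1 within tol."""
    return abs(sum(dist.values()) - 1.0) <= tol


def override_prior(priors, variable, new_dist):
    """Return a copy of priors with one variable replaced (hypothetical helper)."""
    if not is_normalized(new_dist):
        raise ValueError(f"new prior for {variable} does not sum to 1")
    updated = dict(priors)
    updated[variable] = dict(new_dist)
    return updated


for name, dist in PRIORS.items():
    print(name, "normalized:", is_normalized(dist))

# Example: a more negative background evaluation of the World Bank (illustrative values).
skeptical = override_prior(
    PRIORS,
    "Sp-Eval(WB)",
    {"neg+": 0.3, "neg": 0.3, "neutral": 0.2, "pos": 0.1, "pos+": 0.1},
)
```

Returning a copy rather than editing in place keeps the baseline priors available for comparison across runs.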
Chapter 12
Table 12.7: Input F-struct for "WB bleeds" story (story 22) in database
Feature V(t0) V(t1) V(t2) V(t3) V(t3+)
Domain=Int. Lending t t t
Ut-Type = Desc t t t
Event=force(WB, DEV, SAP, make-payments) t
Event=make-payments(DEV, WB, size=high++) t
Event=bleed(DEV, MONEY) t
Event =hemorrhage(DEV) t
CHAPTER 12. APPENDIX B: I/O BEHAVIOUR ON SELECTED STORIES 273
Table 12.9: Input F-struct for "WB and Lib" story (Story 3) in database
Feature V(t1) V(t2) V(t3) V(t3+)
Event1 = pressure(World Bank, India) t
Event = respond(Ind, Event1 , Lib) t
Attitude(IG) = bold t
Event = set-out(India, Liberalization) t
Event= loosen-stranglehold(IG, Bus) t
Domain = Economic Policy t t t t
Ut-Type = description t t t t
Chapter 13
[Figure 13.1 node labels: DOMAIN = EC.POL; INJURED; PAIN; PALLIATIVE(PROT, EC); DISEASE; POL = PROT; GOAL = GROWTH; GOAL(PROT) = EMP; STATUS(PROT) = ONG; OUT(PROT) = FAIL; SP-EVAL(PROT) = NEG]
Figure 13.1: An example of the health-well being source domain and its projection onto the
domain of plans. Corresponds to Story 2 in the Test Results section. Note that the disease
remains after the palliative medicine is applied, hence the Negative economic state persists
after the protectionist policy is in force.
CHAPTER 13. APPENDIX C: SPECIFIC TEST RESULTS IN DETAIL 277
Table 13.2: System Behavior For US Economy Struggles (story 3) in test results
Feature V(t0) V(t1) V(t2) V(t2+)
INPUT FEATURES
Event = CONTINUE(struggle) t t t t(.7)
Event = inc.(weakness) t t
Domain = Economic State t t t t
Actor = U.S.Economy t t t t
OUTPUT FEATURES
Ec.State neg neg neg+ recession(.8)
Sp-Eval(Pol) = neg+ t(.7) t(.7) t(.8) t
Bibliography
Agre, P., & Chapman, D. (1987). Pengi: An implementation of a theory of activity. In
Proc. AAAI-87, 268-272, Seattle, WA.
Connell, J. (1989). A Colony of Robots. Ph.D. Thesis, MIT, AI-Lab, 1989.
Arbib, M.A. (1992). Schema Theory. In the Encyclopedia Of Artificial Intelligence, 2nd
Edition, edited by Stuart Shapiro, 2:1427-1443, Wiley, 1992.
Arbib, M.A. (1994). The Metaphorical Brain: 2. Wiley Interscience, 1994.
Arkin, R.C. (1990). Integrating Behavioral, Perceptual, and World Knowledge in Reactive
Navigation. Robotics and Autonomous Systems 6 (1990) 105-122.
Bailey, D. (1997). A Computational Model of Embodiment in the Acquisition of Action
Verbs. Ph.D. Thesis, University of California Berkeley (to appear).
---, Feldman, J.A., Narayanan, S., & Lakoff, G. (1997). Modelling Embodied Lexical
Development. Proc. 19th Conference of the Cognitive Science Society, 1997.
Ballard, D.H., Hayhoe, M.M., Pook, P.K., & Rao, R.P.N. (1997). Deictic codes for the
embodiment of cognition. Behavioral and Brain Sciences, in press.
Barnden, J., Helmreich, L., Iverson, E., & Stein, G.C. (1994). An integrated implementation
of simulative, uncertain, and metaphorical reasoning about mental states. Principles
of KR and Reasoning: Proceedings of the Fourth International Conference (Bonn,
Germany, 24-27 May 1994), San Mateo, CA: Morgan Kaufmann.
--- and Holyoak, K. eds. (1994). Advances in Connectionist and Neural Computation
Theory. Volumes 1-3, Ablex Publishing Corp. New Jersey ISBN: 1-56750-101-X.
Barrett & Weld (1994). Partial-order Planning: Evaluating Possible Efficiency Gains.
Artificial Intelligence, 67: 71-112, 1994.
Bennett, P.A. et al. (1986). Multilingual Aspects Of Information Technology. Gower,
Brookfield, VT, 1986.
Berlin, B., and Kay, P. (1969). Basic Color Terms: Their Universality and Evolution .
Berkeley, CA: University of California Press.
Berman, R., and Slobin, D., 1994. Relating Events in Narrative: A Crosslinguistic Devel-
opmental Study. LEA Press, 1994.
Bernstein, N. A., (1967). The Co-ordination and Regulation of Movement . New York:
Pergamon Press.
Bhagwati, J. (1987). Rethinking Trade Strategy. Development Strategy Reconsidered, Trans-
action Books, N.J.
Blum A. and Furst M. (1995). Fast Planning through Planning Graph Analysis. Proc. of
IJCAI95, 1636-1642, Montreal 1995.
Borchardt, G.C. (1994). Thinking Between The Lines: Computers and Comprehension of
Causal Descriptions. MIT Press 1994.
Bradford, C. (1991). Policy Interventions and Markets: Development Strategy Typologies
and Policy Options. Manufacturing Miracles. Princeton University Press, 1991. pp.
32-51.
Brooks, R. A. (1986). A robust layered control system for a mobile robot. IEEE Journal
of Robotics and Automation, 2:14-23.
Bullock, D. (1995). Motorneuron Recruitment. Handbook Of Brain Theory, MIT Press
594-97.
Cacciari, C. and Tabossi, P. (1988). The comprehension of idioms. Journal of Memory and
Language, 27, 668-683.
Carbonell, J. (1982). Metaphor Comprehension. Strategies for Natural Language Processing,
413-433, Lawrence Erlbaum, 1982.
Feldman, J. A., & Ballard, D. H. (1982). Connectionist Models and Their Properties. Cognitive
Science No. 6, 205-254 (1982).
--- & Shastri, L. (1984). Evidential Inference in Activation Networks. Proc. 6th Conference
of the Cognitive Science Society, 1984. pp. 156-160.
--- and Waltz, D., eds. (1988). Connectionist Models and Their Applications. Ablex
Publishing Company, 1988.
--- (1989). Neural Representation Of Conceptual Knowledge. Neural Connections, Mental
Computation. MIT Press 1989.
---, Lakoff G., Bailey D.A., Narayanan S., Regier T., & Stolcke, A. (1996). L0 - the first
five years of an automated language acquisition project. AI Review 8.
Feldman, J. L. (1986). Neurophysiology of respiration in mammals. Handbook of
Neurophysiology: Section 1, The Nervous System. ed. F. Bloom, 4:463-524, Bethesda, Md:
Am. Physiol. Soc.
Fillmore, C. J. (1988). The mechanisms of "Construction Grammar". In Proceedings of
BLS 14, pages 35-55, Berkeley, CA.
Fikes, R. & Nilsson, N. (1971). STRIPS: A new approach to the application of theorem
proving to problem solving. Artificial Intelligence, No. 2, 189-208.
Firby, J. (1989). Adaptive Execution in Complex Dynamic Worlds. Research Report No.
672, Yale University.
Fung R. & Chang, K.C. (1990). Weighting and integrating evidence for stochastic simulation
in Bayes networks. Proc. UAI 5, 209-219, Elsevier, Amsterdam, 1990.
Gallese, V., Fadiga L., Fogassi L., & Rizzolatti G. (1996). Action recognition in the premotor
cortex. Brain 119(2), 593-609.
Gelfond, M. & Lifschitz, V. (1993). Representing Action and Change by Logic Programs.
Journal of Logic Programming, 17:301-322, 1993.
Grafton, Scott T., Arbib, M. A., Fadiga L., and Rizzolatti G. (1996). Localization of
grasp representations in humans by PET: 2. Observation compared with imagination.
Experimental Brain Research (112), 103-111.
Gibbs, R. Jr. (1994). The Poetics Of Mind. Cambridge University Press, 1994.
Gereffi, G. (1991). Paths Of Industrialization: An overview. Manufacturing Miracles.
Princeton University Press, 1991. pp. 32-51.
Girard, J. (1987). Linear Logic. Theoretical Computer Science 50 (1987) 1- 102.
Goldberg, A. (1995). Constructions. UC Press.
Goldszmidt, M. & Pearl, J. (1992). Rank-based systems: A simple approach to belief
revision, belief update and reasoning about evidence and actions. Proc. Of the Third
Conference on Principles of KR and Reasoning, 1992: 661-672, Morgan Kaufmann, Inc.
1992.
Grady J. (1996). A Compositional Theory of Metaphor. Presented at CSDL-II, Buffalo
NY. Proceedings (to appear) CSLI/Cambridge University Press, 1998.
Grosz, B. and Sidner C. (1986). Attention, Intentions, and the Structure of Discourse. In
Computational Linguistics, 12(3), pp. 175-204.
Head, H. (1920). Studies in Neurology. Hodder and Stoughton, London, 1920.
Henderson, J. (1994). Connectionist Syntactic Parsing Using Temporal Variable Binding.
Journal of Psycholinguistic Research, 23(5):353-379.
Hobbs, J. R. and Moore, B. (1985). Formal Theories Of The Commonsense World. Ablex
Publishing Corporation, NJ.
--- and Bear, J. (1990). Two principles of parse preference. In Proceedings of the 13th
International Conference on Computational Linguistics (COLING-90), pages 162-167,
Helsinki.
Hummel, J.E. & Biederman, I. (1992). Dynamic binding in a neural network for shape
recognition. Psychological Review 99:480-517.
Hwang, C. H. & Schubert, L. (1994). Interpreting Tense, Aspect, and Time Adverbials: A
Compositional, Unified Approach. Proceedings of the First International Conference
on Temporal Logic (ICTL 94), July 1994, Bonn, Germany, 1-27.
Indurkhya, B. (1992). Metaphor and Cognition. Kluwer Academic Publishers.
Jeannerod, M. (1986). The formation of finger grip during prehension. A cortically mediated
visuo-motor pattern. Beh. Brain Res., 19:99-116.
Jelinek, F. and Lafferty, J. D. (1991). Computation of the probability of initial substring
generation by stochastic context-free grammars. Computational Linguistics, 17,
315-323.
Jensen, F. (1996). An Introduction to Bayesian Networks. Springer-Verlag ISBN 0-387-
91502-8.
Johnson, M. (1987). The Body In The Mind: The Bodily Basis of Meaning, Imagination,
and Reason. University Of Chicago Press, ISBN 0-226-40318-1.
Jurafsky, D. (1996). A probabilistic model of lexical and syntactic access and
disambiguation. Cognitive Science 20, 137-194.
Kay, P. and McDaniel C. K. (1978). The linguistic significance of the meaning of basic color
terms. Language 54(3), 610-646.
Kawamoto, A. H. (1993). Nonlinear dynamics in the resolution of lexical ambiguity. Journal
of Memory and Language, 32, 474-516.
Kjaerulff, U. (1992). A computational scheme for reasoning in dynamic probabilistic
networks. Proc. UAI-92, 121-129, Stanford.
Kolodner, J. (1984). Conceptual Memory: A Computational Perspective. Lawrence
Erlbaum, Hillsdale, NJ.
Langacker, R. (1987). Foundations of Cognitive Grammar I: Theoretical Prerequisites. Stan-
ford University Press, Stanford.
Lange, T. and Dyer, M. (1989) High-level Inferencing in a Connectionist Network. Con-
nection Science, 1 (2), pgs. 181-217, 1989.
Lakoff, G. and Johnson, M. (1980). Metaphors we live by. University Of Chicago Press.
--- (1987). Women, Fire, and Dangerous Things: What Categories Reveal about the Mind.
University of Chicago Press.
--- (1994). What is Metaphor? Advances in Connectionist Theory, V3: Analogical
Connections, 1994.
Lauritzen, S. and Spiegelhalter, D. (1988). Local computations with probabilities on
graphical structures and their application to expert systems. J. Royal Statistical Society B,
50:127-224.
Lifschitz, V. (1990). Frames in the Space of Situations. Artificial Intelligence 46:365-376,
1990.
Luce, P. A., Pisoni, D. B., and Goldinger, S. D. (1990). Similarity neighborhoods of spoken
words. In G. T. M. Altmann (Ed.), Cognitive Models of Speech Processing. MIT Press,
Cambridge, MA.
MacDonald, M. C. (1993). The interaction of lexical and syntactic ambiguity. Journal of
Memory and Language, 32, 692-715.
MacDonald, M. C., Pearlmutter, N. J., and Seidenberg, M. S. (1994). Syntactic ambiguity
resolution as lexical ambiguity resolution. In Perspectives on Sentence Processing.
Erlbaum, Hillsdale, NJ.
Marcus, M. P., Santorini, B., and Marcinkiewicz, M. A. (1993). Building a large annotated
corpus of English: The Penn treebank. Computational Linguistics, 19, 313-330.
Marslen-Wilson, W. (1990). Activation, competition, and frequency in lexical access. In
G. T. M. Altmann (Ed.), Cognitive Models of Speech Processing. MIT Press,
Cambridge, MA.
Marslen-Wilson, W., Brown, C. M., and Tyler, L. K. (1988). Lexical representations in
spoken language comprehension. Language and Cognitive Processes, 3, 1-16.
Martin, J. (1990). A Computational Model of Metaphor Interpretation. Academic Press,
NY, 1990.
Masseron, M. and Tollu, C., and Vauzeilles, J (1990). Generating Plans in Linear Logic.
Foundations of Software Technology and Theoretical Computer Science, vol 472 of
LNCS, p 63-70, Springer 1990.
McCarthy, J., & Hayes, P. (1969). Some Philosophical Problems From the Standpoint of
Artificial Intelligence. Machine Intelligence 4 (1969) 463-502.
McCawley, J. D. (1971). Tense and time reference in English. Studies in Linguistic
Semantics, New York: Holt, Rinehart and Winston, pp. 96-113.
McClelland, J. L., St. John, M., and Taraban, R. (1989). Sentence comprehension: A
parallel distributed processing approach. Language and Cognitive Processes, 4,
123-154.
McDermott, D. (1982). A Temporal Logic For Reasoning About Processes and Plans. In
Cognitive Science 6: pp. 101-155.
Miller, G. (1990). Wordnet: An on-line lexical database. International Journal of
Lexicography, 3(4) (Special Issue).
Moens, M. & Steedman, M. (1988). Temporal Ontology and Temporal Reference. In Proc.
ACL-88, V4, Number 2, June 1988, pp. 15-29.
Molloy, M.K., et al. (1982). Performance Analysis Using Stochastic Petri Nets. IEEE
Transactions on Computers, C-31, No. 9, pp. 913-917.
Murata, T. (1989). Petri Nets: Properties, Analysis, and Applications. In Proc. IEEE-89,
V77, Number 4, April 1989, pp. 541-576.
--- (1991). Planning with Petri Nets. In Information and Control, 1991.
Nakhimovsky, A. (1988). Aspect, Aspectual Class, and the Temporal Structure Of
Narrative. In Proc. ACL-88, V4, Number 2, June 1988, pp. 29-44.
Narayanan, S. (1996). Embodiment in Language Understanding: Modeling the Semantics
of Causal Narratives. AAAI Symposium on Embodied Cognition and Action, AAAI
Press TR:FS-96-02.
--- (1997). Walking the Walk is Like Talking The Talk: A Computational Model of
Verbal Aspect. Proceedings of the Nineteenth Annual Conference on Cognitive Science
Society (to appear) 1997. A longer and more linguistically oriented paper was presented
at CSDL-II, Buffalo, NY. Proceedings (to appear) CSLI/Cambridge University Press,
1998.
--- & Jurafsky, D. (1997). Exploiting Conditional Independence in Language
Understanding. ICSI TR-97-14, July 1997. Also submitted to Computational Linguistics.
--- & Thielscher, M. (1997). A connectionist model for dynamic resource-based logics.
Working Paper. Available from http://www.icsi.berkeley.edu/~snarayan.
Nilsson, N. J. (1994). Teleo-reactive programs for agent control. Journal of Artificial
Intelligence Research 1, 139-158.
Norman, D.A. & Shallice, T. (1980). Attention to Action: willed and automatic behavior.
Human Information Processing, Tech. Report No. 99, UC San Diego.
Norvig, P. (1989). Marker Passing as a Weak Method for Text Inferencing. Cognitive
Science No. 113: 569-620.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible
Inference. Morgan Kaufmann, San Mateo, CA.
--- (1994). A probabilistic calculus of actions. Proc. UAI-94, 454-462, Stanford.
Pearson K.G. (1993). Common Principles of Motor Control in Vertebrates and Inverte-
brates. Ann. Review Of Neuroscience, 1993, 16:265-97.
Portinale, L. (1994). A Petri net Model of Abduction. Ph.D. Dissertation, University of
Torino, 1994.
Pollard, C. and Sag, I. A. (1994). Head-Driven Phrase Structure Grammar. University of
Chicago Press, Chicago.
Reichenbach, H. (1947). Elements Of Symbolic Logic. Berkeley, CA, University Of California
Press.
Reisig, W. (1985). Petri Nets. Springer Verlag.
Regier, T. (1996). The Human Semantic Potential: Spatial Language and Constrained
Connectionism. Cambridge, MA: MIT Press.
Resnik, P. (1992). Probabilistic tree-adjoining grammar as a framework for statistical
natural language processing. In Proceedings of the 14th International Conference on
Computational Linguistics, pages 418-424, Nantes, France.
Resnik, P. (1993). Selection and Information: A Class-Based Approach to Lexical Rela-
tionships. PhD thesis, University of Pennsylvania. (Institute for Research in Cognitive
Science report IRCS-93-42).
Rieger C. (1975). Language Comprehension. Readings in KR, Morgan-Kaufman, 1975.
Rosenschein, J.S. (1985). Formal theories of knowledge in AI and Robotics. New Generation
Computing 3(4):345-357.
Sacerdoti, Earl D. (1975). The nonlinear nature of plans. In Proc. IJCAI-75, 206–214.
Sag, I. A., Kaplan, R., Karttunen, L., Kay, M., Pollard, C., Shieber, S., and Zaenen, A.
(1985). Unification and grammatical theory. In Proceedings of the Fifth West Coast
Conference on Formal Linguistics.
Salasoo, A. and Pisoni, D. B. (1985). Interaction of knowledge sources in spoken word
identification. Journal of Memory and Language, 24, 210–231.
Schank, R.C. & Abelson, R.P. (1977). Scripts, Plans, Goals, and Understanding: An
inquiry into human knowledge structures. Hillsdale, NJ: Erlbaum.
Schmidt, R.A. (1975). A Schema Theory of Discrete Motor Learning. Psychol. Rev., 82,
225-60.
Shastri, L. & Ajjanagadde, V. (1993). From simple associations to systematic reasoning.
Behavioral and Brain Sciences, 16:3, 417-494.
--- & Grannes, D.J. (1996). A Connectionist Treatment of Negation and Inconsistency.
Proc. 18th Conference of the Cognitive Science Society, 1996. pp. 142-147.
---, Grannes, D.J., Narayanan, S. and Feldman, J.A. (1997). Connectionist parameterized
routines. Submitted to Neural Information Processing Systems 10. Also presented as
a poster at the 19th Cognitive Science Society Conference.
Simpson, G. B. and Burgess, C. (1985). Activation and selection processes in the recognition
of ambiguous words. Journal of Experimental Psychology: Human Perception and
Performance, 11, 28–39.
Singer, W. (1993). Synchronization of cortical activity and its putative role in information
processing and learning. Annual Review of Physiology 55: 349-74.
Siskind, Jeffrey Mark (1995). A computational study of lexical acquisition. Cognition 50,
1–33.
Srinivas, S. and Breese, J.S. (1990). IDEAL: A software package for the analysis of belief
networks. In Proceedings of Sixth Workshop on Uncertainty in AI, Cambridge, Mass.
1990.
Steedman, M. (1995). Dynamic Semantics for Tense and Aspect. Proc. IJCAI 1995, pp.
1292-98 1995.
--- (1996). Temporality. Situation Calculus and its Applications, CSLI Press, Stanford,
1996.
Sternberg, S., Monsell, R., Knoll, R.L. & Wright, C.E. (1978). The latency and duration
of rapid movement sequences: comparisons of speech and typewriting. Information
Processing in Motor Control and Learning, G. E. Stelmach, Academic, New York:
117-152, 1978.
Stolcke, A. (1995). An efficient probabilistic context-free parsing algorithm that computes
prefix probabilities. Computational Linguistics, 21, 165–202.
Sweetser, E. (1990). From Etymology to Pragmatics: The mind-as-body metaphor in se-
mantic structure and semantic change. Cambridge, UK: Cambridge University Press,
1990.
Swinney, D. A. and Cutler, A. (1979). The access and processing of idiomatic expressions.
Journal of Verbal Learning and Verbal Behavior, 18, 523–534.
Talmy, L. (1985). Lexicalization Patterns: Semantic Structure in Lexical Forms. In Language
Typology and Syntactic Description, v3: Cambridge University Press, 1985.
--- (1987). Force Dynamics in Language. Tech Report. Institute For Cognitive Science,
UC Berkeley, 1987.
Tanenhaus, M. K. and Lucas, M. M. (1987). Context effects in lexical processing. Cognition,
25, 213–234.
Tanji, J. and Shima, S. (1994). The supplementary motor area in the cerebral cortex. Nature,
vol. 371, issue 6496 (Sep 29, 1994): pp. 413-416.
Thielscher, M. (1995). The Logic of Dynamic Systems. Proceedings of IJCAI, Vol. 2, pp.
1956-62, 1995.
Tollu, M and Masseron, M. K. (1993). Deductive Planning and Linear Logic. Theoretical
Computer Science, 234-243, 1993.
Trueswell, J. C. and Tanenhaus, M. K. (1991). Tense, temporal context and syntactic
ambiguity resolution. Language and Cognitive Processes, 6, 303–338.
Trueswell, J. C. and Tanenhaus, M. K. (1994). Toward a lexicalist framework for constraint-
based syntactic ambiguity resolution. In Perspectives on Sentence Processing. Erlbaum,
Hillsdale, NJ.
Trueswell, J. C., Tanenhaus, M.K., and Garnsey, S. M. (1994). Semantic influences on
parsing: Use of thematic role information in parsing. Journal of Memory and Language,
1194, 285–317.
Tyler, L. K. (1984). The structure of the initial cohort: Evidence from gating. Perception
and Psychophysics, 36, 417–427.
Vendler, Z. (1967). Linguistics in Philosophy. Cornell University Press, Ithaca, New York.
Weber, S. (1989). Figurative Adjective-Noun Interpretation in a Structured Connectionist
Network. Proceedings of the 11th Annual Meeting of the Cognitive Science Society, pp.
204-211, 1989.
Webber, B. (1988). Tense as Discourse Anaphor. In Proc. ACL-88, V4, Number 2, June
1988, pp. 61-74.
Wilensky, R. (1983). Planning and Natural Language Understanding. Addison Wesley, 1983.