An Introduction to Multiagent Systems

Michael Wooldridge
Department of Computer Science,
University of Liverpool, UK
Library of Congress Cataloging-in-Publication Data
Wooldridge, Michael J., 1966-
An introduction to multiagent systems / Michael Wooldridge.
p. cm.
Includes bibliographical references and index.
ISBN 0-471-49691-X
1. Intelligent agents (Computer software) I. Title.
Preface
1 Introduction
1.1 The Vision Thing
1.2 Some Views of the Field
1.3 Objections to Multiagent Systems
2 Intelligent Agents
2.1 Environments
2.2 Intelligent Agents
2.3 Agents and Objects
2.4 Agents and Expert Systems
2.5 Agents as Intentional Systems
2.6 Abstract Architectures for Intelligent Agents
2.7 How to Tell an Agent What to Do
2.8 Synthesizing Agents
6 Multiagent Interactions
6.1 Utilities and Preferences
7 Reaching Agreements
7.1 Mechanism Design
7.2 Auctions
7.3 Negotiation
7.3.1 Task-oriented domains
7.3.2 Worth-oriented domains
7.4 Argumentation
8 Communication
8.1 Speech Acts
8.1.1 Austin
8.1.2 Searle
8.1.3 The plan-based theory of speech acts
8.1.4 Speech acts as rational action
8.2 Agent Communication Languages
8.2.1 KIF
8.2.2 KQML
8.2.3 The FIPA agent communication languages
8.3 Ontologies for Agent Communication
8.4 Coordination Languages
9 Working Together
9.1 Cooperative Distributed Problem Solving
9.2 Task Sharing and Result Sharing
9.2.1 Task sharing in the Contract Net
9.3 Result Sharing
9.4 Combining Task and Result Sharing
9.5 Handling Inconsistency
9.6 Coordination
9.6.1 Coordination through partial global planning
9.6.2 Coordination through joint intentions
9.6.3 Coordination by mutual modelling
9.6.4 Coordination by norms and social laws
9.7 Multiagent Planning and Synchronization
10 Methodologies
10.1 When is an Agent-Based Solution Appropriate?
10.2 Agent-Oriented Analysis and Design Techniques
10.3 Pitfalls of Agent Development
10.4 Mobile Agents
11 Applications
11.1 Agents for Workflow and Business Process Management
11.2 Agents for Distributed Sensing
11.3 Agents for Information Retrieval and Management
11.4 Agents for Electronic Commerce
Afterword
References
Index
Preface
visions of where it is going. The second part - Chapters 2-5 inclusive - is con-
cerned with individual agents. Following an introduction to the concept of agents,
their environments, and the various ways in which we might tell agents what to
do, I describe and contrast the main techniques that have been proposed in the
literature for building agents. Thus I discuss agents that decide what to do via
logical deduction, agents in which decision making resembles the process of prac-
tical reasoning in humans, agents that do not explicitly reason at all, and, finally,
agents that make decisions by combining deductive and other decision-making
mechanisms. In the third part of the book - Chapters 6-10 inclusive - I focus on
collections of agents. Following a discussion on the various ways in which multi-
agent encounters and interactions can be classified, I discuss the ways in which
self-interested agents can reach agreements, communicate with one another, and
work together. I also discuss some of the main approaches proposed for designing
multiagent systems. The fourth and final part of the book presents two advanced
supplemental chapters, on applications of agent systems, and formal methods
for reasoning about agent systems, respectively.
I have assumed that the main audience for the book will be undergraduate
students of computer science/IT - the book should be suitable for such students
in their second or third year of study. However, I also hope that the book will be
accessible to computing/IT professionals, who wish to know more about some of
the ideas driving one of the major areas of research and development activity in
computing today.
Chapter structure
Every chapter of the book ends with three sections, which I hope will be of wider
interest.
A 'class reading' suggestion, which lists one or two key articles from the
research literature that may be suitable for class reading in seminar-based
courses.
A 'notes and further reading' section, which provides additional technical
comments on the chapter and extensive pointers into the literature for
advanced reading. This section is aimed at those who wish to gain a deeper,
research-level understanding of the material.
An 'exercises' section, which might form the basis of homework to be set for
students. Exercises are graded on a scale of one to four, with one being the
easiest (a few minutes work), and four being the hardest (research projects).
Exercises of difficulty three might be undertaken as projects over some
weeks or months; exercises of level one or two should be feasible within
a few hours at most, and might be undertaken as part of weekly homework
or tutorials. Some exercises are suggested for class discussion.
here. When deciding what to put in/leave out, I have been guided to a great extent
by what the 'mainstream' multiagent systems literature regards as important, as
evidenced by the volume of published papers on the subject. The second consid-
eration was what might reasonably be (i) taught and (ii) understood in the context
of a typical one-semester university course. This largely excluded most abstract
theoretical material, which will probably make most students happy - if not their
teachers.
I deliberately chose to omit some material as follows.
Learning. My view is that learning is an important agent capability, but is not cen-
tral to agency. After some agonizing, I therefore decided not to cover learning.
There are plenty of references to learning algorithms and techniques: see, for
example, Kaelbling (1993), Weiß (1993, 1997), Weiß and Sen (1996) and Stone
(2000).
Artificial life. Some sections of this book (in Chapter 5 particularly) are closely
related to work carried out in the artificial life, or 'alife' community. However,
the work of the alife community is carried out largely independently of that in
the 'mainstream' multiagent systems community. By and large, the two commu-
nities do not interact with one another. For these reasons, I have chosen not to
focus on alife in this book. (Of course, this should not be interpreted as in any
way impugning the work of the alife community: it just is not what this book is
about.) There are many easily available references to alife on the Web. A useful
starting point is Langton (1989); another good reference is Mitchell (1996).
Mobility. There is something of a schism in the agents community between those
that do mobility and those who do not - I mostly belong to the second group.
Like learning, I believe mobility is an important agent capability, which is par-
ticularly valuable for some applications. But, like learning, I do not view it to be
central to the multiagent systems curriculum. In fact, I do touch on mobility, in
Chapter 10 - but only relatively briefly: the interested reader will find plenty of
references in this chapter.
Markov decision problems. Markov decision problems (MDPs), together with
their close relatives partially observable MDPs, are now the subject of much
attention in the AI community, as they seem to provide a promising approach
to the problem of making decisions under uncertainty. As we will see in much
of the remainder of this book, this is a fundamental problem in the agents
community also. To give a detailed introduction to MDPs, however, would be
out of the question in a textbook on agents. See Blythe (1999) for pointers into
the literature, and Kaelbling et al. (1998) for a detailed technical overview of the
area and issues; Russell and Norvig (1995, pp. 398-522) give an overview in the
context of an AI textbook.
In my opinion, the most important things for students to understand are (i) the
'big picture' of multiagent systems (why it is important, where it came from, what
the issues are, and where it is going), and (ii) what the key tools, techniques, and
principles are. Students who understand these two things should be well equipped
to make sense of the deeper research literature if they choose to.
Web references
It would be very hard to write a book about Web-related issues without giving
URLs as references. In many cases, the best possible reference to a subject is
a Web site, and given the speed with which the computing field evolves, many
important topics are only documented in the 'conventional' literature very late
in the day. But citing Web pages as authorities can create big problems for the
reader. Companies go bust, sites go dead, people move, research projects finish,
and when these things happen, Web references become useless. For these reasons,
I have therefore attempted to keep Web references to a minimum. I have preferred
to cite the 'conventional' (i.e. printed) literature over Web pages when given a
choice. In addition, I have tried to cite only Web pages that are likely to be stable
and supported for the foreseeable future. The date associated with a Web page is
the date at which I checked the reference was working. Many useful Web links are
available from the book's Web page, listed earlier.
Acknowledgments
Several people gave invaluable feedback on the 'history of the field' section. In
particular, Les Gasser and Victor Lesser were extremely helpful in sorting out my
muddled view of the early days of distributed AI, and Jeff Rosenschein gave a lot
of help in understanding how game-theoretic techniques entered the multiagent
systems literature. Keith Decker gave suggestions about material to cover on bro-
kers and middle agents. Michael Fisher helped with examples to illustrate his
Concurrent MetateM language in Chapter 3. Valentina Tamma set me straight on
ontologies and DAML. Karen Mosman from Wiley was (and indeed is) an unspeak-
ably cheerful, enthusiastic, and charming editor, and I suppose I should grudg-
ingly admit that I very much enjoyed working with her. Simon Parsons and Peter
McBurney were enormously helpful with the section on argumentation. Nick Jen-
nings, as ever, gave encouragement, support, and sensible practical advice on
contents and style.
Marie Devlin, Shaheen Fatima, Marc-Philippe Huget, Peter McBurney, Carmen
Pardavila, and Valentina Tamma read drafts of the book and gave detailed, help-
ful comments. Marie saved me many hours and much tedium by checking and
crosschecking the bibliography for me. I hate books with sloppy or incomplete
references, and so Marie's help was particularly appreciated. We both made exten-
sive use of the CITESEER autonomous citation system from NEC (see NEC, 2001),
which, as well as helping to provide the definitive reference for many obscure
articles, also helped to obtain the actual text in many instances. Despite all this
help, many typos and more serious errors will surely remain, and these are of
course my responsibility.
I have taught parts of this book in various guises at various locations since
1995. The comments and feedback from students and other participants at these
venues has helped me to improve it significantly. So, thanks here to those at the
1996 German Spring School on AI (KIFS) in Günne am Möhnesee, the AgentLink
summer schools in Utrecht (1999), Saarbrücken (2000), and Prague (2001), the
ESSLLI course on agent theory in Saarbrücken (1998), tutorial participants at
ICMAS in San Francisco (1995) and Paris (1998), tutorial participants at ECAI in
Budapest (1996), Brighton (1998), and Berlin (2000), and AGENTS in Minneapolis
(1998), Seattle (1999), Barcelona (2000), and Montreal (2001), as well as students in
courses on agents that I have taught at Lausanne (1999), Barcelona (2000), Helsinki
(1999 and 2001), and Liverpool (2001). Boi Faltings in Lausanne, Ulises Cortés
and Carles Sierra in Barcelona, and Heimo Lammanen and Kimmo Raatikainen in
Helsinki were all helpful and generous hosts during my visits to their respective
institutions.
As ever, my heartfelt thanks go out to my colleagues and friends in the multi-
agent systems research community, who have made academia rewarding and
enjoyable. You know who you are! Deserving of a special mention here are Carles
Sierra and Carme: their kindness and hospitality have been astonishing.
I took over as Head of Department while I was completing this book, which
nearly killed both the book and the Department stone dead. Fortunately - or not,
depending on your point of view - Katrina Houghton fought hard to keep the
University at bay, and thus bought me enough time to complete the job. For this
I am more grateful than she could imagine. Paul Leng was a beacon of common
sense and good advice as I took over being Head of Department, without which I
would have been even more clueless about the job than I am now.
A network of friends have helped me keep my feet firmly on the ground through-
out the writing of this book, but more generally throughout my career. Special
thanks here to Dave, Janet, Pangus, Addy, Josh, Ant, Emma, Greggy, Helen, Patrick,
Bisto, Emma, Ellie, Mogsie, Houst, and the rest of the Herefordians.
My family have always been there, and writing this book has been made much
easier for that. My parents, Jean and John Wooldridge, have always supported me
in my career. Brothers Andrew and Christopher have done what all good brothers
do: mercilessly tease at every opportunity, while simultaneously making their love
abundantly clear. Janine, as ever, has been my world.
Finally, I hope everyone in the Department office will accept this finished book
as definitive proof that when I said I was 'working at home', I really was. Well,
sometimes at least.
Mike Wooldridge
Liverpool
Autumn 2001
Introduction
The history of computing to date has been marked by five important, and contin-
uing, trends:
ubiquity;
interconnection;
intelligence;
delegation; and
human-orientation.
By ubiquity, I simply mean that the continual reduction in cost of computing
capability has made it possible to introduce processing power into places and
devices that would hitherto have been uneconomic, and perhaps even unimagin-
able. This trend will inevitably continue, making processing capability, and hence
intelligence of a sort, ubiquitous.
While the earliest computer systems were isolated entities, communicating only
with their human operators, computer systems today are usually interconnected.
They are networked into large distributed systems. The Internet is the obvious
example; it is becoming increasingly rare to find computers in use in commercial
or academic settings that do not have the capability to access the Internet. Until
a comparatively short time ago, distributed and concurrent systems were seen by
many as strange and difficult beasts, best avoided. The very visible and very rapid
growth of the Internet has (I hope) dispelled this belief forever. Today, and for the
future, distributed and concurrent systems are essentially the norm in commercial
and industrial computing, leading some researchers and practitioners to revisit
the very foundations of computer science, seeking theoretical models that better
reflect the reality of computing as primarily a process of interaction.
The third trend is toward ever more intelligent systems. By this, I mean that the
complexity of tasks that we are capable of automating and delegating to comput-
ers has also grown steadily. We are gaining a progressively better understanding
of how to engineer computer systems to deal with tasks that would have been
unthinkable only a short time ago.
The next trend is toward ever increasing delegation. For example, we routinely
delegate to computer systems such safety critical tasks as piloting aircraft. Indeed,
in fly-by-wire aircraft, the judgement of a computer program is frequently trusted
over that of experienced pilots. Delegation implies that we give control to com-
puter systems.
The fifth and final trend is the steady move away from machine-oriented views
of programming toward concepts and metaphors that more closely reflect the
way in which we ourselves understand the world. This trend is evident in every
way that we interact with computers. For example, in the earliest days of com-
puters, a user interacted with the computer by setting switches on the panel of the
machine. The internal operation of the device was in no way hidden from the
user - in order to use it successfully, one had to fully understand the internal
structure and operation of the device. Such primitive - and unproductive - inter-
faces gave way to command line interfaces, where one could interact with the
device in terms of an ongoing dialogue, in which the user issued instructions
that were then executed. Such interfaces dominated until the 1980s, when they
gave way to graphical user interfaces, and the direct manipulation paradigm in
which a user controls the device by directly manipulating graphical icons cor-
responding to objects such as files and programs. Similarly, in the earliest days
of computing, programmers had no choice but to program their computers in
terms of raw machine code, which implied a detailed understanding of the internal
structure and operation of their machines. Subsequent programming paradigms
have progressed away from such low-level views: witness the development of
assembler languages, through procedural abstraction, to abstract data types, and
most recently, objects. Each of these developments has allowed programmers
to conceptualize and implement software in terms of higher-level - more human-
oriented - abstractions.
These trends present major challenges for software developers. With respect
to ubiquity and interconnection, we do not yet know what techniques might be
used to develop systems to exploit ubiquitous processor power. Current software
development models have proved woefully inadequate even when dealing with
relatively small numbers of processors. What techniques might be needed to deal
with systems composed of 10^10 processors? The term global computing has been
coined to describe such unimaginably large systems.
The trends to increasing delegation and intelligence imply the need to build
computer systems that can act effectively on our behalf. This in turn implies two
capabilities. The first is the ability of systems to operate independently, without
our direct intervention. The second is the need for computer systems to be able
to act in such a way as to represent our best interests while interacting with other
humans or systems.
The trend toward interconnection and distribution has, in mainstream com-
puter science, long been recognized as a key challenge, and much of the intellec-
tual energy of the field throughout the last three decades has been directed toward
developing software tools and mechanisms that allow us to build distributed sys-
tems with greater ease and reliability. However, when coupled with the need for
systems that can represent our best interests, distribution poses other funda-
mental problems. When a computer system acting on our behalf must interact
with another computer system that represents the interests of another, it may
well be (indeed, it is likely) that these interests are not the same. It becomes
necessary to endow such systems with the ability to cooperate and reach agree-
ments with other systems, in much the same way that we cooperate and reach
agreements with others in everyday life. This type of capability was not studied
in computer science until very recently.
Together, these trends have led to the emergence of a new field in computer
science: multiagent systems. The idea of a multiagent system is very simple. An
agent is a computer system that is capable of independent action on behalf of its
user or owner. In other words, an agent can figure out for itself what it needs to
do in order to satisfy its design objectives, rather than having to be told explicitly
what to do at any given moment. A multiagent system is one that consists of
a number of agents, which interact with one another, typically by exchanging
messages through some computer network infrastructure. In the most general
case, the agents in a multiagent system will be representing or acting on behalf of
users or owners with very different goals and motivations. In order to successfully
interact, these agents will thus require the ability to cooperate, coordinate, and
negotiate with each other, in much the same way that we cooperate, coordinate,
and negotiate with other people in our everyday lives.
This book is about multiagent systems. It addresses itself to the two key prob-
lems hinted at above.
How do we build agents that are capable of independent, autonomous action
in order to successfully carry out the tasks that we delegate to them?
How do we build agents that are capable of interacting (cooperating, coordi-
nating, negotiating) with other agents in order to successfully carry out the
tasks that we delegate to them, particularly when the other agents cannot
be assumed to share the same interests/goals?
The first problem is that of agent design, and the second problem is that of society
design. The two problems are not orthogonal - for example, in order to build a
society of agents that work together effectively, it may help if we give members
of the society models of the other agents in it. The distinction between the two
issues is often referred to as the micro/macro distinction. In the remainder of this
book, I address both of these issues in detail.
motivation for what the agents community does. This motivation comes in the
style of long-term future visions - ideas about how things might be. A word of
caution: these visions are exactly that, visions. None is likely to be realized in the
immediate future. But for each of the visions, work is underway in developing the
kinds of technologies that might be required to realize them.
Due to an unexpected system failure, a space probe approaching Sat-
urn loses contact with its Earth-based ground crew and becomes disori-
ented. Rather than simply disappearing into the void, the probe recog-
nizes that there has been a key system failure, diagnoses and isolates
the fault, and correctly re-orients itself in order to make contact with
its ground crew.
The key issue here is the ability of the space probe to act autonomously. First,
the probe needs to recognize that a fault has occurred, and must then figure out
what needs to be done and how to do it. Finally, the probe must actually do the
actions it has chosen, and must presumably monitor what happens in order to
ensure that all goes well. If more things go wrong, the probe will be required to
recognize this and respond appropriately. Notice that this is the kind of behaviour
that we (humans) find easy: we do it every day, when we miss a flight or have a flat
tyre while driving to work. But, as we shall see, it is very hard to design computer
programs that exhibit this kind of behaviour.
NASA's Deep Space 1 (DS1) mission is an example of a system that is close to
this kind of scenario. Launched from Cape Canaveral on 24 October 1998, DS1
was the first space probe to have an autonomous, agent-based control system
(Muscettola et al., 1998). Before DS1, space missions required a ground crew of
up to 300 staff to continually monitor progress. This ground crew made all neces-
sary control decisions on behalf of the probe, and painstakingly transmitted these
decisions to the probe for subsequent execution. Given the length of typical plan-
etary exploration missions, such a procedure was expensive and, if the decisions
were ever required quickly, it was simply not practical. The autonomous control
system in DS1 was capable of making many important decisions itself. This made
the mission more robust, particularly against sudden unexpected problems, and
also had the very desirable side effect of reducing overall mission costs.
The next scenario is not quite down-to-earth, but is at least closer to home.
A key air-traffic control system at the main airport of Ruritania sud-
denly fails, leaving flights in the vicinity of the airport with no air-traffic
control support. Fortunately, autonomous air-traffic control systems
in nearby airports recognize the failure of their peer, and cooperate
to track and deal with all affected flights. The potentially disastrous
situation passes without incident.
There are several key issues in this scenario. The first is the ability of systems to
take the initiative when circumstances dictate. The second is the ability of agents
to cooperate to solve problems that are beyond the capabilities of any individ-
ual agents. The kind of cooperation required by this scenario was studied exten-
sively in the Distributed Vehicle Monitoring Testbed (DVMT) project undertaken
between 1981 and 1991 (see, for example, Durfee, 1988). The DVMT simulates
a network of vehicle monitoring agents, where each agent is a problem solver
that analyses sensed data in order to identify, locate, and track vehicles moving
through space. Each agent is typically associated with a sensor, which has only a
partial view of the entire space. The agents must therefore cooperate in order to
track the progress of vehicles through the entire sensed space. Air-traffic control
systems have been a standard application of agent research since the work of
Cammarata and colleagues in the early 1980s (Cammarata et al., 1983); a recent
multiagent air-traffic control application is the OASIS system implemented for use
at Sydney airport in Australia (Ljunberg and Lucas, 1992).
Well, most of us are neither involved in designing the control systems for NASA
space probes, nor are we involved in the design of safety critical systems such as
air-traffic controllers. So let us now consider a vision that is closer to most of our
everyday lives.
After the wettest and coldest UK winter on record, you are in des-
perate need of a last minute holiday somewhere warm and dry. After
specifying your requirements to your personal digital assistant (PDA),
it converses with a number of different Web sites, which sell services
such as flights, hotel rooms, and hire cars. After hard negotiation on
your behalf with a range of sites, your PDA presents you with a package
holiday.
This example is perhaps the closest of all four scenarios to actually being realized.
There are many Web sites that will allow you to search for last minute holidays,
but at the time of writing, to the best of my knowledge, none of them engages
in active real-time negotiation in order to assemble a package specifically for you
from a range of service providers. There are many basic research problems that
need to be solved in order to make such a scenario work, such as the examples
that follow.
How do you state your preferences to your agent?
How can your agent compare different deals from different vendors?
What algorithms can your agent use to negotiate with other agents (so as to
ensure you are not 'ripped off')?
The ability to negotiate in the style implied by this scenario is potentially very
valuable indeed. Every year, for example, the European Commission puts out thou-
sands of contracts to public tender. The bureaucracy associated with managing
this process has an enormous cost. The ability to automate the tendering and
negotiation process would save enormous sums of money (taxpayers' money!).
Similar situations arise in government organizations the world over - a good
In multiagent systems, however, there are two important twists to the concur-
rent systems story.
First, because agents are assumed to be autonomous - capable of making
independent decisions about what to do in order to satisfy their design
objectives - it is generally assumed that the synchronization and coordi-
nation structures in a multiagent system are not hardwired in at design
time, as they typically are in standard concurrent/distributed systems.
We therefore need mechanisms that will allow agents to synchronize and
coordinate their activities at run time.
Second, the encounters that occur among computing elements in a multi-
agent system are economic encounters, in the sense that they are encounters
between self-interested entities. In a classic distributed/concurrent system,
all the computing elements are implicitly assumed to share a common goal
(of making the overall system function correctly). In multiagent systems, it is
assumed instead that agents are primarily concerned with their own welfare
(although of course they will be acting on behalf of some user/owner).
For these reasons, the issues studied in the multiagent systems community have
a rather different flavour to those studied in the distributed/concurrent systems
community. We are concerned with issues such as how agents can reach agree-
ment through negotiation on matters of common interest, and how agents can
dynamically coordinate their activities with agents whose goals and motives are
unknown. (It is worth pointing out, however, that I see these issues as a natural
next step for distributed/concurrent systems research.)
(because our agent will surely need to learn, plan, and so on). This is not the
case. As Oren Etzioni succinctly put it: 'Intelligent agents are ninety-nine per-
cent computer science and one percent AI' (Etzioni, 1996). When we build
an agent to carry out a task in some environment, we will very likely draw
upon AI techniques of some sort - but most of what we do will be standard
computer science and software engineering. For the vast majority of appli-
cations, it is not necessary that an agent has all the capabilities studied in
AI - for some applications, capabilities such as learning may even be unde-
sirable. In short, while we may draw upon AI techniques to build agents, we
do not need to solve all the problems of AI to build an agent.
Secondly, classical AI has largely ignored the social aspects of agency. I hope
you will agree that part of what makes us unique as a species on Earth is not
simply our undoubted ability to learn and solve problems, but our ability to
communicate, cooperate, and reach agreements with our peers. These kinds
of social ability - which we use every day of our lives - are surely just as
important to intelligent behaviour as are components of intelligence such
as planning and learning, and yet they were not studied in AI until about
1980.
(Please note that all this should not be construed as a criticism of game theory,
which is without doubt a valuable and important tool in multiagent systems, likely
to become much more widespread in use over the coming years.)
The social sciences are primarily concerned with understanding the behaviour of
human societies. Some social scientists are interested in (computational) multi-
agent systems because they provide an experimental tool with which to model
human societies. In addition, an obvious approach to the design of multiagent
systems - which are artificial societies - is to look at how a particular function
works in human societies, and try to build the multiagent system in the same way.
(An analogy may be drawn here with the methodology of AI, where it is quite com-
mon to study how humans achieve a particular kind of intelligent capability, and
then to attempt to model this in a computer program.) Is the multiagent systems
field therefore simply a subset of the social sciences?
Although we can usefully draw insights and analogies from human societies, it
does not follow that we can build artificial societies in exactly the same way. It
is notoriously hard to precisely model the behaviour of human societies, simply
because they are dependent on so many different parameters. Moreover, although
it is perfectly legitimate to design a multiagent system by drawing upon and mak-
ing use of analogies and metaphors from human societies, it does not follow that
this is going to be the best way to design a multiagent system: there are other
tools that we can use equally well (such as game theory - see above).
It seems to me that multiagent systems and the social sciences have a lot to say
to each other. Multiagent systems provide a powerful and novel tool for modelling
and understanding societies, while the social sciences represent a rich repository
of concepts for understanding and building multiagent systems - but they are
quite distinct disciplines.
Exercises
(1) [Class discussion.]
Moore's law - a well-known dictum in computing - tells us that the number of tran-
sistors that it is possible to place on an integrated circuit doubles every 18 months. This
suggests that the world's net processing capability is currently growing at an exponential rate.
Within a few decades, it seems likely that computers will outnumber humans by several
orders of magnitude - for every person on the planet there will be tens, hundreds, perhaps
thousands or millions of processors, linked together by some far distant descendant of
today's Internet. (This is not fanciful thinking: just extrapolate from the record of the past
five decades.)
In light of this, discuss the following.
What such systems might offer - what possibilities are there?
What are the challenges to make this vision happen?
Intelligent Agents
The aim of this chapter is to give you an understanding of what agents are, and
some of the issues associated with building them. In later chapters, we will see
specific approaches to building agents.
An obvious way to open this chapter would be by presenting a definition of the
term agent. After all, this is a book about multiagent systems - surely we must all
agree on what an agent is? Sadly, there is no universally accepted definition of the
term agent, and indeed there is much ongoing debate and controversy on this very
subject. Essentially, while there is a general consensus that autonomy is central
to the notion of agency, there is little agreement beyond this. Part of the difficulty
is that various attributes associated with agency are of differing importance for
different domains. Thus, for some applications, the ability of agents to learn from
their experiences is of paramount importance; for other applications, learning is
not only unimportant, it is undesirable¹.
Nevertheless, some sort of definition is important - otherwise, there is a danger
that the term will lose all meaning. The definition presented here is adapted from
Wooldridge and Jennings (1995).
An agent is a computer system that is situated in some environment,
and that is capable of autonomous action in this environment in order
to meet its design objectives.
¹Michael Georgeff, the main architect of the PRS agent system discussed in later chapters, gives
the example of an air-traffic control system he developed; the clients of the system would have been
horrified at the prospect of such a system modifying its behaviour at run time...
Figure 2.1 An agent in its environment. The agent takes sensory input from the environ-
ment, and produces as output actions that affect it. The interaction is usually an ongoing,
non-terminating one.
Figure 2.1 gives an abstract view of an agent. In this diagram, we can see the
action output generated by the agent in order to affect its environment. In most
domains of reasonable complexity, an agent will not have complete control over
its environment. It will have at best partial control, in that it can influence it. From
the point of view of the agent, this means that the same action performed twice in
apparently identical circumstances might appear to have entirely different effects,
and in particular, it may fail to have the desired effect. Thus agents in all but the
most trivial of environments must be prepared for the possibility of failure. We
can sum this situation up formally by saying that environments are in general
assumed to be nondeterministic.
Normally, an agent will have a repertoire of actions available to it. This set of
possible actions represents the agent's effectoric capability: its ability to modify
its environments. Note that not all actions can be performed in all situations. For
example, an action 'lift table' is only applicable in situations where the weight
of the table is sufficiently small that the agent can lift it. Similarly, the action
'purchase a Ferrari' will fail if insufficient funds are available to do so. Actions
therefore have preconditions associated with them, which define the possible sit-
uations in which they can be applied.
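To make the idea of preconditions concrete, here is a minimal sketch in Python; the state representation, the numbers, and the two example actions are purely illustrative assumptions, not a prescribed design. Each action pairs a precondition, which tests whether the action is applicable in the current situation, with an effect, which describes how the state changes if the action succeeds.

    # A minimal sketch of actions with preconditions; the state representation
    # and the example actions are illustrative assumptions only.
    from dataclasses import dataclass
    from typing import Any, Callable, Dict

    State = Dict[str, Any]

    @dataclass
    class Action:
        name: str
        precondition: Callable[[State], bool]  # is the action applicable in this state?
        effect: Callable[[State], State]       # how the state changes if the action succeeds

    lift_table = Action(
        name="lift table",
        precondition=lambda s: s["table_weight_kg"] <= s["agent_strength_kg"],
        effect=lambda s: {**s, "table_lifted": True},
    )

    buy_ferrari = Action(
        name="purchase a Ferrari",
        precondition=lambda s: s["funds"] >= 150_000,
        effect=lambda s: {**s, "owns_ferrari": True, "funds": s["funds"] - 150_000},
    )

    state = {"table_weight_kg": 15, "agent_strength_kg": 40, "funds": 2_000}
    for action in (lift_table, buy_ferrari):
        if action.precondition(state):
            state = action.effect(state)       # applicable: perform the action
        else:
            print(f"cannot perform '{action.name}' in this situation")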
The key problem facing an agent is that of deciding which of its actions it
should perform in order to best satisfy its design objectives. Agent architectures,
of which we shall see many examples later in this book, are really software
architectures for decision-making systems that are embedded in an environment.
At this point, it is worth pausing to consider some examples of agents (though
not, as yet, intelligent agents).
Control systems
First, any control system can be viewed as an agent. A simple (and overused)
example of such a system is a thermostat. Thermostats have a sensor for detect-
ing room temperature. This sensor is directly embedded within the environment
(i.e. the room), and it produces as output one of two signals: one that indicates
that the temperature is too low, another which indicates that the temperature is
OK. The actions available to the thermostat are 'heating on' or 'heating off'. The
action 'heating on' will generally have the effect of raising the room temperature,
but this cannot be a guaranteed effect - if the door to the room is open, for exam-
ple, switching on the heater may have no effect. The (extremely simple) decision-
making component of the thermostat implements (usually in electro-mechanical
hardware) the following rules:

too cold → heating on
temperature OK → heating off
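Expressed in software rather than hardware, the decision-making component amounts to nothing more than the following sketch (the function and parameter names are illustrative assumptions):

    # A minimal sketch of the thermostat's decision rule; names are illustrative.
    def thermostat_decide(room_temperature: float, setpoint: float) -> str:
        """Map the sensed temperature to one of the two available actions."""
        if room_temperature < setpoint:
            return "heating on"   # too cold -> heating on
        return "heating off"      # temperature OK -> heating off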
Software demons
Second, most software demons (such as background processes in the Unix operat-
ing system), which monitor a software environment and perform actions to modify
it, can be viewed as agents. An example is the X Windows program xbiff. This
utility continually monitors a user's incoming email, and indicates via a GUI icon
whether or not they have unread messages. Whereas our thermostat agent in the
previous example inhabited a physical environment - the physical world - the
xbiff program inhabits a software environment. It obtains information about
this environment by carrying out software functions (by executing system pro-
grams such as ls, for example), and the actions it performs are software actions
(changing an icon on the screen, or executing a program). The decision-making
component is just as simple as our thermostat example.
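A stripped-down demon of this kind might look as follows; this is a sketch only, and the mailbox location, the polling interval, and the way the indicator is updated are illustrative assumptions rather than a description of how xbiff itself is implemented.

    # A sketch of an xbiff-style software demon; the mailbox path, the polling
    # interval and the notification mechanism are illustrative assumptions.
    import os
    import time

    MAILBOX = os.path.expanduser("~/mbox")     # hypothetical mailbox location

    def has_unread_mail() -> bool:
        # Software 'sensing': inspect the environment via system calls.
        return os.path.exists(MAILBOX) and os.path.getsize(MAILBOX) > 0

    def update_indicator(unread: bool) -> None:
        # Software 'action': xbiff changes a GUI icon; here we simply print.
        print("You have mail" if unread else "No unread mail")

    while True:
        update_indicator(has_unread_mail())
        time.sleep(30)                         # poll the environment periodically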
To summarize, agents are simply computer systems that are capable of
autonomous action in some environment in order to meet their design objectives.
An agent will typically sense its environment (by physical sensors in the case of
agents situated in part of the real world, or by software sensors in the case of soft-
ware agents), and will have available a repertoire of actions that can be executed
to modify the environment, which may appear to respond non-deterministically
to the execution of these actions.
Environments
Russell and Norvig suggest the following classification of environment properties
(Russell and Norvig, 1995, p. 46): accessible versus inaccessible, deterministic ver-
sus non-deterministic, static versus dynamic, and discrete versus continuous.
Clearly, deterministic environments are preferable from the point of view of the
agent designer to non-deterministic environments. If there is never any uncer-
tainty about the outcome of some particular action, then an agent need never
stop to determine whether or not a particular action had a particular outcome,
and thus whether or not it needs to reconsider its course of action. In particular,
in a deterministic environment, an agent designer can assume that the actions
performed by an agent will always succeed: they will never fail to bring about
their intended effect.
Unfortunately, as Russell and Norvig (1995) point out, if an environment is
sufficiently complex, then the fact that it is actually deterministic is not much
help. To all intents and purposes, it may as well be non-deterministic. In practice,
almost all realistic environments must be regarded as non-deterministic from an
agent's perspective.
Non-determinism is closely related to dynamism. Early artificial intelligence
research on action selection focused on planning algorithms - algorithms that,
given a description of the initial state of the environment, the actions available to
an agent and their effects, and a goal state, will generate a plan (i.e. a sequence
of actions) such that when executed from the initial environment state, the plan
will guarantee the achievement of the goal (Allen et al., 1990). However, such
planning algorithms implicitly assumed that the environment in which the plan
was being executed was static - that it did not change except through the perfor-
mance of actions by the agent. Clearly, many environments (including software
environments such as computer operating systems, as well as physical environ-
ments such as the real world), do not enjoy this property - they are dynamic, with
many processes operating concurrently to modify the environment in ways that
an agent has no control over.
From an agent's point of view, dynamic environments have at least two impor-
tant properties. The first is that if an agent performs no external action between
times t0 and t1, then it cannot assume that the environment at t1 will be the same
as it was at time t0. This means that in order for the agent to select an appropriate
action to perform, it must perform information gathering actions to determine the
state of the environment (Moore, 1990). In a static environment, there is no need
for such actions. The second property is that other processes in the environment
can 'interfere' with the actions it attempts to perform. The idea is essentially the
concept of interference in concurrent systems theory (Ben-Ari, 1990). Thus if an
agent checks that the environment has some property q, and then starts execut-
ing some action a on the basis of this information, it cannot in general guarantee
that the environment will continue to have property q, while it is executing a.
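The following sketch illustrates the point; the 'door' property, the timings, and the use of threads are illustrative assumptions. The agent checks the property and begins to act, but another process changes the environment in the meantime, so the property no longer holds when the action completes.

    # A sketch of interference in a dynamic environment; the 'door' property
    # and the delays are illustrative assumptions.
    import threading
    import time

    environment = {"door_open": True}

    def agent():
        if environment["door_open"]:           # check the property...
            time.sleep(0.1)                    # ...acting takes time...
            if environment["door_open"]:
                print("walked through the door")
            else:                              # ...and the property may no longer hold
                print("action failed: the door was closed while the agent was acting")

    t = threading.Thread(target=agent)
    t.start()
    time.sleep(0.05)
    environment["door_open"] = False           # another process modifies the environment
    t.join()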
These properties suggest that static environments will be inherently simpler to
design agents for than dynamic ones. First, in a static environment, an agent need
only ever perform information gathering actions once. Assuming the information
it gathers correctly describes the environment, and that it correctly understands
the effects of its actions, then it can accurately predict the effects of its actions
on the environment, and hence how the state of the environment will evolve.
(This is in fact how most artificial intelligence planning algorithms work (Lifschitz,
1986).) Second, in a static environment, an agent never needs to worry about
synchronizing or coordinating its actions with those of other processes in the
environment (Bond and Gasser, 1988).
The final distinction made in Russell and Norvig (1995) is between discrete and
continuous environments. A discrete environment is one that can be guaranteed
to only ever be in a finite number of discrete states; a continuous one may be
in uncountably many states. Thus the game of chess is a discrete environment -
there are only a finite (albeit very large) number of states of a chess game. Russell
and Norvig (1995) give taxi driving as an example of a continuous environment.
Discrete environments are simpler to design agents for than continuous ones,
for several reasons. Most obviously, digital computers are themselves discrete-
state systems, and although they can simulate continuous systems to any desired
degree of accuracy, there is inevitably a mismatch between the two types of sys-
tems. Some information must be lost in the mapping from continuous environ-
ment to discrete representation of that environment. Thus the decision a
discrete-state agent makes about which action to select in a continuous environ-
ment will be made on the basis of information that is inherently approximate.
Finally, with finite discrete state environments, it is in principle possible to enu-
merate all possible states of the environment and the optimal action to perform
in each of these states. Such a lookup table approach to agent design is rarely
possible in practice, but it is at least in principle possible for finite, discrete state
environments.
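For a sufficiently small environment the lookup table really can be written down. The sketch below does exactly that for a toy heating domain; the states and actions are illustrative, and the approach collapses as soon as the state space grows.

    # A sketch of the lookup-table approach for a tiny, finite, discrete
    # environment; the states and actions are illustrative.
    POLICY = {
        ("too cold", "heater off"): "switch heater on",
        ("too cold", "heater on"):  "do nothing",
        ("OK",       "heater on"):  "switch heater off",
        ("OK",       "heater off"): "do nothing",
    }

    def lookup_agent(state):
        # Every possible state has been enumerated in advance, so action
        # selection is a single table lookup.
        return POLICY[state]

    print(lookup_agent(("too cold", "heater off")))   # -> switch heater on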
In summary, the most complex general class of environments are those that
are inaccessible, non-deterministic, dynamic, and continuous. Environments that
have these properties are often referred to as open (Hewitt, 1986).
Environmental properties have a role in determining the complexity of the agent
design process, but they are by no means the only factors that play a part. The sec-
ond important property that plays a part is the nature of the interaction between
agent and environment.
Originally, software engineering concerned itself with what are known as 'func-
tional' systems. A functional system is one that simply takes some input, performs
some computation over this input, and eventually produces some output. Such
systems may formally be viewed as functions f : I → O from a set I of inputs
to a set O of outputs. The classic example of such a system is a compiler, which
can be viewed as a mapping from a set I of legal source programs to a set O of
corresponding object or machine code programs.
One of the key attributes of such functional systems is that they terminate.
This means that, formally, their properties can be understood in terms of pre-
conditions and postconditions (Hoare, 1969). The idea is that a precondition φ
represents what must be true of the program's environment in order for that pro-
gram to operate correctly. A postcondition ψ represents what will be true of the
program's environment after the program terminates, assuming that the precon-
dition was satisfied when execution of the program commenced. A program is
said to be completely correct with respect to precondition φ and postcondition
ψ if it is guaranteed to terminate when it is executed from a state where the pre-
condition is satisfied, and, upon termination, its postcondition is guaranteed to
be satisfied. Crucially, it is assumed that the agent's environment, as character-
ized by its precondition φ, is only modified through the actions of the program
itself. As we noted above, this assumption does not hold for many environments.
Although the internal complexity of a functional system may be great (e.g. in the
case of a compiler for a complex programming language such as Ada), functional
programs are, in general, comparatively simple to correctly and efficiently engi-
neer. For example, functional systems lend themselves to design methods based
on 'divide and conquer'. Top-down stepwise refinement (Jones, 1990) is an exam-
ple of such a method. Semi-automatic refinement techniques are also available,
which allow a designer to refine a high-level (formal) specification of a functional
system down to an implementation (Morgan, 1994).
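To fix the idea, here is a minimal sketch of the functional view; the choice of integer square root as the function is purely illustrative. The program is a terminating function from inputs to outputs, and its precondition φ and postcondition ψ are stated explicitly as assertions.

    # A sketch of the functional view: a terminating program as a function
    # f : I -> O, with its precondition and postcondition checked explicitly.
    # The choice of integer square root is purely illustrative.
    import math

    def integer_sqrt(n: int) -> int:
        assert n >= 0                       # precondition: the input is non-negative
        r = math.isqrt(n)                   # the computation itself
        assert r * r <= n < (r + 1) ** 2    # postcondition: r is the integer square root of n
        return r                            # the program terminates with the postcondition satisfied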
Unfortunately, many computer systems that we desire to build are not func-
tional in this sense. Rather than simply computing a function of some input and
then terminating, many computer systems are reactive, in the following sense:
Reactive systems are systems that cannot adequately be described
by the relational or functional view. The relational view regards pro-
grams as functions...from an initial state to a terminal state. Typically,
the main role of reactive systems is to maintain an interaction with
their environment, and therefore must be described (and specified) in
terms of their on-going behaviour... [E]very concurrent system...must
be studied by behavioural means. This is because each individual mod-
ule in a concurrent system is a reactive subsystem, interacting with its
own environment which consists of the other modules.
(Pnueli, 1986)
There are at least three current usages of the term reactive system in computer
science. The first, oldest, usage is that by Pnueli and followers (see, for example,
Pnueli (1986), and the description above). Second, researchers in AI planning take
a reactive system to be one that is capable of responding rapidly to changes in its
environment - here the word 'reactive' is taken to be synonymous with 'respon-
sive' (see, for example, Kaelbling, 1986). More recently, the term has been used to
denote systems which respond directly to the world, rather than reason explicitly
about it (see, for example, Connah and Wavish, 1990).
Reactive systems are harder to engineer than functional ones. Perhaps the
most important reason for this is that an agent engaging in a (conceptually)
non-terminating relationship with its environment must continually make local
decisions that have global consequences. Consider a simple printer controller
agent. The agent continually receives requests to have access to the printer, and
22 Intelligent Agents
is allowed to grant access to any agent that requests it, with the proviso that it
is only allowed to grant access to one agent at a time. At some time, the agent
reasons that it will give control of the printer to process p1, rather than p2, but
that it will grant p2 access at some later time point. This seems like a reasonable
decision, when considered in isolation. But if the agent always reasons like this,
it will never grant p2 access. This issue is known as fairness (Francez, 1986). In
other words, a decision that seems entirely reasonable in a local context can have
undesirable effects when considered in the context of the system's entire history.
This is a simple example of a complex problem. In general, the decisions made
by an agent have long-term effects, and it is often difficult to understand such
long-term effects.
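The printer example can be replayed in a few lines of code; the request stream and the decision rule below are illustrative assumptions. Each individual choice looks reasonable, yet over the whole run p2 is never granted access.

    # A sketch of the fairness problem; the request stream and the decision
    # rule are illustrative. p1 and p2 both request the printer at every step.
    p2_grants = 0
    for t in range(1000):                  # a bounded stand-in for a non-terminating run
        pending = {"p1", "p2"}
        # Locally reasonable rule: grant p1 now, and promise p2 access 'later'.
        if "p1" in pending:
            granted = "p1"
        else:
            granted = "p2"
            p2_grants += 1

    print(p2_grants)                       # 0 -- p2 never gets access to the printer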
One possible solution is to have the agent explicitly reason about and predict
the behaviour of the system, and thus any temporally distant effects, at run-time.
But it turns out that such prediction is extremely hard.
Russell and Subramanian (1995) discuss the essentially identical concept of
episodic environments. In an episodic environment, the performance of an agent
is dependent on a number of discrete episodes, with no link between the perfor-
mance of the agent in different episodes. An example of an episodic environment
would be a mail sorting system (Russell and Subramanian, 1995). As with reactive
systems, episodic interactions are simpler from the agent developer's perspective
because the agent can decide what action to perform based only on the current
episode - it does not need to reason about the interactions between this and future
episodes.
Another aspect of the interaction between agent and environment is the con-
cept of real time. Put at its most abstract, a real-time interaction is simply one
in which time plays a part in the evaluation of an agent's performance (Russell
and Subramanian, 1995, p. 585). It is possible to identify several different types
of real-time interactions:
those in which a decision must be made about what action to perform within
some specified time bound;
those in which the agent must bring about some state of affairs as quickly
as possible;
those in which an agent is required to repeat some task, with the objective
being to repeat the task as often as possible.
If time is not an issue, then an agent can deliberate for as long as required in order
to select the 'best' course of action in any given scenario. Selecting the best course
of action implies search over the space of all possible courses of action, in order
to find the 'best'. Selecting the best action in this way will take time exponential in
the number of actions available to the agent². It goes without saying that for any
²If the agent has n actions available to it, then it has n! courses of action available to it (assuming
no duplicate actions).
realistic environment, such deliberation is not viable. Thus any realistic system
must be regarded as real-time in some sense.
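A quick calculation makes the point; the figures below simply evaluate n! for a few values of n, on the assumption (as in the footnote above) that a course of action is an ordering of the n available actions.

    # n! courses of action for n available actions (assuming no duplicates).
    import math

    for n in (5, 10, 15, 20):
        print(n, math.factorial(n))
    # 5   120
    # 10  3628800
    # 15  1307674368000
    # 20  2432902008176640000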
Some environments are real-time in a much stronger sense than this. For exam-
ple, the PRS, one of the best-known agent systems, had fault diagnosis on NASA's
Space Shuttle as its initial application domain (Georgeff and Lansky, 1987). In
order to be of any use, decisions in such a system must be made in milliseconds.
Agents and Objects

Objects are defined as computational entities that encapsulate some state, are
able to perform actions, or methods on this state, and communicate by message
passing. While there are obvious similarities, there are also significant differences
between agents and objects. The first is in the degree to which agents and objects
are autonomous. Recall that the defining characteristic of object-oriented pro-
gramming is the principle of encapsulation - the idea that objects can have con-
trol over their own internal state. In programming languages like Java, we can
declare instance variables (and methods) to be private, meaning they are only
accessible from within the object. (We can of course also declare them public,
meaning that they can be accessed from anywhere, and indeed we must do this
for methods so that they can be used by other objects. But the use of public
instance variables is usually considered poor programming style.) In this way, an
object can be thought of as exhibiting autonomy over its state: it has control over
it. But an object does not exhibit control over its behaviour. That is, if a method m
is made available for other objects to invoke, then they can do so whenever they
wish - once an object has made a method public, then it subsequently has no
control over whether or not that method is executed. Of course, an object must
make methods available to other objects, or else we would be unable to build a
system out of them. This is not normally an issue, because if we build a system,
then we design the objects that go in it, and they can thus be assumed to share a
'common goal'. But in many types of multiagent system (in particular, those that
contain agents built by different organizations or individuals), no such common
goal can be assumed. It cannot be taken for granted that an agent i will execute
an action (method) a just because another agent j wants it to - a may not be in
the best interests of i. We thus do not think of agents as invoking methods upon
one another, but rather as requesting actions to be performed. If j requests i to
perform a, then i may perform the action or it may not. The locus of control with
respect to the decision about whether to execute an action is thus different in
agent and object systems. In the object-oriented case, the decision lies with the
object that invokes the method. In the agent case, the decision lies with the agent
that receives the request. This distinction between objects and agents has been
nicely summarized in the following slogan.
Objects do it for free; agents do it because they want to.
Of course, there is nothing to stop us implementing agents using object-oriented
techniques. For example, we can build some kind of decision making about
whether to execute a method into the method itself, and in this way achieve a
stronger kind of autonomy for our objects. The point is that autonomy of this
kind is not a component of the basic object-oriented model.
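The contrast can be made concrete with a small sketch; the classes, the decision procedure, and the printer setting are illustrative assumptions. The object's public method executes whenever it is invoked, whereas the agent evaluates the request against its own interests and may decline.

    # A sketch contrasting the two loci of control; the classes and the
    # decision procedure are illustrative.
    class PrinterObject:
        def print_document(self, doc):
            return f"printing {doc}"           # invoked, therefore executed

    class PrinterAgent:
        def __init__(self, owner):
            self.owner = owner

        def request_print(self, sender, doc):
            # The decision lies with the receiving agent, not with the requester.
            if sender == self.owner or self.in_my_interest(sender, doc):
                return f"printing {doc}"
            return "request declined"

        def in_my_interest(self, sender, doc):
            return False                       # placeholder decision procedure

    obj = PrinterObject()
    agent = PrinterAgent(owner="i")
    print(obj.print_document("report.txt"))        # objects do it 'for free'
    print(agent.request_print("j", "report.txt"))  # agents do it if they choose to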
The second important distinction between object and agent systems is with
respect to the notion of flexible (reactive, proactive, social) autonomous be-
haviour. The standard object model has nothing whatsoever to say about how to
build systems that integrate these types of behaviour. Again, one could object that
we can build object-oriented programs that do integrate these types of behaviour.
But this argument misses the point, which is that the standard object-oriented
programming model has nothing to do with these types of behaviour.
The third important distinction between the standard object model and our
view of agent systems is that agents are each considered to have their own thread
of control - in the standard object model, there is a single thread of control in
the system. Of course, a lot of work has recently been devoted to concurrency in
object-oriented programming. For example, the Java language provides built-in
constructs for multi-threaded programming. There are also many programming
languages available (most of them admittedly prototypes) that were specifically
designed to allow concurrent object-based programming. But such languages do
not capture the idea of agents as autonomous entities. Perhaps the closest that
the object-oriented community comes is in the idea of active objects.
An active object is one that encompasses its own thread of control....
Active objects are generally autonomous, meaning that they can exhibit
some behaviour without being operated upon by another object. Pas-
sive objects, on the other hand, can only undergo a state change when
explicitly acted upon.
Thus active objects are essentially agents that do not necessarily have the ability
to exhibit flexible autonomous behaviour.
To summarize, the traditional view of an object and our view of an agent have
at least three distinctions:
agents embody a stronger notion of autonomy than objects, and, in partic-
ular, they decide for themselves whether or not to perform an action on
request from another agent;
agents are capable of flexible (reactive, proactive, social) behaviour, and the
standard object model has nothing to say about such types of behaviour;
and
a multiagent system is inherently multi-threaded, in that each agent is
assumed to have at least one thread of control.
Despite these differences, some expert systems (particularly those that perform
real-time control tasks) look very much like agents. A good example is the
ARCHON system, discussed in Chapter 9 (Jennings et al., 1996a).
Agents as Intentional Systems

What objects can be described by the intentional stance? As it turns out, almost
any automaton can. For example, consider a light switch as follows.
It is perfectly coherent to treat a light switch as a (very cooperative)
agent with the capability of transmitting current at will, who invariably
transmits current when it believes that we want it transmitted and not
otherwise; flicking the switch is simply our way of communicating our
desires.
(Shoham, 1990, p. 6)
And yet most adults in the modern world would find such a description absurd -
perhaps even infantile. Why is this? The answer seems to be that while the inten-
tional stance description is perfectly consistent with the observed behaviour of a
light switch, and is internally consistent,
. . .it does not buy us anything, since we essentially understand the
mechanism sufficiently to have a simpler, mechanistic description of
its behaviour.
(Shoham, 1990, p. 6)
Put crudely, the more we know about a system, the less we need to rely on ani-
mistic, intentional explanations of its behaviour - Shoham observes that the move
from an intentional stance to a technical description of behaviour correlates well
with Piaget's model of child development, and with the scientific development of humankind generally (Shoham, 1990). Children will use animistic explanations
of objects - such as light switches - until they grasp the more abstract techni-
cal concepts involved. Similarly, the evolution of science has been marked by a
gradual move from theological/animistic explanations to mathematical ones.
Another alternative is the design stance. With the design stance, we use knowledge
of what purpose a system is supposed to fulfil in order to predict how it behaves.
Dennett gives the example of an alarm clock (see pp. 37-39 of Dennett, 1996).
When someone presents us with an alarm clock, we do not need to make use of
physical laws in order to understand its behaviour. We can simply make use of
the fact that all alarm clocks are designed to wake people up if we set them with
a time. No understanding of the clock's mechanism is required to justify such an
understanding - we know that all alarm clocks have this behaviour.
However, with very complex systems, even if a complete, accurate picture of the
system's architecture and working is available, a physical or design stance expla-
nation of its behaviour may not be practicable. Consider a computer. Although we
might have a complete technical description of a computer available, it is hardly
practicable to appeal to such a description when explaining why a menu appears
when we click a mouse on an icon. In such situations, it may be more appropriate
to adopt an intentional stance description, if that description is consistent, and
simpler than the alternatives.
Note that the intentional stance is, in computer science terms, nothing more
than an abstraction tool. It is a convenient shorthand for talking about complex
systems, which allows us to succinctly predict and explain their behaviour without
having to understand how they actually work. Now, much of computer science is
concerned with looking for good abstraction mechanisms, since these allow sys-
tem developers to manage complexity with greater ease. The history of program-
ming languages illustrates a steady move away from low-level machine-oriented
views of programming towards abstractions that are closer to human experience.
Procedural abstraction, abstract data types, and, most recently, objects are exam-
ples of this progression. So, why not use the intentional stance as an abstraction tool in computing - to explain, understand, and, crucially, program computer systems?
Abstract Architectures for Intelligent Agents
A run of an agent in an environment is a finite sequence of interleaved environment states and actions. Let

R be the set of all such possible finite sequences (over E and Ac);

R^Ac be the subset of these that end with an action; and

R^E be the subset of these that end with an environment state.

We will use r, r', ... to stand for members of R.
In order to represent the effect that an agent's actions have on an environment, we introduce a state transformer function (cf. Fagin et al., 1995, p. 154):

τ : R^Ac → ℘(E).
Thus a state transformer function maps a run (assumed to end with the action of
an agent) to a set of possible environment states - those that could result from
performing the action.
There are two important points to note about this definition. First, environ-
ments are assumed to be history dependent. In other words, the next state of
an environment is not solely determined by the action performed by the agent
and the current state of the environment. The actions made earlier by the agent
also play a part in determining the current state. Second, note that this definition
allows for non-determinism in the environment. There is thus uncertainty about
the result of performing an action in some state.
If τ(r) = ∅ (where r is assumed to end with an action), then there are no possible successor states to r. In this case, we say that the system has ended its run. We will also assume that all runs eventually terminate.
Formally, we say an environment Env is a triple Env = (E, e0, τ), where E is a set of environment states, e0 ∈ E is an initial state, and τ is a state transformer function.
We now need to introduce a model of the agents that inhabit systems. We model agents as functions which map runs (assumed to end with an environment state) to actions (cf. Russell and Subramanian, 1995, pp. 580, 581):

Ag : R^E → Ac.
Thus an agent makes a decision about what action to perform based on the history
of the system that it has witnessed to date.
Notice that while environments are implicitly non-deterministic, agents are assumed to be deterministic. Let AG be the set of all agents.
We say a system is a pair containing an agent and an environment. Any system will have associated with it a set of possible runs; we denote the set of runs of agent Ag in environment Env by R(Ag, Env). For simplicity, we will assume that R(Ag, Env) contains only terminated runs, i.e. runs r such that r has no possible successor states: τ(r) = ∅. (We will thus not consider infinite runs for now.)
Formally, a sequence

(e0, α0, e1, α1, e2, ...)

represents a run of an agent Ag in environment Env = (E, e0, τ) if

(1) e0 is the initial state of Env;

(2) α0 = Ag(e0); and

(3) for u > 0, e_u ∈ τ((e0, α0, ..., α_{u-1})), where α_u = Ag((e0, α0, ..., e_u)).
Two agents Ag1 and Ag2 are said to be behaviourally equivalent with respect to environment Env if and only if R(Ag1, Env) = R(Ag2, Env), and simply
behaviourally equivalent if and only if they are behaviourally equivalent with
respect to all environments.
Notice that so far, I have said nothing at all about how agents are actually imple-
mented; we will return to this issue later.
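To make these abstract definitions concrete, the following sketch renders them directly in Python (a language not otherwise used in this book). The names Environment and generate_run, and the toy two-state example at the end, are invented purely for illustration; the sketch follows the definitions above rather than prescribing any particular implementation.

import random

# A run is represented as a tuple that interleaves environment states and
# actions, always beginning with a state: (e0, a0, e1, a1, ...).

class Environment:
    def __init__(self, states, initial_state, tau):
        self.states = states                  # E: the set of environment states
        self.initial_state = initial_state    # e0
        self.tau = tau                        # state transformer: run ending in an action -> set of states

def generate_run(agent, env, max_steps=10):
    """Build one run of `agent` in `env`, stopping when tau returns no successors."""
    run = (env.initial_state,)
    for _ in range(max_steps):
        action = agent(run)                   # agents map runs ending in a state to actions
        run = run + (action,)
        successors = env.tau(run)             # the states that could result from this action
        if not successors:                    # empty set: the run has ended
            break
        run = run + (random.choice(sorted(successors)),)   # environments may be non-deterministic
    return run

# A toy example: two states, two actions, runs of bounded length.
E = {"e0", "e1"}

def tau(run):
    return E if len(run) < 6 else set()       # allow three actions, then end the run

def agent(run):
    return "a0" if len(run) % 4 == 1 else "a1"

print(generate_run(agent, Environment(E, "e0", tau)))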
Perception
Viewing agents at this abstract level makes for a pleasantly simple analysis. How-
ever, it does not help us to construct them. For this reason, we will now begin
to refine our abstract model of agents, by breaking it down into sub-systems in
exactly the way that one does in standard software engineering. As we refine our
view of agents, we find ourselves making design choices that mostly relate to the
subsystems that go to make up an agent - what data and control structures will be
present. An agent architecture is essentially a map of the internals of an agent - its
data structures, the operations that may be performed on these data structures,
and the control flow between these data structures. Later in this book, we will dis-
cuss a number of different types of agent architecture, with very different views
on the data structures and algorithms that will be present within an agent. In the
remainder of this section, however, we will survey some fairly high-level design
decisions. The first of these is the separation of an agent's decision function into
perception and action subsystems: see Figure 2.2.
Figure 2.2 Perception and action subsystems.
The idea is that the function see captures the agent's ability to observe its environment, whereas the action function represents the agent's decision-making process. The see function might be implemented in hardware in the case of an agent situated in the physical world: for example, it might be a video camera or an infrared sensor on a mobile robot. For a software agent, the sensors might be system commands that obtain information about the software environment, such as ls, finger, or suchlike. The output of the see function is a percept - a perceptual input. Let Per be a (non-empty) set of percepts. Then see is a function

see : E → Per

which maps environment states to percepts, and action is a function

action : Per* → Ac

which maps sequences of percepts to actions. An agent Ag is now considered to be a pair Ag = (see, action), consisting of a see function and an action function.
These simple definitions allow us to explore some interesting properties of agents and perception. Suppose that we have two environment states, e1 ∈ E and e2 ∈ E, such that e1 ≠ e2, but see(e1) = see(e2). Then two different environment states are mapped to the same percept, and hence the agent would receive the same perceptual information from different environment states. As far as the agent is concerned, therefore, e1 and e2 are indistinguishable. To make this example concrete, let us return to the thermostat example. Let x represent the statement

'the room temperature is OK'

and let y represent the statement

'John Major is Prime Minister'.
If these are the only two facts about our environment that we are concerned with, then the set E of environment states contains exactly four elements:

e1 = {¬x, ¬y},  e2 = {¬x, y},  e3 = {x, ¬y},  e4 = {x, y}.
Thus in state e1, the room temperature is not OK, and John Major is not Prime Minister; in state e2, the room temperature is not OK, and John Major is Prime Minister. Now, our thermostat is sensitive only to temperatures in the room. This room temperature is not causally related to whether or not John Major is Prime Minister. Thus the states where John Major is and is not Prime Minister are literally indistinguishable to the thermostat. Formally, the see function for the thermostat would have two percepts in its range, p1 and p2, indicating that the temperature is too cold or OK, respectively. The see function for the thermostat would behave as follows:

see(e) = p1 if e ∈ {e1, e2}, and see(e) = p2 if e ∈ {e3, e4}.
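The thermostat example can also be written out as a small Python sketch. The dictionary E, the percept names p1 and p2, and the helper indistinguishable() are simply one possible encoding of the states and functions described above, not anything fixed by the text.

# Each state is encoded as a pair (temperature_ok, john_major_is_pm).
E = {
    "e1": (False, False),
    "e2": (False, True),
    "e3": (True, False),
    "e4": (True, True),
}

def see(e):
    temperature_ok, _ = E[e]                  # the thermostat is blind to the second fact
    return "p2" if temperature_ok else "p1"

def action(percepts):
    # act on the most recent percept only
    return "heating on" if percepts[-1] == "p1" else "heating off"

def indistinguishable(e, e_prime):
    return see(e) == see(e_prime)

print(see("e1"), see("e2"))                   # p1 p1
print(indistinguishable("e1", "e2"))          # True: the states differ only in the irrelevant fact
print(action(["p2", "p1"]))                   # heating on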
How to Tell an Agent What to Do

We build agents in order to carry out tasks for us, and so the question arises of how to specify these tasks: how to tell the agent what to do. One way to specify the task would
be simply to write a program for the agent to execute. The obvious advantage of
this approach is that we are left in no uncertainty about what the agent will do; it
will do exactly what we told it to, and no more. But the very obvious disadvantage
is that we have to think about exactly how the task will be carried out ourselves - if unforeseen circumstances arise, the agent executing the task will be unable to respond accordingly. So, more usually, we want to tell our agent what to do without telling it how to do it. One way of doing this is to define tasks indirectly, via some kind of performance measure. There are several ways in which such a
performance measure can be defined. The first is to associate utilities with states
of the environment.
Utility functions
A utility is a numeric value representing how 'good' the state is: the higher the
utility, the better. The task of the agent is then to bring about states that maximize
utility - we do not specify to the agent how this is to be done. In this approach, a
task specification would simply be a function

u : E → ℝ

which associates a real value with every environment state. Given such a perfor-
mance measure, we can then define the overall utility of an agent in some partic-
ular environment in several different ways. One (pessimistic) way is to define the
utility of the agent as the utility of the worst state that might be encountered by
the agent; another might be to define the overall utility as the average utility of all
states encountered. There is no right or wrong way: the measure depends upon
the kind of task you want your agent to carry out.
The main disadvantage of this approach is that it assigns utilities to local states; it is difficult to specify a long-term view when assigning utilities to individual states. To get around this problem, we can specify a task as a function which assigns a utility not to individual states, but to runs themselves:

u : R → ℝ.
If we are concerned with agents that must operate independently over long peri-
ods of time, then this approach appears more appropriate to our purposes. One
well-known example of the use of such a utility function is in the Tileworld
(Pollack, 1990). The Tileworld was proposed primarily as an experimental envi-
ronment for evaluating agent architectures. It is a simulated two-dimensional grid
environment on which there are agents, tiles, obstacles, and holes. An agent can
move in four directions, up, down, left, or right, and if it is located next to a tile, it
can push it. An obstacle is a group of immovable grid cells: agents are not allowed
to travel freely through obstacles. Holes have to be filled up with tiles by the agent.
An agent scores points by filling holes with tiles, with the aim being to fill as many holes as possible.
Figure 2.4 Three scenarios in the Tileworld are (a) the agent detects a hole ahead, and
begins to push a tile towards it; (b) the hole disappears before the agent can get to it -
the agent should recognize this change in the environment, and modify its behaviour
appropriately; and (c) the agent was pushing a tile north, when a hole appeared to its
right; it would do better to push the tile to the right, than to continue to head north.
right of the agent (Figure 2.4(c)). The agent is more likely to be able to fill this hole than its originally planned one, for the simple reason that it only has to push the tile one step, rather than four. All other things being equal, the chances of the hole on the right still being there when the agent arrives are four times greater.
Assuming that the utility function u has some upper bound on the utilities it assigns (i.e. that there exists a k ∈ ℝ such that for all r ∈ R, we have u(r) < k), then we can talk about optimal agents: the optimal agent is the one that maximizes expected utility.
Let us write P(r | Ag, Env) to denote the probability that run r occurs when agent Ag is placed in environment Env. Clearly,

Σ_{r ∈ R(Ag,Env)} P(r | Ag, Env) = 1.

Then the optimal agent Ag_opt in an environment Env is defined as the one that maximizes expected utility:

Ag_opt = arg max_{Ag ∈ AG} Σ_{r ∈ R(Ag,Env)} u(r) P(r | Ag, Env).    (2.1)
Let AG_m denote the subset of AG containing just those agents that can be implemented on a given machine m, and assume as before a utility function u : R → ℝ. Then we can replace Equation (2.1) with the following, which more precisely defines the properties of the desired agent Ag_opt:

Ag_opt = arg max_{Ag ∈ AG_m} Σ_{r ∈ R(Ag,Env)} u(r) P(r | Ag, Env).    (2.2)

The subtle change in (2.2) is that we are no longer looking for our agent from the set of all possible agents AG, but from the set AG_m of agents that can actually be implemented on the machine that we have for the task.
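As a rough illustration of Equations (2.1) and (2.2), the Python fragment below computes expected utilities for a finite set of candidate agents, each summarized simply by its possible runs and their probabilities. All names and numbers are invented; a real setting would derive the run sets and probabilities from the agent and the environment.

def expected_utility(runs, u, P):
    """Sum of u(r) * P(r | Ag, Env) over the possible runs of an agent."""
    return sum(u[r] * P[r] for r in runs)

def optimal_agent(agents, u, P):
    """Return the agent maximizing expected utility, as in Equation (2.1)."""
    return max(agents, key=lambda ag: expected_utility(agents[ag], u, P))

u = {"r1": 5.0, "r2": 1.0, "r3": 4.0}                 # utilities of runs
P = {"r1": 0.5, "r2": 0.5, "r3": 1.0}                 # probabilities, summing to 1 per agent
agents = {
    "Ag1": ["r1", "r2"],    # expected utility 0.5*5.0 + 0.5*1.0 = 3.0
    "Ag2": ["r3"],          # expected utility 1.0*4.0 = 4.0
}
print(optimal_agent(agents, u, P))                    # Ag2

Restricting the dictionary of candidate agents to those implementable on a particular machine gives the bounded form of Equation (2.2).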
Utility-based approaches to specifying tasks for agents have several disadvan-
tages. The most important of these is that it is very often difficult to derive an
appropriate utility function; the Tileworld is a useful environment in which to
experiment with agents, but it represents a gross simplification of real-world sce-
narios. The second is that usually we find it more convenient to talk about tasks
in terms of 'goals to be achieved' rather than utilities. This leads us to what I call
predicate task specifications.
Task environments
A task environment is defined to be a pair (Env, Ψ), where Env is an environment, and

Ψ : R → {0, 1}

is a predicate over runs. Let TE be the set of all task environments. A task environment thus specifies:

the properties of the system the agent will inhabit (i.e. the environment Env); and also

the criteria by which an agent will be judged to have either failed or succeeded in its task (i.e. the specification Ψ).

Given a task environment (Env, Ψ), we write R_Ψ(Ag, Env) to denote the set of all runs of the agent Ag in the environment Env that satisfy Ψ. Formally,

R_Ψ(Ag, Env) = {r | r ∈ R(Ag, Env) and Ψ(r)}.
The notion of a predicate task specification may seem a rather abstract way of
describing tasks for an agent to carry out. In fact, it is a generalization of certain
very common forms of tasks. Perhaps the two most common types of tasks that
we encounter are achievement tasks and maintenance tasks.
(1) Achievement tasks. Those of the form 'achieve state of affairs φ'.

(2) Maintenance tasks. Those of the form 'maintain state of affairs ψ'.

Intuitively, an achievement task is specified by a number of goal states; the agent is required to bring about one of these goal states (we do not care which one - all are considered equally good). Achievement tasks are probably the most commonly studied form of task in AI. Many well-known AI problems (e.g. the Blocks World) are achievement tasks. A task specified by a predicate Ψ is an achievement task if we can identify some subset G of environment states E such that Ψ(r) is true just in case one or more of G occur in r; an agent is successful if it is guaranteed to bring about one of the states G, that is, if every run of the agent in the environment results in one of the states G.
Formally, the task environment (Env, Ψ) specifies an achievement task if and only if there is some set G ⊆ E such that for all r ∈ R(Ag, Env), the predicate Ψ(r) is true if and only if there exists some e ∈ G such that e occurs in r. We refer to
the set G of an achievement task environment as the goal states of the task; we use (Env, G) to denote an achievement task environment with goal states G and environment Env.
A useful way to think about achievement tasks is as the agent playing a game against the environment. In the terminology of game theory (Binmore, 1992), this is exactly what is meant by a 'game against nature'. The environment and agent both begin in some state; the agent takes a turn by executing an action, and the environment responds with some state; the agent then takes another turn, and so on. The agent 'wins' if it can force the environment into one of the goal states G.
Just as many tasks can be characterized as problems where an agent is required
to bring about some state of affairs, so many others can be classified as problems
where the agent is required to avoid some state of affairs. As an extreme example,
consider a nuclear reactor agent, the purpose of which is to ensure that the reactor
never enters a 'meltdown' state. Somewhat more mundanely, we can imagine a
software agent, one of the tasks of which is to ensure that a particular file is
never simultaneously open for both reading and writing. We refer to such task
environments as maintenance task environments.
A task environment with specification Ψ is said to be a maintenance task environment if we can identify some subset B of environment states, such that Ψ(r) is false if any member of B occurs in r, and true otherwise. Formally, (Env, Ψ) is a maintenance task environment if there is some B ⊆ E such that Ψ(r) is true if and only if no member of B occurs in r, for all r ∈ R(Ag, Env). We refer to B as the failure set. As with achievement task environments, we write (Env, B) to denote a maintenance task environment with environment Env and failure set B.
It is again useful to think of maintenance tasks as games. This time, the agent wins if it manages to avoid all the states in B. The environment, in the role of opponent, is attempting to force the agent into B; the agent is successful if it has a winning strategy for avoiding B.
More complex tasks might be specified by combinations of achievement and maintenance tasks. A simple combination might be 'achieve any one of the states in G while avoiding all the states in B'. More complex combinations are of course also
possible.
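One way such predicate specifications might be written down is sketched below in Python: achievement tasks are parametrized by a goal set G and maintenance tasks by a failure set B, exactly as above. The helper names are invented for this illustration.

def achievement_psi(goal_states):
    """Psi(r) is true if some goal state occurs in the run r."""
    return lambda run: any(e in goal_states for e in run)

def maintenance_psi(failure_states):
    """Psi(r) is true if no failure state occurs in the run r."""
    return lambda run: all(e not in failure_states for e in run)

def successful(runs, psi):
    """An agent succeeds if every one of its runs satisfies Psi."""
    return all(psi(r) for r in runs)

run = ["e0", "a0", "e1", "a1", "e2"]                 # states and actions interleaved
print(achievement_psi({"e2"})(run))                  # True: goal state e2 occurs in the run
print(maintenance_psi({"e9"})(run))                  # True: failure state e9 never occurs
print(successful([run], maintenance_psi({"e1"})))    # False: e1 is a failure state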
Synthesizing Agents
Knowing that there exists an agent which will succeed in a given task environment
is helpful, but it would be more helpful if, knowing this, we also had such an agent
to hand. How do we obtain such an agent? The obvious answer is to 'manually'
implement the agent from the specification. However, there are at least two other
possibilities (see Wooldridge (1997) for a discussion):
(1) we can try to develop an algorithm that will automatically synthesize such
agents for us from task environment specifications; or
(2) we can try to develop an algorithm that will directly execute agent specifica-
tions in order to produce the appropriate behaviour.
In this section, I briefly consider these possibilities, focusing primarily on agent
synthesis.
Agent synthesis is, in effect, automatic programming: the goal is to have a pro-
gram that will take as input a task environment, and from this task environment
automatically generate an agent that succeeds in this environment. Formally, an
agent synthesis algorithm syn can be understood as a function

syn : TE → (AG ∪ {⊥}).

Note that the function syn can output an agent, or else output ⊥ - think of ⊥ as being like null in Java. Now, we will say a synthesis algorithm is
sound if, whenever it returns an agent, this agent succeeds in the task environ-
ment that is passed as input; and
complete if it is guaranteed to return an agent whenever there exists an agent
that will succeed in the task environment given as input.
Thus a sound and complete synthesis algorithm will only output ⊥ given input (Env, Ψ) when no agent exists that will succeed in (Env, Ψ).
Formally, a synthesis algorithm syn is sound if it satisfies the following condition:

syn((Env, Ψ)) = Ag implies R(Ag, Env) = R_Ψ(Ag, Env).

Similarly, syn is complete if it satisfies the following condition:

∃Ag ∈ AG such that R(Ag, Env) = R_Ψ(Ag, Env) implies syn((Env, Ψ)) ≠ ⊥.
Notes and Further Reading

(1995) presents much useful material. The definition of agents presented here is
based on Wooldridge and Jennings (1995), which also contains an extensive review
of agent architectures and programming languages. The question of 'what is an
agent' is one that continues to generate some debate; a collection of answers may
be found in Müller et al. (1997). The relationship between agents and objects has not been widely discussed in the literature, but see Gasser and Briot (1992). Other
interesting and readable introductions to the idea of intelligent agents include
Kaelbling (1986) and Etzioni (1993).
The abstract model of agents presented here is based on that given in Gene-
sereth and Nilsson (1987, Chapter 13), and also makes use of some ideas from
Russell and Wefald (1991) and Russell and Subramanian (1995). The properties of
perception as discussed in this section lead to knowledge theory, a formal analy-
sis of the information implicit within the state of computer processes, which has
had a profound effect in theoretical computer science: this issue is discussed in
Chapter 12.
The relationship between artificially intelligent agents and software complexity
has been discussed by several researchers: Simon (1981) was probably the first.
More recently, Booch (1994) gives a good discussion of software complexity and
the role that object-oriented development has to play in overcoming it. Russell
and Norvig (1995) introduced the five-point classification of environments that we
reviewed here, and distinguished between the 'easy' and 'hard' cases. Kaelbling
(1986) touches on many of the issues discussed here, and Jennings (1999) also
discusses the issues associated with complexity and agents.
The relationship between agent and environment, and, in particular, the prob-
lem of understanding how a given agent will perform in a given environment,
has been studied empirically by several researchers. Pollack and Ringuette (1990)
introduced the Tileworld, an environment for experimentally evaluating agents
that allowed a user to experiment with various environmental parameters (such
as the rate at which the environment changes - its dynamism). Building on this
work, Kinny and Georgeff (1991) investigated how a specific class of agents, based
on the belief-desire-intention model (Wooldridge, 2000b), could be tailored to per-
form well in environments with differing degrees of change and complexity. An
attempt to prove some results corresponding to Kinny and Georgeff (1991) was
Wooldridge and Parsons (1999); an experimental investigation of some of these relationships, building on Kinny and Georgeff (1991), was Schut and Wooldridge (2000). An informal discussion on the relationship between agent and environ-
ment is Muller (1999).
In artificial intelligence, the planning problem is most closely related to
achievement-based task environments (Allen et al., 1990). STRIPS was the archety-
pal planning system (Fikes and Nilsson, 1971). The STRIPS system is capable of
taking a description of the initial environment state e0, a specification of the goal to be achieved, E_good, and the actions Ac available to an agent, and generates a sequence of actions π ∈ Ac* such that when executed from e0, π will achieve
one of the states E_good. The initial state, goal state, and actions were characterized in STRIPS using a subset of first-order logic. Bylander showed that the (propositional) STRIPS decision problem (given e0, Ac, and E_good specified in propositional logic, does there exist a π ∈ Ac* such that π achieves E_good?) is PSPACE-complete (Bylander, 1994).
More recently, there has been renewed interest by the artificial intelligence plan-
ning community in decision theoretic approaches to planning (Blythe, 1999). One
popular approach involves representing agents and their environments as 'par-
tially observable Markov decision processes' (POMDPs) (Kaelbling et al., 1998). Put
simply, the goal of solving a POMDP is to determine an optimal policy for acting in
an environment in which there is uncertainty about the environment state (cf. our
visibility function), and which is non-deterministic. Work on POMDP approaches to agent design is at an early stage, but shows promise for the future.
The discussion on task specifications is adapted from Wooldridge (2000a) and
Wooldridge and Dunne (2000).
Exercises
(1) [Level 1.]
Give other examples of agents (not necessarily intelligent) that you know of. For each,
define as precisely as possible the following.
(1) The environment that the agent occupies (physical, software, etc.), the states that
this environment can be in, and whether the environment is: accessible or inaccessi-
ble; deterministic or non-deterministic; episodic or non-episodic; static or dynamic;
discrete or continuous.
(2) The action repertoire available to the agent, and any preconditions associated with
these actions.
(3) The goal, or design objectives of the agent - what it is intended to achieve.
(2) [Level 1.]
Prove the following.
(1) For every purely reactive agent, there is a behaviourally equivalent standard agent.
(2) There exist standard agents that have no behaviourally equivalent purely reactive
agent.
Deductive Reasoning Agents

Figure 3.1 A robotic agent that contains a symbolic description of its environment: raw sensor (pixel) input is interpreted into beliefs such as Dist(me, d1) = 3ft and Door(d1), on the basis of which the agent selects an action (Brake!).
It is not difficult to see how formulae such as these can be used to represent the
properties of some environment. The database is the information that the agent
has about its environment. An agent's database plays a somewhat analogous role
to that of belief in humans. Thus a person might have a belief that valve 221 is open - the agent might have the predicate Open(valve221) in its database. Of
course, just like humans, agents can be wrong. Thus I might believe that valve 221
is open when it is in fact closed; the fact that an agent has Open(valve221) in its
database does not mean that valve 221 (or indeed any valve) is open. The agent's
sensors may be faulty, its reasoning may be faulty, the information may be out
of date, or the interpretation of the formula Open(valve221) intended by the
agent's designer may be something entirely different.
Let L be the set of sentences of classical first-order logic, and let D = ℘(L) be the set of L databases, i.e. the set of sets of L-formulae. The internal state of an agent is then an element of D. We write Δ, Δ1, ... for members of D. An agent's decision-making process is modelled through a set of deduction rules, ρ. These are simply rules of inference for the logic. We write Δ ⊢ρ φ if the formula φ can be proved from the database Δ using only the deduction rules ρ.
Function: Action Selection as Theorem Proving

1. function action(Δ : D) returns an action Ac
2. begin
3.   for each α ∈ Ac do
4.     if Δ ⊢ρ Do(α) then
5.       return α
6.     end-if
7.   end-for
8.   for each α ∈ Ac do
9.     if Δ ⊬ρ ¬Do(α) then
10.      return α
11.    end-if
12.  end-for
13.  return null
14. end function action
see : S → Per.

Similarly, our next function has the form

next : D × Per → D.

It thus maps a database and a percept to a new database. However, an agent's action selection function, which has the signature

action : D → Ac,

is defined in terms of its deduction rules. The pseudo-code definition of this function is given in Figure 3.2.
The idea is that the agent programmer will encode the deduction rules ρ and database Δ in such a way that if a formula Do(α) can be derived, where α is a term that denotes an action, then α is the best action to perform. Thus, in the first part of the function (lines (3)-(7)), the agent takes each of its possible actions α in turn, and attempts to prove the formula Do(α) from its database (passed as a parameter to the function) using its deduction rules ρ. If the agent succeeds in proving Do(α), then α is returned as the action to be performed.
What happens if the agent fails to prove Do(α), for all actions α ∈ Ac? In this case, it attempts to find an action that is consistent with the rules and database, i.e. one that is not explicitly forbidden. In lines (8)-(12), therefore, the agent attempts to find an action α ∈ Ac such that ¬Do(α) cannot be derived from
its database using its deduction rules. If it can find such an action, then this is returned as the action to be performed. If, however, the agent fails to find an action that is at least consistent, then it returns a special action null (or noop), indicating that no action has been selected.
In this way, the agent's behaviour is determined by the agent's deduction rules (its 'program') and its current database (representing the information the agent has about its environment).
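As an illustration only, the function of Figure 3.2 might be rendered in Python roughly as follows. The helper prove() is an invented stand-in for a real theorem prover (here, forward chaining to a fixpoint over simple premise/conclusion rules), and the string encoding of formulae is purely for convenience.

def prove(delta, rules, goal):
    """Crude stand-in for the deduction relation: forward chaining to a fixpoint."""
    derived = set(delta)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return goal in derived

def select_action(delta, rules, actions):
    # lines (3)-(7): look for an action that is explicitly prescribed
    for a in actions:
        if prove(delta, rules, f"Do({a})"):
            return a
    # lines (8)-(12): otherwise look for an action that is not explicitly forbidden
    for a in actions:
        if not prove(delta, rules, f"not Do({a})"):
            return a
    return None                                  # the null action

delta = {"In(0,0)", "Dirt(0,0)"}
rules = [({"In(0,0)", "Dirt(0,0)"}, "Do(suck)")]
print(select_action(delta, rules, ["suck", "forward", "turn"]))   # suck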
To illustrate these ideas, let us consider a small example (based on the vacuum
cleaning world example of Russell and Norvig (1995, p. 51)). The idea is that we
have a small robotic agent that will clean up a house. The robot is equipped with a
sensor that will tell it whether it is over any dirt, and a vacuum cleaner that can be
used to suck up dirt. In addition, the robot always has a definite orientation (one of north, south, east, or west). In addition to being able to suck up dirt, the agent can move forward one 'step' or turn right 90°. The agent moves around a room, which is divided grid-like into a number of equally sized squares (conveniently corresponding to the unit of movement of the agent). We will assume that our agent does nothing but clean - it never leaves the room, and further, we will assume in the interests of simplicity that the room is a 3 x 3 grid, and the agent always starts in grid square (0, 0) facing north.
To summarize, our agent can receive a percept dirt (signifying that there is dirt beneath it), or null (indicating no special information). It can perform any one of three possible actions: forward, suck, or turn. The goal is to traverse the room
continually searching for and removing dirt. See Figure 3.3 for an illustration of
the vacuum world.
First, note that we make use of three simple domain predicates in this exercise: In(x, y) means the agent is at square (x, y); Dirt(x, y) means there is dirt at square (x, y); and Facing(d) means the agent is facing direction d.
Now we specify our next function. This function must look at the perceptual information obtained from the environment (either dirt or null), and generate a new database which includes this information. But, in addition, it must remove old or irrelevant information, and also, it must try to figure out the new location and orientation of the agent. We will therefore specify the next function in several parts. First, let us write old(Δ) to denote the set of 'old' information in a database, which we want the update function next to remove:

old(Δ) = {P(t1, ..., tn) | P ∈ {In, Dirt, Facing} and P(t1, ..., tn) ∈ Δ}.

Next, we require a function new, which gives the set of new predicates to add to the database. This function has the signature

new : D × Per → D.
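A rough Python rendering of old and of the overall shape of the next function is given below. The treatment of the agent's position and orientation is deliberately simplified (they are carried over unchanged), and the name next_db is used to avoid shadowing Python's built-in next.

def old(delta):
    """The 'old' information that the update should remove from the database."""
    return {p for p in delta if p.startswith(("In(", "Dirt(", "Facing("))}

def new(delta, percept):
    # A fuller version would compute the agent's new location and orientation;
    # here the old ones are simply carried over unchanged.
    carried = {p for p in delta if p.startswith(("In(", "Facing("))}
    if percept == "dirt":
        location = next(p for p in delta if p.startswith("In("))
        carried.add(location.replace("In", "Dirt"))
    return carried

def next_db(delta, percept):
    return (delta - old(delta)) | new(delta, percept)

delta = {"In(0,0)", "Facing(north)"}
print(sorted(next_db(delta, "dirt")))    # ['Dirt(0,0)', 'Facing(north)', 'In(0,0)']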
Now we can move on to the rules that govern our agent's behaviour. The deduction rules have the form

φ(...) → ψ(...),

where φ and ψ are predicates over some arbitrary list of constants and variables. The idea being that if φ matches against the agent's database, then ψ can be concluded, with any variables in ψ instantiated.
The first rule deals with the basic cleaning action of the agent: this rule will take priority over all other possible behaviours of the agent (such as navigation):

In(x, y) ∧ Dirt(x, y) → Do(suck).

Hence, if the agent is at location (x, y) and it perceives dirt, then the prescribed action will be to suck up dirt. Otherwise, the basic action of the agent will be to traverse the world. Taking advantage of the simplicity of our environment, we will hardwire the basic navigation algorithm, so that the robot will always move from (0, 0) to (0, 1) to (0, 2) and then to (1, 2), (1, 1) and so on. Once the agent reaches the far corner of the grid, it returns to (0, 0) and begins the traversal again.
Agent-Oriented Programming
Yoav Shoham has proposed a 'new programming paradigm, based on a societal view of computation' which he calls agent-oriented programming. The key idea which informs AOP is that of directly programming agents in terms of mentalistic notions (such as belief, desire, and intention) that agent theorists have developed to represent the properties of agents. The motivation behind the proposal is that
humans use such concepts as an abstraction mechanism for representing the
properties of complex systems. In the same way that we use these mentalistic
notions to describe and explain the behaviour of humans, so it might be useful to
use them to program machines. The idea of programming computer systems in
terms of mental states was articulated in Shoham (1993).
The first implementation of the agent-oriented programming paradigm was the
AGENT0 programming language. In this language, an agent is specified in terms
of a set of capabilities (things the agent can do), a set of initial beliefs, a set of
initial commitments, and a set of commitment rules. The key component, which
determines how the agent acts, is the commitment rule set. Each commitment
rule contains a message condition, a mental condition, and an action. In order to
determine whether such a rule fires, the message condition is matched against
the messages the agent has received; the mental condition is matched against the
beliefs of the agent. If the rule fires, then the agent becomes committed to the
action.
Actions in Agent0 may be private, corresponding to an internally executed sub-
routine, or communicative, i.e. sending messages. Messages are constrained to be
one of three types: 'requests' or 'unrequests' to perform or refrain from actions,
and 'inform' messages, which pass on information (in Chapter 8, we will see that
this style of communication is very common in multiagent systems). Request and
unrequest messages typically result in the agent's commitments being modified;
inform messages result in a change to the agent's beliefs.
Here is an example of an Agent0 commitment rule:

COMMIT(
  ( agent, REQUEST, DO(time, action)
  ),                              ;;; msg condition
  ( B,
    [now, Friend agent] AND
    CAN(self, action) AND
    NOT [time, CMT(self, anyaction)]
  ),                              ;;; mental condition
  self,
  DO(time, action)
)
This rule may be paraphrased as follows:

if I receive a message from agent which requests me to do action at time, and I believe that

agent is currently a friend;

I can do the action;

at time, I am not committed to doing any other action,

then commit to doing action at time.
The operation of an agent can be described by the following loop (see Figure 3.4).
(1) Read all current messages, updating beliefs - and hence commitments -
where necessary.
(2) Execute all commitments for the current cycle where the capability condition
of the associated action is satisfied.
Figure 3.4 The flow of control in Agent0.
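The following Python sketch imitates this decision cycle. The representation of messages and commitment rules is invented and much simplified relative to the real AGENT0 interpreter; it is intended only to show how message conditions, mental conditions, and commitments fit together.

def agent0_cycle(messages, beliefs, commitments, rules, capabilities, now):
    # step (1): read messages, updating beliefs and commitments
    for msg in messages:
        if msg["type"] == "inform":
            beliefs.add(msg["content"])
        for msg_cond, mental_cond, action in rules:
            if msg_cond(msg) and mental_cond(beliefs):
                commitments.append((msg["time"], action))
    # step (2): execute commitments due now whose capability condition is satisfied
    due = [a for (t, a) in commitments if t == now and a in capabilities]
    commitments[:] = [(t, a) for (t, a) in commitments
                      if not (t == now and a in capabilities)]
    return due

rules = [(
    lambda m: m["type"] == "request",            # message condition
    lambda b: "friend(agent1)" in b,             # mental condition
    "do_action",                                 # the action committed to
)]
beliefs = {"friend(agent1)"}
messages = [{"type": "request", "time": 1, "content": "do_action"}]
print(agent0_cycle(messages, beliefs, [], rules, {"do_action"}, now=1))   # ['do_action']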
It should be clear how more complex agent behaviours can be designed and built in Agent0. However, it is important to note that this language is essentially a prototype, not intended for building anything like large-scale production systems. But it does at least give a feel for how such systems might be built.
Concurrent MetateM

In Concurrent MetateM, each agent is programmed by giving it a temporal logic specification of the behaviour it should exhibit, which the agent then directly executes. The temporal operators used in such specifications include the following.

Operator   Meaning
○φ         φ is true 'tomorrow'
●φ         φ was true 'yesterday'
◇φ         at some time in the future, φ
□φ         always in the future, φ
◆φ         at some time in the past, φ
■φ         always in the past, φ
φ U ψ      φ will be true until ψ
φ S ψ      φ has been true since ψ
φ W ψ      φ is true unless ψ
φ Z ψ      φ is true zince ψ
For example,

□important(agents)

means 'it is now, and will always be true that agents are important'.

◇important(Janine)

means 'sometime in the future, Janine will be important'.

(¬friends(us)) U apologize(you)

means 'we are not friends until you apologize'.
Clearly, step (3) is the heart of the execution process. Making the wrong choice at this step may mean that the agent specification cannot subsequently be satisfied.
When a proposition in an agent becomes true, it is compared against that agent's
interface (see above); if it is one of the agent's component propositions, then that
proposition is broadcast as a message to all other agents. On receipt of a message,
each agent attempts to match the proposition against the environment proposi-
tions in their interface. If there is a match, then they add the proposition to their
history.
rp(ask1, ask2)[give1, give2]:
    ●ask1 ⇒ ◇give1;
    ●ask2 ⇒ ◇give2;
    start ⇒ □¬(give1 ∧ give2).

rc1(give1)[ask1]:
    start ⇒ ask1;
    ●ask1 ⇒ ask1.

rc2(ask1, give2)[ask2]:
    ●(ask1 ∧ ¬ask2) ⇒ ask2.

Figure 3.5 A simple Concurrent MetateM system.

Time   rp                     rc1             rc2
0.                            ask1
1.     ask1                   ask1            ask2
2.     ask1, ask2, give1      ask1
3.     ask1, give2            ask1, give1     ask2
4.     ask1, ask2, give1      ask1            give2
5.     ...                    ...             ...

Figure 3.6 An example run of Concurrent MetateM.
Figure 3.5 shows a simple system containing three agents: rp, rc1, and rc2. The agent rp is a 'resource producer': it can 'give' to only one agent at a time, and will commit to eventually give to any agent that asks. Agent rp will only accept messages ask1 and ask2, and can only send give1 and give2 messages. The interface of agent rc1 states that it will only accept give1 messages, and can only send ask1 messages. The rules for agent rc1 ensure that an ask1 message is sent on every cycle - this is because start is satisfied at the beginning of time, thus firing the first rule, so ●ask1 will be satisfied on the next cycle, thus firing the second rule, and so on. Thus rc1 asks for the resource on every cycle, using an ask1 message. The interface for agent rc2 states that it will accept both ask1 and give2 messages, and can send ask2 messages. The single rule for agent rc2 ensures that an ask2 message is sent on every cycle where, on its previous cycle, it did not send an ask2 message, but received an ask1 message (from agent rc1). Figure 3.6 shows a fragment of an example run of the system in Figure 3.5.
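To get a feel for how such a system behaves, the following Python sketch hand-codes the three agents' rules as ordinary conditions over what each agent observed on the previous cycle. It is meant only to reproduce the flavour of the run in Figure 3.6, not the actual Concurrent MetateM execution algorithm, and the broadcasting is simplified.

def rc1_rule(prev):
    # start => ask1; (yesterday ask1) => ask1, so rc1 asks on every cycle
    return {"ask1"}

def rc2_rule(prev):
    # (yesterday (ask1 and not ask2)) => ask2
    return {"ask2"} if ("ask1" in prev and "ask2" not in prev) else set()

def rp_rule(prev, pending):
    # (yesterday ask_i) => eventually give_i, but never give1 and give2 together
    if "ask1" in prev:
        pending.append("give1")
    if "ask2" in prev:
        pending.append("give2")
    return {pending.pop(0)} if pending else set()

def simulate(cycles=5):
    prev_rp, prev_rc1, prev_rc2 = set(), set(), set()
    pending = []
    for t in range(cycles):
        out_rp = rp_rule(prev_rp, pending)
        out_rc1 = rc1_rule(prev_rc1)
        out_rc2 = rc2_rule(prev_rc2)
        print(t, sorted(out_rp), sorted(out_rc1), sorted(out_rc2))
        # broadcast the new propositions, filtered by each agent's interface
        prev_rp = (out_rc1 | out_rc2) & {"ask1", "ask2"}
        prev_rc1 = out_rc1 | (out_rp & {"give1"})
        prev_rc2 = out_rc2 | (out_rc1 & {"ask1"}) | (out_rp & {"give2"})

simulate()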
Class reading: Shoham (1993). This is the article that introduced agent-oriented programming and, throughout the late 1990s, was one of the most cited articles in the agent community. The main point about the article, as far as I am concerned, is that it explicitly articulates the idea of programming systems in terms of 'mental states'. Agent0, the actual language described in the article, is not a language that you would be likely to use for developing 'real' systems. A useful discussion might be had on (i) whether 'mental states' are really useful in programming systems; (ii) how one might go about proving or disproving the hypothesis that mental states are useful in programming systems; and (iii) how Agent0-like features might be incorporated in a language such as Java.
Snow-White(ask)[give]:
    ●ask(x) ⇒ ◇give(x);
    give(x) ∧ give(y) ⇒ (x = y).

eager(give)[ask]:
    start ⇒ ask(eager);
    ●give(eager) ⇒ ask(eager).

greedy(give)[ask]:
    start ⇒ □ask(greedy).

courteous(give)[ask]:
    ((¬ask(courteous)) S give(eager)) ∧
    ((¬ask(courteous)) S give(greedy)) ⇒ ask(courteous).

shy(give)[ask]:
    start ⇒ ◇ask(shy);
    ●ask(x) ⇒ ¬ask(shy);
    ●give(shy) ⇒ ◇ask(shy).

Figure 3.7 Snow White in Concurrent MetateM.
Exercises
(1) [Level 2.] (The following few questions refer to the vacuum-world example.)
Give the full definition (using pseudo-code if desired) of the new function, which defines the predicates to add to the agent's database.
(2) [Level 2.]
Complete the vacuum-world example, by filling in the missing rules. How intuitive do
you think the solution is? How elegant is it? How compact is it?
(3) [Level 2.]
Try using your favourite (imperative) programming language to code a solution to
the basic vacuum-world example. How do you think it compares with the logical solu-
tion? What does this tell you about trying to encode essentially procedural knowledge
(i.e. knowledge about what action to perform) as purely logical rules?
(4) [Level 2.]
If you are familiar with Prolog, try encoding the vacuum-world example in this language and running it with randomly placed dirt. Make use of the assert and retract meta-
level predicates provided by Prolog to simplify your system (allowing the program itself
to achieve much of the operation of the next function).
(5) [Level 2.]
Try scaling the vacuum world up to a 10 x 10 grid size. Approximately how many rules
would you need to encode this enlarged example, using the approach presented above?
Try to generalize the rules, encoding a more general decision-making mechanism.
(6) [Level 3.]
Suppose that the vacuum world could also contain obstacles, which the agent needs
to avoid. (Imagine it is equipped with a sensor to detect such obstacles.) Try to adapt
the example to deal with obstacle detection and avoidance. Again, compare a logic-based
solution with one implemented in a traditional (imperative) programming language.
(7) [Level 3.]
Suppose the agent's sphere of perception in the vacuum world is enlarged, so that it
can see the whole of its world, and see exactly where the dirt lay. In this case, it would be
possible to generate an optimal decision-making algorithm - one which cleared up the dirt
in the smallest time possible. Try and think of such general algorithms, and try to code
them both in first-order logic and a more traditional programming language. Investigate
the effectiveness of these algorithms when there is the possibility of noise in the perceptual
input the agent receives (i.e. there is a non-zero probability that the perceptual information
is wrong), and try to develop decision-making algorithms that are robust in the presence
of such noise. How do such algorithms perform as the level of perception is reduced?
(8) [Level 2.]
Consider the Concurrent MetateM program in Figure 3.7. Explain the behaviour of the
agents in this system.
(9) [Level 4.]
Extend the Concurrent MetateM language by operators for referring to the beliefs and
commitments of other agents, in the style of Shoham's Agent0.
(10) [Level 4.]
Give a formal semantics to Agent0 and Concurrent MetateM.
Practical Reasoning Agents
Whatever the merits of agents that decide what to do by proving theorems, it
seems clear that we do not use purely logical reasoning in order to decide what
to do. Certainly something like logical reasoning can play a part, but a moment's
reflection should confirm that for most of the time, very different processes are
taking place. In this chapter, I will focus on a model of agency that takes its inspi-
ration from the processes that seem to take place as we decide what to do.
The second main property of intentions is that they persist. If I adopt an inten-
tion to become an academic, then I should persist with this intention and
attempt to achieve it. For if I immediately drop my intentions without devot-
ing any resources to achieving them, then I will not be acting rationally. Indeed,
you might be inclined to say that I never really had intentions in the first
place.
Of course, I should not persist with my intention for too long - if it becomes
clear to me that I will never become an academic, then it is only rational to drop
my intention to do so. Similarly, if the reason for having an intention goes away,
then it would be rational for me to drop the intention. For example, if I adopted
the intention to become an academic because I believed it would be an easy life,
but then discover that this is not the case (e.g. I might be expected to actually
teach!), then the justification for the intention is no longer present, and I should
drop the intention.
If I initially fail to achieve an intention, then you would expect me to try again -
you would not expect me to simply give up. For example, if my first application
for a PhD program is rejected, then you might expect me to apply to alternative
universities.
The third main property of intentions is that once I have adopted an intention,
the very fact of having this intention will constrain my future practical reasoning.
For example, while I hold some particular intention, I will not subsequently enter-
tain options that are inconsistent with that intention. Intending to write a book,
for example, would preclude the option of partying every night: the two are mutu-
ally exclusive. This is in fact a highly desirable property from the point of view
of implementing rational agents, because in providing a 'filter of admissibility',
intentions can be seen to constrain the space of possible intentions that an agent
needs to consider.
Finally, intentions are closely related to beliefs about the future. For exam-
ple, if I intend to become an academic, then I should believe that, assuming
some certain background conditions are satisfied, I will indeed become an aca-
demic. For if I truly believe that I will never be an academic, it would be non-
sensical of me to have an intention to become one. Thus if I intend to become
an academic, I should at least believe that there is a good chance I will indeed
become one. However, there is what appears at first sight to be a paradox here.
While I might believe that I will indeed succeed in achieving my intention, if I
am rational, then I must also recognize the possibility that I can fail to bring it
about - that there is some circumstance under which my intention is not satis-
fied.
Finally, the variable I represents the agent's intentions, and Int is the set of all
possible intentions.
In what follows, deliberation will be modelled via two functions:
an option generation function; and
a filtering function.
The signature of the option generation function options is as follows:

options : ℘(Bel) × ℘(Int) → ℘(Des).
This function takes the agent's current beliefs and current intentions, and on the
basis of these produces a set of possible options or desires.
In order to select between competing options, an agent uses a filter function.
Intuitively, the filter function must simply select the 'best' option(s) for the agent
to commit to. We represent the filter process through a function filter, with a signature as follows:

filter : ℘(Bel) × ℘(Des) × ℘(Int) → ℘(Int).
An agent's belief update process is modelled through a belief revision function:

brf : ℘(Bel) × Per → ℘(Bel).
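The fragment below is a minimal, invented Python rendering of these signatures, showing how a single deliberation step might be wired together; the function bodies are trivial placeholders rather than serious implementations.

# Beliefs, desires, and intentions are modelled as sets of strings.

def brf(beliefs, percept):
    """Belief revision: fold the new percept into the belief set."""
    return beliefs | {percept}

def options(beliefs, intentions):
    """Option generation: propose desires from beliefs and current intentions."""
    return {"become_academic"} if "phd_offer" in beliefs else set()

def filter_(beliefs, desires, intentions):
    """Deliberation: choose which desires to commit to as intentions."""
    return intentions | desires

beliefs = brf(set(), "phd_offer")
desires = options(beliefs, set())
intentions = filter_(beliefs, desires, set())
print(intentions)                                # {'become_academic'}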
[Figure: a planner takes as input a goal or intention, the state of the task environment, and the possible actions, and produces a plan.]
were a model of the world as a set of formulae of first-order logic, and a set of action schemata, which describe the preconditions and effects of all the actions available to the planning agent. This latter component has perhaps proved to be STRIPS' most lasting legacy in the AI planning community: nearly all imple-
mented planners employ the 'STRIPS formalism' for action, or some variant of
it. The STRIPS planning algorithm was based on a principle of finding the 'differ-
ence' between the current state of the world and the goal state, and reducing this
difference by applying an action. Unfortunately, this proved to be an inefficient
process for formulating plans, as STRIPS tended to become 'lost' in low-level plan
detail.
There is not scope in this book to give a detailed technical introduction to plan-
ning algorithms and technologies, and in fact it is probably not appropriate to do
so. Nevertheless, it is at least worth giving a short overview of the main concepts.
on first-order logic. I will use the predicates in Table 4.1 to represent the Blocks World.

Table 4.1 Predicates for describing the Blocks World.

Predicate     Meaning
On(x, y)      object x is on top of object y
OnTable(x)    object x is on the table
Clear(x)      nothing is on top of object x
Holding(x)    the robot arm is holding object x
ArmEmpty      the robot arm is not holding anything

A description of the Blocks World in Figure 4.2 using these predicates is as follows:

{Clear(A), On(A, B), OnTable(B), OnTable(C), Clear(C)}.

I am implicitly making use of the closed world assumption: if something is not explicitly stated to be true, then it is assumed false.
The next issue is how to represent goals. Again, we represent a goal as a set of formulae of first-order logic:

{OnTable(A), OnTable(B), OnTable(C)}.

So the goal is that all the blocks are on the table. To represent actions, we make
use of the precondition/delete/add list notation - the STRIPS formalism. In this
formalism, each action has
a name - which may have arguments;

a precondition list - a list of facts which must be true for the action to be executed;

a delete list - a list of facts that are no longer true after the action is performed; and

an add list - a list of facts made true by executing the action.
The stack action occurs when the robot arm places the object x it is holding on
top of object y:
Stack(x, y)
    pre  {Clear(y), Holding(x)}
    del  {Clear(y), Holding(x)}
    add  {ArmEmpty, On(x, y)}
The unstack action occurs when the robot arm picks an object x up from on
top of another object y:
UnStack(x, y)
    pre  {On(x, y), Clear(x), ArmEmpty}
    del  {On(x, y), ArmEmpty}
    add  {Holding(x), Clear(y)}
The pickup action occurs when the arm picks up an object x from the table:
Pickup(x)
    pre  {Clear(x), OnTable(x), ArmEmpty}
    del  {OnTable(x), ArmEmpty}
    add  {Holding(x)}
The putdown action occurs when the arm places the object x onto the table:
PutDown(x)
    pre  {Holding(x)}
    del  {Holding(x)}
    add  {ArmEmpty, OnTable(x)}
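The sketch below shows how a ground instance of such a descriptor (here an instance of Stack with x = A and y = B) might be represented in Python and applied to a belief set Δ: applicable() checks the precondition, and apply_op() computes (Δ \ Dα) ∪ Aα. The encoding is illustrative only.

# A descriptor is a triple of sets of ground atoms: (precondition, delete, add).
STACK_A_ON_B = (
    {"Clear(B)", "Holding(A)"},        # precondition list
    {"Clear(B)", "Holding(A)"},        # delete list
    {"ArmEmpty", "On(A,B)"},           # add list
)

def applicable(delta, op):
    pre, _, _ = op
    return pre <= delta                # all preconditions hold in the current beliefs

def apply_op(delta, op):
    _, delete, add = op
    return (delta - delete) | add      # progression through the action

delta = {"Clear(B)", "Holding(A)", "OnTable(B)", "OnTable(C)", "Clear(C)"}
if applicable(delta, STACK_A_ON_B):
    print(sorted(apply_op(delta, STACK_A_ON_B)))
# ['ArmEmpty', 'Clear(C)', 'On(A,B)', 'OnTable(B)', 'OnTable(C)']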
Let us now describe what is going on somewhat more formally. First, as we have throughout the book, we assume a fixed set of actions Ac = {α1, ..., αn} that the agent can perform. A descriptor for an action α ∈ Ac is a triple

⟨P_α, D_α, A_α⟩,

where

P_α is a set of formulae of first-order logic that characterize the precondition of action α;

D_α is a set of formulae of first-order logic that characterize those facts made false by the performance of α (the delete list); and

A_α is a set of formulae of first-order logic that characterize those facts made true by the performance of α (the add list).
For simplicity, we will assume that the precondition, delete, and add lists are
constrained to only contain ground atoms - individual predicates, which do not
contain logical connectives or variables.
A planning problem (over the set of actions Ac) is then determined by a triple

⟨Δ, O, γ⟩,

where

Δ is the beliefs of the agent about the initial state of the world - these beliefs will be a set of formulae of first-order logic (cf. the vacuum world in Chapter 3);

O = {⟨P_α, D_α, A_α⟩ | α ∈ Ac} is an indexed set of operator descriptors, one for each available action α; and

γ is a set of formulae of first-order logic, representing the goal/task/intention to be achieved.

A plan π is a sequence of actions

π = (α1, ..., αn),

where each αi is a member of Ac. A plan π determines a sequence of n + 1 databases Δ0, Δ1, ..., Δn, where

Δ0 = Δ

and

Δi = (Δ_{i-1} \ D_{αi}) ∪ A_{αi}   for 1 ≤ i ≤ n.
if π is a plan, then we write pre(π) to denote the precondition of π, and body(π) to denote the body of π;

if π is a plan, then we write empty(π) to mean that plan π is the empty sequence (thus empty(...) is a Boolean-valued function);

execute(...) is a procedure that takes as input a single plan and executes it without stopping - executing a plan simply means executing each action in the plan body in turn;

if π is a plan, then by head(π) we mean the plan made up of the first action in the plan body of π; for example, if the body of π is α1, ..., αn, then the body of head(π) contains only the action α1;

if π is a plan, then by tail(π) we mean the plan made up of all but the first action in the plan body of π; for example, if the body of π is α1, α2, ..., αn, then the body of tail(π) contains actions α2, ..., αn;

if π is a plan, I ⊆ Int is a set of intentions, and B ⊆ Bel is a set of beliefs, then we write sound(π, I, B) to mean that π is a correct plan for intentions I given beliefs B (Lifschitz, 1986).
An agent's means-ends reasoning capability is represented by a function

plan : ℘(Bel) × ℘(Int) × ℘(Ac) → Plan,

which, on the basis of an agent's current beliefs and current intentions, determines a plan to achieve the intentions.
Notice that there is nothing in the definition of the plan(...) function which requires an agent to engage in plan generation - constructing a plan from scratch (Allen et al., 1990). In many implemented practical reasoning agents, the plan(...) function is implemented by giving the agent a plan library (Georgeff and Lansky, 1987). A plan library is a pre-assembled collection of plans, which an agent designer gives to an agent. Finding a plan to achieve an intention then simply involves a single pass through the plan library to find a plan that, when executed, will have the intention as a postcondition, and will be sound given the agent's current beliefs. Preconditions and postconditions for plans are often represented as (lists of) atoms of first-order logic, and beliefs and intentions as ground atoms of first-order logic. Finding a plan to achieve an intention then reduces to finding a plan whose precondition unifies with the agent's beliefs, and whose postcondition unifies with the intention. At the end of this chapter, we will see how this idea works in the PRS system.
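A minimal sketch of such a plan-library lookup is given below. Matching is done by simple set inclusion over ground atoms rather than first-order unification, and the single library entry is invented for illustration.

# Each library plan is (precondition, postcondition, body).
PLAN_LIBRARY = [
    ({"Clear(A)", "On(A,B)", "ArmEmpty"}, {"OnTable(A)"}, ["UnStack(A,B)", "PutDown(A)"]),
]

def plan(beliefs, intentions, plan_library):
    for pre, post, body in plan_library:
        if pre <= beliefs and intentions <= post:
            return body
    return None                       # no suitable plan in the library

print(plan({"Clear(A)", "On(A,B)", "ArmEmpty"}, {"OnTable(A)"}, PLAN_LIBRARY))
# ['UnStack(A,B)', 'PutDown(A)']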
The mechanism an agent uses to determine when and how to drop intentions is known as a commitment strategy. The following three commitment strategies are commonly discussed in the literature of rational agents (Rao and Georgeff, 1991b).
Blind commitment. A blindly committed agent will continue to maintain an inten-
tion until it believes the intention has actually been achieved. Blind commitment
is also sometimes referred to as fanatical commitment.
Single-minded commitment. A single-minded agent will continue to maintain an
intention until it believes that either the intention has been achieved, or else
that it is no longer possible to achieve the intention.
Open-minded commitment. An open-minded agent will maintain an intention as
long as it is still believed possible.
Note that an agent has commitment both to ends (i.e. the state of affairs it wishes
to bring about) and means (i.e. the mechanism via which the agent wishes to achieve the state of affairs).
With respect to commitment to means (i.e. plans), the solution adopted in Fig-
ure 4.3 is as follows. An agent will maintain a commitment to an intention until
(i) it believes the intention has succeeded; (ii) it believes the intention is impossible; or (iii) there is nothing left to execute in the plan. This is single-minded commitment. I write succeeded(I, B) to mean that given beliefs B, the intentions I can be regarded as having been satisfied. Similarly, we write impossible(I, B)
to mean that intentions I are impossible given beliefs B. The main loop, capturing
this commitment to means, is in lines (10)-(23).
How about commitment to ends? When should an agent stop to reconsider its
intentions? One possibility is to reconsider intentions at every opportunity - in
particular, after executing every possible action. If option generation and filtering
were computationally cheap processes, then this would be an acceptable strat-
egy. Unfortunately, we know that deliberation is not cheap - it takes a consider-
able amount of time. While the agent is deliberating, the environment in which
the agent is working is changing, possibly rendering its newly formed intentions
irrelevant.
We are thus presented with a dilemma:
an agent that does not stop to reconsider its intentions sufficiently often
will continue attempting to achieve its intentions even after it is clear that
they cannot be achieved, or that there is no longer any reason for achieving
them;
an agent that constantly reconsiders its intentions may spend insufficient
time actually working to achieve them, and hence runs the risk of never
actually achieving them.
There is clearly a trade-off to be struck between the degree of commitment and
reconsideration at work here. To try to capture this trade-off, Figure 4.3 incor-
porates an explicit meta-level control component. The idea is to have a Boolean-valued function, reconsider, such that reconsider(I, B) evaluates to 'true' just in case it is appropriate for the agent with beliefs B and intentions I to reconsider its intentions. Deciding whether to reconsider intentions thus falls to this
function.
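Putting these pieces together, the following skeleton (a Python paraphrase of the control loop of Figure 4.3, not a reproduction of it; all helper functions are assumed to be supplied elsewhere) shows where single-minded commitment to means and the reconsider(...) test for commitment to ends sit in the agent's main loop.

# Skeleton of a practical reasoning agent's control loop (runs forever).
# The helper functions - see, brf, options, filter, plan, succeeded,
# impossible, sound, reconsider, execute, head, tail, empty - are assumed
# to be supplied by the agent designer.

def agent_loop(B, I, environment, f):
    while True:
        p = f.see(environment)                  # observe the world
        B = f.brf(B, p)                         # belief revision
        D = f.options(B, I)                     # deliberation: generate options...
        I = f.filter(B, D, I)                   # ...and filter them to intentions
        pi = f.plan(B, I)                       # means-ends reasoning
        # Single-minded commitment to means: keep executing the plan until
        # it is empty, the intentions are achieved, or they become impossible.
        while not (f.empty(pi) or f.succeeded(I, B) or f.impossible(I, B)):
            f.execute(f.head(pi))
            pi = f.tail(pi)
            p = f.see(environment)
            B = f.brf(B, p)
            # Commitment to ends: only deliberate again when the meta-level
            # control function judges it worthwhile.
            if f.reconsider(I, B):
                D = f.options(B, I)
                I = f.filter(B, D, I)
            # If the current plan is no longer sound for the (possibly new)
            # intentions, replan.
            if not f.sound(pi, I, B):
                pi = f.plan(B, I)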
It is interesting to consider the circumstances under which this function can
be said to behave optimally. Suppose that the agent's deliberation and plan gen-
eration functions are in some sense perfect: that deliberation always chooses the
'best' intentions (however that is defined for the application at hand), and planning
always produces an appropriate plan. Further suppose that time expended always
has a cost - the agent does not benefit by doing nothing. Then it is not difficult to see that the function reconsider(...) will be behaving optimally if, and only
if, whenever it chooses to deliberate, the agent changes intentions (Wooldridge
and Parsons, 1999). For if the agent chose to deliberate but did not change inten-
tions, then the effort expended on deliberation was wasted. Similarly, if an agent
Table 4.2 Practical reasoning situations (cf. Bratman et al., 1988, p. 353).

Situation   Chose to      Changed       Would have changed   reconsider(...)
            deliberate?   intentions?   intentions?          optimal?
1           No            -             No                   Yes
2           No            -             Yes                  No
3           Yes           No            -                    No
4           Yes           Yes           -                    Yes
should have changed intentions, but failed to do so, then the effort expended on
attempting to achieve its intentions was also wasted.
The possible interactions between deliberation and meta-level control (the function reconsider(...)) are summarized in Table 4.2.
In situation (1), the agent did not choose to deliberate, and as a consequence, did not choose to change intentions. Moreover, if it had chosen to deliberate, it would not have changed intentions. In this situation, the reconsider(...) function is behaving optimally.
In situation (2), the agent did not choose to deliberate, but if it had done so, it would have changed intentions. In this situation, the reconsider(...) function is not behaving optimally.
In situation (3), the agent chose to deliberate, but did not change intentions. In this situation, the reconsider(...) function is not behaving optimally.
In situation (4), the agent chose to deliberate, and did change intentions. In this situation, the reconsider(...) function is behaving optimally.
Notice that there is an important assumption implicit within this discussion: that
the cost of executing the reconsider(...) function is much less than the cost of the deliberation process itself. Otherwise, the reconsider(...) function could
simply use the deliberation process as an oracle, running it as a subroutine and
choosing to deliberate just in case the deliberation process changed intentions.
The nature of the trade-off was examined by David Kinny and Michael Georgeff
in a number of experiments carried out using a BDI agent system (Kinny and
Georgeff, 1991). The aims of Kinny and Georgeff's investigation were to
(1) assess the feasibility of experimentally measuring agent effectiveness in a simulated environment, (2) investigate how commitment to goals contributes to effective agent behaviour, and (3) compare the
properties of different strategies for reacting to change.
(Kinny and Georgeff, 1991, p. 82)
Kinny and Georgeff's experiments distinguished two kinds of agent: bold agents, which never stop to reconsider their intentions before their current plan is fully executed, and cautious agents, which stop to
reconsider after the execution of every action. These characteristics are defined
by a degree of boldness, which specifies the maximum number of plan steps the
agent executes before reconsidering its intentions. Dynamism in the environment
is represented by the rate of environment change. Put simply, the rate of environ-
ment change is the ratio of the speed of the agent's control loop to the rate of
change of the environment. If the rate of world change is 1, then the environment
will change no more than once for each time the agent can execute its control
loop. If the rate of world change is 2, then the environment can change twice for
each pass through the agent's control loop, and so on. The performance of an
agent is measured by the ratio of number of intentions that the agent managed
to achieve to the number of intentions that the agent had at any time. Thus if
effectiveness is 1, then the agent achieved all its intentions. If effectiveness is 0,
then the agent failed to achieve any of its intentions. The key results of Kinny and
Georgeff were as follows.
If the rate of world change is low (i.e. the environment does not change
quickly), then bold agents do well compared with cautious ones. This is
because cautious ones waste time reconsidering their commitments while
bold agents are busy working towards - and achieving - their intentions.
If the rate of world change is high (i.e. the environment changes frequently),
then cautious agents tend to outperform bold agents. This is because they
are able to recognize when intentions are doomed, and also to take advan-
tage of serendipitous situations and new opportunities when they arise.
The bottom line is that different environment types require different intention
reconsideration and commitment strategies. In static environments, agents that
are strongly committed to their intentions will perform well. But in dynamic envi-
ronments, the ability to react to changes by modifying intentions becomes more
important, and weakly committed agents will tend to outperform bold agents.
[Figure 4.4, showing the architecture of HOMER, appears here in the original; its labelled components include an interpreter, a planner, a reflective memory, and an interface to a human director.]
HOMER takes instructions from a user in a restricted subset of English, plans how to achieve them in its simulated two-dimensional 'Seaworld', and can then execute its plans, modifying them as required during execution. The agent has a limited episodic memory, and using this, is able to answer
questions about its past experiences. The architecture of HOMER is illustrated in
Figure 4.4.
Here is an example of a dialogue with HOMER: the user begins by asking HOMER
what he sees in front of him.
USER: What is in front of you?
HOMER: A log.
USER: Do you own the log?
HOMER: No, I don't.
The user then tells HOMER that he owns the log; this is new information, and so
HOMER expresses surprise.
USER: The log belongs to you.
HOMER: Oh.
At this point, the user tells HOMER to move, so that he can no longer see the log,
and moves the log while it is hidden from HOMER'S sight. When he turns around,
HOMER expects to see the log, and expresses surprise.
USER: Turn away from your log.
HOMER: OK, I am turning.
[LOG IS MOVED]
USER: Turn around.
HOMER: OK, I'm turning.
HOMER: I've lost the log!
In addition to ordinary plans, a PRS agent may have meta-level plans: plans about plans. They are able to modify an agent's intention structures at run-
time, in order to change the focus of the agent's practical reasoning. However, a
simpler method is to use utilities for plans. These are numerical values; the agent
simply chooses the plan that has the highest utility.
The chosen plan is then executed in its turn; this may involve pushing further goals onto the intention stack, which may then in turn involve finding more plans to achieve these goals, and so on. The process bottoms out with individual actions that may be directly computed (e.g. simple numerical calculations). If a particular
plan to achieve a goal fails, then the agent is able to select another plan to achieve
this goal from the set of all candidate plans.
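The goal-stack behaviour just described can be sketched as follows (an illustrative Python fragment, not the PRS or Jam source; plans, goals, and actions are simplified to ground strings, and plan failure is not handled).

# Sketch of a PRS-style interpreter: goals are pushed onto an intention
# stack, plans are selected by matching goal and context against beliefs,
# and plan bodies push further subgoals. Highly simplified.

class Plan:
    def __init__(self, goal, context, body, utility=0):
        self.goal, self.context = goal, set(context)
        self.body, self.utility = list(body), utility

def run(top_goal, beliefs, plans, primitive):
    stack = [top_goal]
    while stack:
        goal = stack.pop()
        if goal in beliefs:                        # already true (a FACT)
            continue
        if goal in primitive:                      # directly executable action
            beliefs |= primitive[goal]             # apply the action's effects
            continue
        options = [p for p in plans
                   if p.goal == goal and p.context <= beliefs]
        if not options:
            raise RuntimeError(f"no applicable plan for {goal}")
        plan = max(options, key=lambda p: p.utility)   # highest utility wins
        stack.extend(reversed(plan.body))              # push subgoals in order
    return beliefs

plans = [Plan("On(B,C)", {"Clear(B)", "Clear(C)"}, ["move(B,C)"], utility=10)]
primitive = {"move(B,C)": {"On(B,C)"}}                 # action -> effects added
print(run("On(B,C)", {"Clear(B)", "Clear(C)"}, plans, primitive))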
To illustrate all this, Figure 4.6 shows a fragment of a Jam system (Huber, 1999).
Jam is a second-generation descendant of the PRS, implemented in Java. The basic
ideas are identical. The top level goal for this system, which is another Blocks World example, is to have achieved the goal blocks-stacked. The initial beliefs of the agent are spelled out in the FACTS section. Expressed in conventional logic notation, the first of these is On(Block5, Block4), i.e. 'block 5 is on top of block 4'.
The system starts by pushing the goal blocks-stacked onto the intention stack. The agent must then find a candidate plan for this; there is just one plan that has this goal as a GOAL: the 'top level plan'. The context of this plan is empty, that is to say, true, and so this plan can be directly executed. Executing the body
of the plan involves pushing the following goal onto the intention stack:
On(block3, table).
This is immediately achieved, as it is a FACT. The second sub-goal is then posted: On(block2, block3).
To achieve this, the 'stack blocks that are clear' plan is used; the first sub-goals
involve clearing both block2 and block3, which in turn will be done by two invo-
cations of the 'clear a block' plan. When this is done, the move action is directly
invoked to move block2 onto block3.
I leave the detailed behaviour as an exercise.
GOALS:
  ACHIEVE blocks-stacked;

FACTS:
  FACT ON "Block5" "Block4";   FACT ON "Block4" "Block3";
  FACT ON "Block1" "Block2";   FACT ON "Block2" "Table";
  FACT ON "Block3" "Table";    FACT CLEAR "Block1";
  FACT CLEAR "Block5";         FACT CLEAR "Table";

Plan: {
  NAME:    "Top-level plan"
  GOAL:    ACHIEVE blocks-stacked;
  CONTEXT:
  BODY:    ACHIEVE ON "Block3" "Table";
           ACHIEVE ON "Block2" "Block3";
           ACHIEVE ON "Block1" "Block2";
}

Plan: {
  NAME:    "Stack blocks that are already clear"
  GOAL:    ACHIEVE ON $OBJ1 $OBJ2;
  CONTEXT:
  BODY:    ACHIEVE CLEAR $OBJ1;
           ACHIEVE CLEAR $OBJ2;
           PERFORM move $OBJ1 $OBJ2;
  UTILITY: 10;
  FAILURE: EXECUTE print "\n\nStack blocks failed!\n\n";
}

Plan: {
  NAME:    "Clear a block"
  GOAL:    ACHIEVE CLEAR $OBJ;
  CONTEXT: FACT ON $OBJ2 $OBJ;
  BODY:    ACHIEVE ON $OBJ2 "Table";
  EFFECTS: RETRACT ON $OBJ2 $OBJ;
  FAILURE: EXECUTE print "\n\nClearing block failed!\n\n";
}
the role of intentions in practical reasoning. The conceptual framework of the BDI
model is described in Bratman et al. (1988), which also describes a specific BDI
agent architecture called IRMA.
The best-known implementation of the BDI model is the PRS system, developed
by Georgeff and colleagues (Georgeff and Lansky, 1987; Georgeff and Ingrand,
1989). The PRS has been re-implemented several times since the mid-1980s, for
example in the Australian AI Institute's dMARS system (d'Inverno et al., 1997), the University of Michigan's C++ implementation UM-PRS, and a Java version called Jam (Huber, 1999).
Exercises
(1) [Level 1.]
Imagine a mobile robot, capable of moving around an office environment. Ultimately,
this robot must be controlled by very low-level instructions along the lines of 'motor on',
and so on. How easy would it be to develop STRIPS operators to represent these properties?
Try it.
(2) [Level 2.]
Recall the vacuum-world example discussed in the preceding chapter. Formulate the
operations available to the agent using the STRIPS notation.
(3) [Level 2.]
Consider an agent that must move from one location to another, collecting items from
one site and moving them. The agent is able to move by taxi, bus, bicycle, or car.
Formalize the operations available to the agent (move by taxi, move by car, etc.) using
the STRIPS notation. (Hint: preconditions might be having money or energy.)
(4) [Level 3.]
Read Kinny and Georgeff (1991), and implement these experiments in the programming language of your choice. (This is not as difficult as it sounds: it should be possible in a couple of days at most.) Now carry out the experiments described in Kinny and Georgeff (1991) and see if you get the same results.
(5) [Level 3.]
Building on the previous question, investigate the following.
The effect of reducing perceptual capabilities on agent performance. The idea here
is to reduce the amount of environment that the agent can see, until it can finally see only
the grid square on which it is located. Can 'free' planning compensate for the inability
to see very far?
The effect of non-deterministic actions. If actions are allowed to become non-determin-
istic (so that in attempting to move from one grid square to another, there is a certain
probability that the agent will in fact move to an entirely different grid square), what
effect does this have on the effectiveness of an agent?
Reactive and
Hybrid Agents
The many problems with symbolic/logical approaches to building agents led some
researchers to question, and ultimately reject, the assumptions upon which such
approaches are based. These researchers have argued that minor changes to the
symbolic approach, such as weakening the logical representation language, will
not be sufficient to build agents that can operate in time-constrained environ-
ments: nothing less than a whole new approach is required. In the mid to late 1980s, these researchers began to investigate alternatives to the symbolic AI
paradigm. It is difficult to neatly characterize these different approaches, since
their advocates are united mainly by a rejection of symbolic AI, rather than by a
common manifesto. However, certain themes do recur:
the rejection of symbolic representations, and of decision making based on
syntactic manipulation of such representations;
the idea that intelligent, rational behaviour is seen as innately linked to the
environment an agent occupies - intelligent behaviour is not disembodied,
but is a product of the interaction the agent maintains with its environment;
the idea that intelligent behaviour emerges from the interaction of various
simpler behaviours.
Alternative approaches to agency are sometimes referred to as behavioural (since
a common theme is that of developing and combining individual behaviours), sit-
uated (since a common theme is that of agents actually situated in some environ-
ment, rather than being disembodied from it), and finally - the term used in this
90 Reactive and Hybrid Agents
chapter - reactive (because such systems are often perceived as simply reacting
to an environment, without reasoning about it).
Function: Action Selection in the Subsumption Architecture

1.  function action(p : P) : A
2.    var fired : ℘(R)
3.    var selected : A
4.    begin
5.      fired := {(c, a) | (c, a) ∈ R and p ∈ c}
6.      for each (c, a) ∈ fired do
7.        if ¬(∃(c′, a′) ∈ fired such that (c′, a′) ≺ (c, a)) then
8.          return a
9.        end-if
10.     end-for
11.     return null
12. end function action
Figure 5.1 Action Selection in the subsumption architecture.
behaviours arranged into layers. Lower layers in the hierarchy are able to inhibit higher layers: the lower a layer is, the higher is its priority. The idea is that higher layers represent more abstract behaviours. For example, in a mobile robot one might desire a behaviour for avoiding obstacles. It makes sense to give obstacle avoidance a high priority - hence this behaviour will typically be encoded in a low-level layer, which has high priority. To illustrate the subsumption
architecture in more detail, we will now present a simple formal model of it, and
illustrate how it works by means of a short example. We then discuss its relative
advantages and shortcomings, and point at other similar reactive architectures.
The see function, which represents the agent's perceptual ability, is assumed to
remain unchanged. However, in implemented subsumption architecture systems,
there is assumed to be quite tight coupling between perception and action - raw
sensor input is not processed or transformed much, and there is certainly no
attempt to transform images to symbolic representations.
The decision function action is realized through a set of behaviours, together with an inhibition relation holding between these behaviours. A behaviour is a pair (c, a), where c ⊆ P is a set of percepts called the condition, and a ∈ A is an action. A behaviour (c, a) will fire when the environment is in state s ∈ S if and only if see(s) ∈ c. Let Beh = {(c, a) | c ⊆ P and a ∈ A} be the set of all such rules.
Associated with an agent's set of behaviour rules R ⊆ Beh is a binary inhibition relation on the set of behaviours: ≺ ⊆ R × R. This relation is assumed to be a strict total ordering on R (i.e. it is transitive, irreflexive, and antisymmetric). We write b1 ≺ b2 if (b1, b2) ∈ ≺, and read this as 'b1 inhibits b2', that is, b1 is lower in the hierarchy than b2, and will hence get priority over b2. The action function is then as shown in Figure 5.1.
Thus action selection begins by first computing the set fired of all behaviours that fire (5). Then, each behaviour (c, a) that fires is checked, to determine whether there is some other higher priority behaviour that fires. If not, then the action part of the behaviour, a, is returned as the selected action (8). If no behaviour fires,
then the distinguished action null will be returned, indicating that no action has
been chosen.
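The action-selection scheme of Figure 5.1 is easy to render in code. The sketch below (Python, illustrative; the percepts, actions, and priority ordering are invented, loosely in the spirit of the Mars explorer example discussed later) represents each behaviour as a condition/action pair and represents the inhibition ordering implicitly by list position, with earlier entries having higher priority. One liberalization relative to the figure: a percept is treated here as a set of observed features, and a condition fires when all of its features are present, rather than using the single-percept membership test p ∈ c.

# Sketch of subsumption-style action selection. Behaviours are
# (condition, action) pairs; the list order stands in for the strict
# total ordering, so behaviours[0] inhibits everything below it.

def action(percepts, behaviours):
    fired = [(cond, act) for (cond, act) in behaviours
             if cond <= percepts]              # behaviour fires if all of its
    return fired[0][1] if fired else None      # condition features are present;
                                               # pick the highest-priority firer

# Illustrative behaviours, highest priority first.
behaviours = [
    (frozenset({"detect_obstacle"}),              "change_direction"),
    (frozenset({"carrying_samples", "at_base"}),  "drop_samples"),
    (frozenset({"carrying_samples"}),             "travel_up_gradient"),
    (frozenset({"detect_sample"}),                "pick_up_sample"),
    (frozenset(),                                 "move_randomly"),   # default
]

print(action(frozenset({"detect_sample"}), behaviours))               # pick_up_sample
print(action(frozenset({"detect_sample", "detect_obstacle"}),
             behaviours))                                              # change_direction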
Given that one of our main concerns with logic-based decision making was its theoretical complexity, it is worth pausing to examine how well our simple behaviour-based system performs. The overall time complexity of the subsumption action function is no worse than O(n²), where n is the larger of the number of behaviours or number of percepts. Thus, even with the naive algorithm above, decision making is tractable. In practice, we can do much better than this: the decision-making logic can be encoded into hardware, giving constant decision time. For modern hardware, this means that an agent can be guaranteed to select
an action within microseconds. Perhaps more than anything else, this computa-
tional simplicity is the strength of the subsumption architecture.
The subsumption hierarchy for this example ensures that, for example, an agent will always turn if any obstacles are detected; if the agent is at the mother ship and is carrying samples, then it will always drop them if it is not in any immediate danger of crashing, and so on. The 'top level' behaviour - a random walk - will only ever be carried out if the agent has nothing more urgent to do. It is not difficult to see how this simple set of behaviours will solve the problem: agents will search for samples (ultimately by searching randomly), and when they find them, will return them to the mother ship.
If the samples are distributed across the terrain entirely at random, then equip-
ping a large number of robots with these very simple behaviours will work
extremely well. But we know from the problem specification, above, that this is
not the case: the samples tend to be located in clusters. In this case, it makes
sense to have agents cooperate with one another in order to find the samples.
Thus when one agent finds a large sample, it would be helpful for it to communi-
cate this to the other agents, so they can help it collect the rocks. Unfortunately,
we also know from the problem specification that direct communication is impos-
sible. Steels developed a simple solution to this problem, partly inspired by the
foraging behaviour of ants. The idea revolves around an agent creating a 'trail'
of radioactive crumbs whenever it finds a rock sample. The trail will be created
when the agent returns the rock samples to the mother ship. If at some later point,
another agent comes across this trail, then it need only follow it down the gradient
field to locate the source of the rock samples. Some small refinements improve
the efficiency of this ingenious scheme still further. First, as an agent follows a
trail to the rock sample source, it picks up some of the crumbs it finds, hence
making the trail fainter. Secondly, the trail is only laid by agents returning to the
mother ship. Hence if an agent follows the trail out to the source of the nominal
rock sample only to find that it contains no samples, it will reduce the trail on the
way out, and will not return with samples to reinforce it. After a few agents have
followed the trail to find no sample at the end of it, the trail will in fact have been
removed.
The modified behaviours for this example are as follows. Obstacle avoidance
(5.1)remains unchanged. However, the two rules determining what to do if carry-
ing a sample are modified as follows:
if carrying samples and at the base then drop samples; (5.6)
if carrying samples and not at the base
then drop 2 crumbs and travel up gradient. (5.7)
The behaviour (5.7) requires an agent to drop crumbs when returning to base
with a sample, thus either reinforcing or creating a trail. The 'pick up sample'
behaviour (5.4)remains unchanged. However, an additional behaviour is required
for dealing with crumbs:
if sense crumbs then pick up 1 crumb and travel down gradient. (5.8)
Finally, the random movement behaviour (5.5) remains unchanged. These behaviours are then arranged into the following subsumption hierarchy:
he reported the theoretical difficulties with planning described above, and was
coming to similar conclusions about the inadequacies of the symbolic AI model
himself. Together with his co-worker Agre, he began to explore alternatives to the
AI planning paradigm (Chapman and Agre, 1986).
Agre observed that most everyday activity is 'routine' in the sense that it
requires little - if any - new abstract reasoning. Most tasks, once learned, can
be accomplished in a routine way, with little variation. Agre proposed that an
efficient agent architecture could be based on the idea of 'running arguments'.
Crudely, the idea is that as most decisions are routine, they can be encoded into a
low-level structure (such as a digital circuit), which only needs periodic updating, perhaps to handle new kinds of problems. His approach was illustrated with the celebrated PENGI system (Agre and Chapman, 1987). PENGI is a simulated com-
puter game, with the central character controlled using a scheme such as that
outlined above.
An agent is specified in terms of two components: perception and action. Two pro-
grams are then used to synthesize agents: RULER is used to specify the perception
component of an agent; GAPPS is used to specify the action component.
RULER takes as its input three components as follows.
[A] specification of the semantics of the [agent's] inputs ('whenever
bit 1 is on, it is raining'); a set of static facts ('whenever it is raining,
the ground is wet'); and a specification of the state transitions of the
world ('if the ground is wet, it stays wet until the sun comes out'). The
programmer then specifies the desired semantics for the output ('if this
bit is on, the ground is wet'), and the compiler. . .[synthesizes] a circuit
whose output will have the correct semantics.... All that declarative
'knowledge' has been reduced to a very simple circuit.
(Kaelbling, 1991, p. 86)
The GAPPS program takes as its input a set of goal reduction rules (essentially
rules that encode information about how goals can be achieved) and a top level
goal, and generates a program that can be translated into a digital circuit in order
to realize the goal. Once again, the generated circuit does not represent or manip-
ulate symbolic expressions; all symbolic manipulation is done at compile time.
The situated automata paradigm has attracted much interest, as it appears to
combine the best elements of both reactive and symbolic declarative systems.
However, at the time of writing, the theoretical limitations of the approach are
not well understood; there are similarities with the automatic synthesis of pro-
grams from temporal logic specifications, a complex area of much ongoing work
in mainstream computer science (see the comments in Emerson (1990)).
[Figure 5.2, showing information and control flows in layered agent architectures (layers labelled Layer 1 to Layer n, with perceptual input and action output), appears here in the original: (a) horizontal layering; (b) and (c) vertical layering.]
Typically, there will be at least two layers, to deal with reactive and proactive
behaviours, respectively. In principle, there is no reason why there should not be
many more layers. It is useful to characterize such architectures in terms of the
information and control flows within the layers. Broadly speaking, we can identify two types of control flow within layered architectures as follows (see Figure 5.2).
Horizontal layering. In horizontally layered architectures (Figure 5.2(a)), the soft-
ware layers are each directly connected to the sensory input and action output.
In effect, each layer itself acts like an agent, producing suggestions as to what
action to perform.
Vertical layering. In vertically layered architectures (see parts (b) and (c) of Fig-
ure 5.2), sensory input and action output are each dealt with by at most one
layer.
The great advantage of horizontally layered architectures is their conceptual simplicity: if we need an agent to exhibit n different types of behaviour, then we
implement n different layers. However, because the layers are each in effect com-
peting with one another to generate action suggestions, there is a danger that the
overall behaviour of the agent will not be coherent. In order to ensure that hor-
izontally layered architectures are consistent, they generally include a mediator
function, which makes decisions about which layer has 'control' of the agent at
any given time. The need for such central control is problematic: it means that
the designer must potentially consider all possible interactions between layers.
If there are n layers in the architecture, and each layer is capable of suggesting m possible actions, then this means there are mⁿ such interactions to be considered. This is clearly difficult from a design point of view in any but the most simple
system. The introduction of a central control system also introduces a bottleneck
into the agent's decision making.
[Figure 5.3, showing the TouringMachines architecture, appears here in the original: a perception subsystem and an action subsystem connected to the architecture's control layers (including the planning and reactive layers), mediated by a control subsystem.]
TouringMachines
The TouringMachines architecture is illustrated in Figure 5.3. As this figure shows,
TouringMachines consists of three activity producing layers. That is, each layer
continually produces 'suggestions' for what actions the agent should perform.
rules can either suppress sensor information between the sensors and the control layers, or else censor action outputs from the control layers. Here is an
example censor rule (Ferguson, 1995, p. 207):
censor-rule-1:
  if
    entity(obstacle-6) in perception-buffer
  then
    remove-sensory-record(layer-R, entity(obstacle-6))
This rule prevents the reactive layer from ever knowing about whether obstacle-6 has been perceived. The intuition is that although the reactive layer
will in general be the most appropriate layer for dealing with obstacle avoidance,
there are certain obstacles for which other layers are more appropriate. This rule
ensures that the reactive layer never comes to know about these obstacles.
InteRRaP
InteRRaP is an example of a vertically layered two-pass agent architecture - see
Figure 5.4. As Figure 5.4 shows, InteRRaP contains three control layers, as in
TouringMachines. Moreover, the purpose of each InteRRaP layer appears to be
rather similar to the purpose of each corresponding TouringMachines layer. Thus the lowest (behaviour-based) layer deals with reactive behaviour; the middle (local planning) layer deals with everyday planning to achieve the agent's goals, and the
uppermost (cooperative planning) layer deals with social interactions. Each layer
has associated with it a knowledge base, i.e. a representation of the world appro-
priate for that layer. These different knowledge bases represent the agent and
its environment at different levels of abstraction. Thus the highest level knowl-
edge base represents the plans and actions of other agents in the environment;
the middle-level knowledge base represents the plans and actions of the agent
itself; and the lowest level knowledge base represents 'raw' information about the
environment. The explicit introduction of these knowledge bases distinguishes
TouringMachines from InteRRaP.
The way the different layers in InteRRaP conspire to produce behaviour is also quite different from TouringMachines. The main difference is in the way the layers interact with the environment. In TouringMachines, each layer was directly cou-
pled to perceptual input and action output. This necessitated the introduction
of a supervisory control framework, to deal with conflicts or problems between
layers. In InteRRaP, layers interact with each other to achieve the same end. The
two main types of interaction between layers are bottom-up activation and top-
down execution. Bottom-up activation occurs when a lower layer passes control to
a higher layer because it is not competent to deal with the current situation. Top-
down execution occurs when a higher layer makes use of the facilities provided
by a lower layer to achieve one of its goals. The basic flow of control in InteRRaP
begins when perceptual input arrives at the lowest layer in the architecture. If the
reactive layer can deal with this input, then it will do so; otherwise, bottom-up
activation will occur, and control will be passed to the local planning layer. If the
local planning layer can handle the situation, then it will do so, typically by mak-
ing use of top-down execution. Otherwise, it will use bottom-up activation to pass
control to the highest layer. In this way, control in InteRRaP will flow from the lowest layer to higher layers of the architecture, and then back down again.
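The bottom-up activation and top-down execution regime can be sketched abstractly as follows (an illustrative Python fragment, not InteRRaP itself; each layer is reduced to a competence test and a handler, and the example layers and situations are invented).

# Sketch of InteRRaP-style control flow: input enters at the lowest layer;
# a layer that is not competent to deal with the situation passes control
# upwards (bottom-up activation), and a competent higher layer acts by
# using the facilities of the layers beneath it (top-down execution).

class Layer:
    def __init__(self, name, competent, handle):
        self.name = name
        self.competent = competent      # situation -> bool
        self.handle = handle            # (situation, lower_layers) -> action

def control(situation, layers):
    """layers[0] is the behaviour-based layer, layers[-1] the topmost."""
    for i, layer in enumerate(layers):
        if layer.competent(situation):
            return layer.handle(situation, layers[:i])   # top-down execution
        # otherwise: bottom-up activation, try the next layer up
    return None                                          # no layer can cope

# Illustrative layers for a delivery robot.
reactive = Layer("behaviour-based",
                 lambda s: "obstacle" in s,
                 lambda s, lower: "swerve")
planning = Layer("local planning",
                 lambda s: "goal" in s,
                 lambda s, lower: "execute next plan step via "
                                  + (lower[-1].name if lower else "?"))
cooperative = Layer("cooperative planning",
                    lambda s: True,
                    lambda s, lower: "negotiate with other agents")

layers = [reactive, planning, cooperative]
print(control({"obstacle"}, layers))                 # swerve
print(control({"goal"}, layers))                     # uses the layer below
print(control({"conflict with agent B"}, layers))    # handled at the top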
The internals of each layer are not important for the purposes of this chapter.
However, it is worth noting that each layer implements two general functions.
The first of these is a situation recognition and goal activation function. It maps a
knowledge base (one of the three layers) and current goals to a new set of goals.
The second function is responsible for planning and scheduling - it is responsi-
ble for selecting which plans to execute, based on the current plans, goals, and
knowledge base of that layer.
Layered architectures are currently the most popular general class of agent
architecture available. Layering represents a natural decomposition of function-
ality: it is easy to see how reactive, proactive, social behaviour can be generated by
the reactive, proactive, and social layers in an architecture. The main problem with
layered architectures is that while they are arguably a pragmatic solution, they
lack the conceptual and semantic clarity of unlayered approaches. In particular,
while logic-based approaches have a clear logical semantics, it is difficult to see
how such a semantics could be devised for a layered architecture. Another issue is that of interactions between layers. If each layer is an independent activity-producing process (as in TouringMachines), then it is necessary to consider all possible ways that the layers can interact with one another. This problem is partly alleviated in two-pass vertically layered architectures such as InteRRaP.
Exercises
(1) [Level 2.]
Develop a solution to the vacuum-world example described in Chapter 3 using Brooks's
subsumption architecture. How does it compare with the logic-based example?
(2) [Level 2.]
Try developing a solution to the Mars explorer example using the logic-based approach
described in Chapter 3. How does it compare with the reactive solution?
(3) [Level 3.]
In the programming language of your choice, implement the Mars explorer example
using the subsumption architecture. (To do this, you may find it useful to implement a
simple subsumption architecture 'shell' for programming different behaviours.) Investi-
gate the performance of the two approaches described, and see if you can do better.
(4) [Level 3.]
Using the simulator implemented for the preceding question, see what happens as you
increase the number of agents. Eventually, you should see that overcrowding leads to a
sub-optimal solution - agents spend too much time getting out of each other's way to get
any work done. Try to get around this problem by allowing agents to pass samples to each
other, thus implementing chains. (See the description in Ferber (1996, p. 305).)
Multiagent
Interactions
So far in this book, we have been focusing on the problem of how to build an
individual agent. Except in passing, we have not examined the issues associated
in putting these agents together. But there is a popular slogan in the multiagent
systems community:
There's no such thing as a single agent system.
The point of the slogan is that interacting systems, which used to be regarded as
rare and unusual beasts, are in fact the norm in the everyday computing world.
All but the most trivial of systems contain a number of sub-systems that must
interact with one another in order to successfully carry out their tasks. In this
chapter, I will start to change the emphasis of the book, from the problem of
'how to build an agent', to 'how to build an agent society'. I begin by defining
what we mean by a multiagent system.
Figure 6.1 (from Jennings (2000)) illustrates the typical structure of a multiagent
system. The system contains a number of agents, which interact with one another
through communication. The agents are able to act in an environment; different
agents have different 'spheres of influence', in the sense that they will have control
over - or at least be able to influence - different parts of the environment. These
spheres of influence may coincide in some cases. The fact that these spheres
of influence may coincide may give rise to dependency relationships between the
agents. For example, two robotic agents may both be able to move through a door -
but they may not be able to do so simultaneously. Finally, agents will also typically
be linked by other relationships. Examples might be 'power' relationships, where
one agent is the 'boss' of another.
[Figure 6.1, showing the typical structure of a multiagent system, appears here in the original: a number of agents, their spheres of influence within a shared environment, and the relationships between them.]
The most important lesson of this chapter - and perhaps one of the most
important lessons of multiagent systems generally - is that when faced with what
appears to be a multiagent domain, it is critically important to understand the
type of interaction that takes place between the agents. To see what I mean by
this, let us start with some notation.
that the agents have preferences over. To make this concrete, just think of these
as outcomes of a game that the two agents are playing.
We formally capture the preferences that the two agents have by means of
utility functions, one for each agent, which assign to every outcome a real number,
indicating how 'good' the outcome is for that agent. The larger the number the
better from the point of view of the agent with the utility function. Thus agent i's
preferences will be captured by a function u_i : Ω → ℝ, and agent j's preferences by a function u_j : Ω → ℝ.
(Compare with the discussion in Chapter 2 on tasks for agents.) It is not difficult to see that these utility functions lead to a preference ordering over outcomes. For example, if ω and ω′ are both possible outcomes in Ω, and u_i(ω) ≥ u_i(ω′), then outcome ω is preferred by agent i at least as much as ω′. We can introduce a bit more notation to capture this preference ordering. We write

ω ≽_i ω′

as an abbreviation for

u_i(ω) ≥ u_i(ω′).

Similarly, if u_i(ω) > u_i(ω′), then outcome ω is strictly preferred by agent i over ω′. We write

ω ≻_i ω′

as an abbreviation for

u_i(ω) > u_i(ω′).

In other words,

ω ≻_i ω′ if and only if ω ≽_i ω′ and not ω′ ≽_i ω.
We can see that the relation ≽_i really is an ordering over Ω, in that it has the following properties.
Reflexivity: for all ω ∈ Ω, we have that ω ≽_i ω.
Transitivity: if ω ≽_i ω′ and ω′ ≽_i ω″, then ω ≽_i ω″.
Comparability: for all ω ∈ Ω and ω′ ∈ Ω we have that either ω ≽_i ω′ or ω′ ≽_i ω.
The strict preference relation will satisfy the second and third of these properties, but will clearly not be reflexive.
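In code, a utility function over a finite set of outcomes is simply a mapping to numbers, and the two preference relations fall out of numeric comparison. The sketch below is illustrative Python; the outcome names and utility values are invented.

# Utility functions as dictionaries over a finite set of outcomes, and the
# preference relations they induce. Values below are invented.

omega = ["w1", "w2", "w3", "w4"]
u_i = {"w1": 1, "w2": 1, "w3": 4, "w4": 4}      # agent i's utility function

def weakly_prefers(u, w, w_prime):
    # w >=_i w'  iff  u_i(w) >= u_i(w')
    return u[w] >= u[w_prime]

def strictly_prefers(u, w, w_prime):
    # w >_i w'  iff  u_i(w) > u_i(w')
    return u[w] > u[w_prime]

# The induced weak preference relation is reflexive and comparable:
assert all(weakly_prefers(u_i, w, w) for w in omega)
assert all(weakly_prefers(u_i, w, v) or weakly_prefers(u_i, v, w)
           for w in omega for v in omega)
print(sorted(omega, key=lambda w: u_i[w], reverse=True))  # most to least preferred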
What is utility?
Undoubtedly the simplest way to think about utilities is as money; the more money, the better. But resist the temptation to think that this is all that utilities are. Utility functions are just a way of representing an agent's preferences.
They do not simply equate to money.
To see what I mean by this, suppose (and this really is a supposition) that I have US$500 million in the bank, while you are absolutely penniless. A rich benefactor
appears, with one million dollars, which he generously wishes to donate to one of
us. If the benefactor gives the money to me, what will the increase in the utility of
my situation be? Well, I have more money, so there will clearly be some increase in
the utility of my situation. But there will not be much: after all, there is not much
that you can do with US$501 million that you cannot do with US$500 million.
In contrast, if the benefactor gave the money to you, the increase in your utility
would be enormous; you go from having no money at all to being a millionaire.
That is a big difference.
This works the other way as well. Suppose I am in debt to the tune of US$500
million; well, there is frankly not that much difference in utility between owing
US$500 million and owing US$499 million; they are both pretty bad. In contrast,
there is a very big difference between being US$1 million in debt and not being
in debt at all. A graph of the relationship between utility and money is shown in
Figure 6.2.
In this environment, it does not matter what the agents do: the outcome will be the
same. Neither agent has any influence in such a scenario. We can also consider an
environment that is only sensitive to the actions performed by one of the agents.
In this environment, it does not matter what agent i does: the outcome depends solely on the action performed by j. If j chooses to defect, then outcome ω1 will result; if j chooses to cooperate, then outcome ω2 will result.
The interesting story begins when we put an environment together with the
preferences that agents have. To see what I mean by this, suppose we have the most general case, characterized by (6.1), where both agents are able to exert some
influence over the environment. Now let us suppose that the agents have utility
functions defined as follows:
Since we know that every different combination of choices by the agents is mapped to a different outcome, we can abuse notation somewhat by writing the following:
following:
We can then characterize agent i's preferences over the possible outcomes in the
following way:
C,C ≽_i C,D ≻_i D,C ≽_i D,D.
Now, consider the following question.
If you were agent i in this scenario, what would you choose to do -
cooperate or defect?
In this case (I hope), the answer is pretty unambiguous. Agent i prefers all the out-
comes in which it cooperates over all the outcomes in which it defects. Agent i's
choice is thus clear: it should cooperate. It does not matter what agent j chooses
to do.
In just the same way, agent j prefers all the outcomes in which it cooperates over all the outcomes in which it defects. Notice that in this scenario, neither agent
has to expend any effort worrying about what the other agent will do: the action
it should perform does not depend in any way on what the other does.
If both agents in this scenario act rationally, that is, they both choose to perform
the action that will lead to their preferred outcomes, then the 'joint' action selected
will be C, C: both agents will cooperate.
Now suppose that, for the same environment, the agents' utility functions were
as follows:
Agent i's preferences over the possible outcomes are thus as follows:
In this scenario, agent i can do no better than to defect. The agent prefers all
the outcomes in which it defects over all the outcomes in which it cooperates.
Similarly, agent j can do no better than defect: it also prefers all the outcomes in
which it defects over all the outcomes in which it cooperates. Once again, the
agents do not need to engage in strategic thinking (worrying about what the
other agent will do): the best action to perform is entirely independent of the
other agent's choice. I emphasize that in most multiagent scenarios, the choice
an agent should make is not so clear cut; indeed, most are much more diffi-
cult.
[A payoff matrix appears here in the original, with columns labelled 'i defects' and 'i cooperates' and rows giving agent j's corresponding choices.]
The way to read such a payoff matrix is as follows. Each of the four cells in the
matrix corresponds to one of the four possible outcomes. For example, the top-
right cell corresponds to the outcome in which i cooperates and j defects; the
bottom-left cell corresponds to the outcome in which i defects and j cooperates.
The payoffs received by the two agents are written in the cell. The value in the
top right of each cell is the payoff received by player i (the column player), while
the value in the bottom left of each cell is the payoff received by agent j (the
row player). As payoff matrices are standard in the literature, and are a much
more succinct notation than the alternatives, we will use them as standard in the
remainder of this chapter.
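In code, a payoff matrix is conveniently represented as a mapping from joint strategy choices to pairs of payoffs. The sketch below is an illustrative Python fragment; the payoff numbers are invented, and each agent's payoff and preference ordering are simply read off the table.

# A payoff matrix as a dictionary from (i's strategy, j's strategy) to the
# pair (i's payoff, j's payoff). The numbers are invented for illustration.

C, D = "cooperate", "defect"
payoff = {
    (C, C): (4, 4), (C, D): (1, 3),
    (D, C): (3, 1), (D, D): (2, 2),
}

def u_i(si, sj): return payoff[(si, sj)][0]
def u_j(si, sj): return payoff[(si, sj)][1]

# Agent i's preference ordering over the four outcomes, best first:
outcomes = sorted(payoff, key=lambda o: u_i(*o), reverse=True)
print(outcomes)    # ordered by i's payoff: (C,C), (D,C), (D,D), (C,D)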
Before proceeding to consider any specific examples of multiagent encounter,
let us introduce some of the theory that underpins the kind of analysis we have
informally discussed above.
Then Ω1 strongly dominates Ω2, since ω1 ≻_i ω3, ω1 ≻_i ω4, ω2 ≻_i ω3, and ω2 ≻_i ω4. However, Ω2 does not strongly dominate Ω1, since (for example) it is not the case that ω3 ≻_i ω1.
Formally, a set of outcomes Ω1 strongly dominates a set of outcomes Ω2 if every outcome in Ω1 is strictly preferred over every outcome in Ω2, that is, if the following condition is true:

∀ω ∈ Ω1, ∀ω′ ∈ Ω2: ω ≻_i ω′.
Now, in order to bring ourselves in line with the game-theory literature, we will
start referring to actions (members of the set Ac) as strategies. Given any par-
ticular strategy s for an agent i in a multiagent interaction scenario, there will
be a number of possible outcomes. Let us denote by s* the outcomes that may
arise by i playing strategy s. For example, referring to the example environment
in Equation (6.1), from agent i's point of view we have C* = {ω3, ω4}, while D* = {ω1, ω2}.
Now, we will say a strategy s1 dominates a strategy s2 if the set of outcomes possible by playing s1 dominates the set possible by playing s2, that is, if s1* dominates s2*. Again, referring back to the example of (6.5), it should be clear that, for agent i, cooperate strongly dominates defect. Indeed, as there are only two strate-
gies available, the cooperate strategy is dominant: it is not dominated by any other
strategy. The presence of a dominant strategy makes the decision about what to
do extremely easy: the agent guarantees its best outcome by performing the dom-
inant strategy. In following a dominant strategy, an agent guarantees itself the
best possible payoff.
Another way of looking at dominance is that if a strategy s is dominated by
another strategy s', then a rational agent will not follow s (because it can guaran-
tee to do better with s'). When considering what to do, this allows us to delete
dominated strategies from our consideration, simplifying the analysis consid-
erably. The idea is to iteratively consider each strategy s in turn, and if there
is another remaining strategy that strongly dominates it, then delete strategy s
from consideration. If we end up with a single strategy remaining, then this will be
the dominant strategy, and is clearly the rational choice. Unfortunately, for many
interaction scenarios, there will not be a strongly dominant strategy; after delet-
ing strongly dominated strategies, we may find more than one strategy remaining.
What to do then? Well, we can start to delete weakly dominated strategies. A strat-
egy s1 is said to weakly dominate strategy s2 if every outcome in s1* is preferred at least as much as every outcome in s2*. The problem is that if a strategy is only weakly
dominated, then it is not necessarily irrational to use it; in deleting weakly domi-
nated strategies, we may therefore 'throw away' a strategy that would in fact have
been useful to use. We will not take this discussion further; see the Notes and
Further Reading section at the end of this chapter for pointers to the literature.
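Iterated elimination of strongly dominated strategies is easy to mechanize. The fragment below is an illustrative Python sketch that follows the set-based notion of strong dominance used above: for an agent, one strategy strongly dominates another when its worst possible outcome is strictly preferred to the other strategy's best possible outcome. The payoff numbers are invented so that cooperation strongly dominates for both agents.

# Iterated elimination of strongly dominated strategies, using the
# set-based notion of dominance from the text: s1 strongly dominates s2
# (for an agent) when that agent's worst payoff playing s1 is strictly
# greater than its best payoff playing s2. Payoff numbers are invented.

C, D = "C", "D"
payoff = {(C, C): (4, 4), (C, D): (3, 2),     # (i's payoff, j's payoff)
          (D, C): (2, 3), (D, D): (1, 1)}

def strongly_dominates(s1, s2, strategies_other, who):
    mine = lambda s, t: payoff[(s, t) if who == 0 else (t, s)][who]
    return (min(mine(s1, t) for t in strategies_other)
            > max(mine(s2, t) for t in strategies_other))

def eliminate(strategies_i, strategies_j):
    changed = True
    while changed:
        changed = False
        for who, mine, other in ((0, strategies_i, strategies_j),
                                 (1, strategies_j, strategies_i)):
            for s in list(mine):
                if any(strongly_dominates(s2, s, other, who)
                       for s2 in mine if s2 != s):
                    mine.remove(s)
                    changed = True
    return strategies_i, strategies_j

print(eliminate([C, D], [C, D]))    # (['C'], ['C'])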
The next notion we shall discuss is one of the most important concepts in
the game-theory literature, and in turn is one of the most important concepts
in analysing multiagent systems. The notion is that of equilibrium and, more specifically, of Nash equilibrium: two strategies s1 and s2 (one for agent i, one for agent j) are in Nash equilibrium if, under the assumption that i plays s1, j can do no better than play s2, and, under the assumption that j plays s2, i can do no better than play s1.
The preferences of the players are thus diametrically opposed to one another: one
agent can only improve its lot (i.e. get a more preferred outcome) at the expense of
the other. An interaction scenario that satisfies this property is said to be strictly
competitive, for hopefully obvious reasons.
114 Multiagent Interactions
Zero-sum encounters are those in which, for any particular outcome, the utilities
of the two agents sum to zero. Formally, a scenario is said to be zero sum if the
following condition is satisfied: u_i(ω) + u_j(ω) = 0 for all ω ∈ Ω.
It should be easy to see that any zero-sum scenario is strictly competitive. Zero-
sum encounters are important because they are the most 'vicious' types of
encounter conceivable, allowing for no possibility of cooperative behaviour. If
you allow your opponent positive utility, then this means that you get negative
utility - intuitively, you are worse off than you were before the interaction.
Games such as chess and chequers are the most obvious examples of strictly
competitive interactions. Indeed, any game in which the possible outcomes are
win or lose will be strictly competitive. Outside these rather abstract settings,
however, it is hard to think of real-world examples of zero-sum encounters. War
might be cited as a zero-sum interaction between nations, but even in the most
extreme wars, there will usually be at least some common interest between the
participants (e.g. in ensuring that the planet survives). Perhaps games like chess -
which are a highly stylized form of interaction - are the only real-world examples
of zero-sum encounters.
For these reasons, some social scientists are sceptical about whether zero-sum
games exist in real-world scenarios (Zagare, 1984, p. 22). Interestingly, however,
people interacting in many scenarios have a tendency to treat them as if they were
zero sum. Below, we will see that in some scenarios - where there is the possibility
of mutually beneficial cooperation - this type of behaviour can be damaging.
Enough abstract theory! Let us now apply this theory to some actual multiagent
scenarios. First, let us consider what is perhaps the best-known scenario: the pris-
oner's dilemma.
                       i defects     i cooperates
    j defects            2, 2            5, 0
    j cooperates         0, 5            3, 3

(In each cell, the first number is the payoff received by agent j, the second the payoff received by agent i.)
Note that the numbers in the payoff matrix do not refer to years in prison. They capture how good an outcome is for the agents - the shorter the jail term, the better. In other words, the utilities are

u_i(D, D) = 2,  u_i(D, C) = 5,  u_i(C, D) = 0,  u_i(C, C) = 3,
u_j(D, D) = 2,  u_j(D, C) = 0,  u_j(C, D) = 5,  u_j(C, C) = 3,

writing agent i's choice first in each pair.
What should a prisoner do? The answer is not as clear cut as the previous examples
we looked at. It is not the case that a prisoner prefers all the outcomes in which it cooperates over all the outcomes in which it defects. Similarly, it is not the case
that a prisoner prefers all the outcomes in which it defects over all the outcomes
in which it cooperates.
The 'standard' approach to this problem is to put yourself in the place of a
prisoner, i say, and reason as follows.
Suppose I cooperate. Then if j cooperates, we will both get a payoff of 3.
But if j defects, then I will get a payoff of 0. So the best payoff I can be
guaranteed to get if I cooperate is 0.
Suppose I defect. Then if j cooperates, then I get a payoff of 5, whereas if j defects, then I will get a payoff of 2. So the best payoff I can be guaranteed to get if I defect is 2.
So, if I cooperate, the worst case is I will get a payoff of 0, whereas if I defect,
the worst case is that I will get 2.
Since the scenario is symmetric (i.e. both agents reason the same way), then the
outcome that will emerge - if both agents reason 'rationally' - is that both agents
will defect, giving them each a payoff of 2.
Notice that neither strategy strongly dominates in this scenario, so our first
route to finding a choice of strategy is not going to work. Turning to Nash equi-
libria, there is a single Nash equilibrium of D, D. Thus under the assumption that
i will play D, j can do no better than play D, and under the assumption that j will
play D, i can also do no better than play D.
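This equilibrium analysis is easy to check mechanically. The sketch below (illustrative Python) enumerates the pure strategy profiles of the prisoner's dilemma payoffs used in the text and tests the Nash condition directly: neither agent can improve its payoff by unilaterally switching strategy.

# Find the pure-strategy Nash equilibria of the prisoner's dilemma matrix
# used in the text (mutual defection 2, mutual cooperation 3, temptation 5,
# sucker 0). Profiles are written (i's choice, j's choice).

C, D = "C", "D"
u = {(D, D): (2, 2), (D, C): (5, 0),
     (C, D): (0, 5), (C, C): (3, 3)}      # (i's payoff, j's payoff)

def is_nash(si, sj):
    best_i = all(u[(si, sj)][0] >= u[(alt, sj)][0] for alt in (C, D))
    best_j = all(u[(si, sj)][1] >= u[(si, alt)][1] for alt in (C, D))
    return best_i and best_j

print([profile for profile in u if is_nash(*profile)])   # [('D', 'D')]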
Is this the best they can do? Naive intuition says not. Surely if they both coop-
erated, then they could do better - they would receive a payoff of 3. But if you
assume the other agent will cooperate, then the rational thing to do - the thing that maximizes your utility - is to defect. The conclusion seems inescapable: the rational thing to do in the prisoner's dilemma is defect, even though this appears
to 'waste' some utility. (The fact that our naive intuition tells us that utility appears
to be wasted here, and that the agents could do better by cooperating, even though
the rational thing to do is to defect, is why this is referred to as a dilemma.)
The prisoner's dilemma may seem an abstract problem, but it turns out to be
very common indeed. In the real world, the prisoner's dilemma appears in situa-
tions ranging from nuclear weapons treaty compliance to negotiating with one's
children. Consider the problem of nuclear weapons treaty compliance. Two coun-
tries i and j have signed a treaty to dispose of their nuclear weapons. Each country
can then either cooperate (i.e. get rid of their weapons), or defect (i.e. keep their
weapons). But if you get rid of your weapons, you run the risk that the other
side keeps theirs, making them very well off while you suffer what is called the
'sucker's payoff'. In contrast, if you keep yours, then the possible outcomes are
that you will have nuclear weapons while the other country does not (a very good
outcome for you), or else at worst that you both retain your weapons. This may
not be the best possible outcome, but is certainly better than you giving up your
weapons while your opponent kept theirs, which is what you risk if you give up
your weapons.
Many people find the conclusion of this analysis - that the rational thing to
do in the prisoner's dilemma is defect - deeply upsetting. For the result seems
to imply that cooperation can only arise as a result of irrational behaviour, and
that cooperative behaviour can be exploited by those who behave rationally. The
apparent conclusion is that nature really is 'red in tooth and claw'. Particularly
for those who are inclined to a liberal view of the world, this is unsettling and
perhaps even distasteful. As civilized beings, we tend to pride ourselves on some-
how 'rising above' the other animals in the world, and believe that we are capable
of nobler behaviour: to argue in favour of such an analysis is therefore somehow
immoral, and even demeaning to the entire human race.
Naturally enough, there have been several attempts to respond to this analy-
sis of the prisoner's dilemma, in order to 'recover' cooperation (Binmore, 1992,
pp. 355-382).
are sufficiently aligned, they will both recognize the benefits of cooperation, and
behave accordingly. The answer to this is that it implies there are not actually two
prisoners playing the game. If I can make my twin select a course of action simply
by 'thinking it', then we are not playing the prisoner's dilemma at all.
This 'fallacy of the twins' argument often takes the form 'what if everyone were
to behave like that' (Binmore, 1992, p. 311). The answer (as Yossarian pointed out
in Joseph Heller's Catch 22) is that if everyone else behaved like that, you would
be a damn fool to behave any other way.
So, if you play the prisoner's dilemma game indefinitely, then cooperation is a
rational outcome (Binmore, 1992, p. 358). The 'shadow of the future' encourages
us to cooperate in the infinitely repeated prisoner's dilemma game.
This seems to be very good news indeed, as truly one-shot games are compar-
atively scarce in real life. When we interact with someone, then there is often a
good chance that we will interact with them in the future, and rational cooperation
begins to look possible. However, there is a catch.
Suppose you agree to play the iterated prisoner's dilemma a fixed number of
times (say 100). You need to decide (presumably in advance) what your strategy
for playing the game will be. Consider the last round (i.e. the 100th game). Now,
on this round, you know - as does your opponent - that you will not be interacting
again. In other words, the last round is in effect a one-shot prisoner's dilemma
game. As we know from the analysis above, the rational thing to do in a one-
shot prisoner's dilemma game is defect. Your opponent, as a rational agent, will
presumably reason likewise, and will also defect. On the 100th round, therefore,
you will both defect. But this means that the last 'real' round, is 99. But similar
reasoning leads us to the conclusion that this round will also be treated in effect
like a one-shot prisoner's dilemma, and so on. Continuing this backwards induc-
tion leads inevitably to the conclusion that, in the iterated prisoner's dilemma
with a fixed, predetermined, commonly known number of rounds, defection is
the dominant strategy, as in the one-shot version (Binmore, 1992, p. 354).
Whereas it seemed to be very good news that rational cooperation is possible in
the iterated prisoner's dilemma with an infinite number of rounds, it seems to be
very bad news that this possibility appears to evaporate if we restrict ourselves
to repeating the game a predetermined, fixed number of times. Returning to the
real-world, we know that in reality, we will only interact with our opponents a
finite number of times (after all, one day the world will end). We appear to be
back where we started.
The story is actually better than it might at first appear, for several reasons.
The first is that actually playing the game an infinite number of times is not
necessary. As long as the 'shadow of the future' looms sufficiently large, then it
can encourage cooperation. So, rational cooperation can become possible if both
players know, with sufficient probability, that they will meet and play the game
again in the future.
The second reason is that, even though a cooperative agent can suffer when playing against a defecting opponent, it can do well overall provided it gets sufficient opportunity to interact with other cooperative agents. To understand how this idea works, we will now turn to one of the best-known pieces of multiagent systems research: Axelrod's prisoner's dilemma tournament.
Axelrod's tournament
Robert Axelrod was (indeed, is) a political scientist interested in how cooperation can arise in societies of self-interested agents. In 1980, he organized a public tournament in which researchers were invited to submit computer programs to play the iterated prisoner's dilemma against one another in a round-robin fashion. Among the strategies entered were ALL-D, which simply defects on every round, and TIT-FOR-TAT, which cooperates on the first round and thereafter does whatever its opponent did on the previous round. TIT-FOR-TAT - the simplest program entered - won the tournament. Note, however, that in any single pairwise game, TIT-FOR-TAT can never outscore its opponent: at best it can draw, or do slightly worse than this. TIT-FOR-TAT won because the overall score was computed by taking
into account all the strategies that it played against. The result when TIT-FOR-
TAT was played against ALL-D was exactly as might be expected: ALL-D came
out on top. Many people have misinterpreted these results as meaning that TIT-
FOR-TAT is the optimal strategy in the iterated prisoner's dilemma. You should
be careful not to interpret Axelrod's results in this way. TIT-FOR-TAT was able to succeed because it had the opportunity to play against other programs that were also inclined to cooperate. Provided the environment in which TIT-FOR-TAT plays contains sufficient opportunity to interact with other 'like-minded' strategies, TIT-FOR-TAT can prosper. The TIT-FOR-TAT strategy will not prosper if it is forced to interact with strategies that tend to defect.
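To make the point concrete, here is a minimal sketch (in Python) of a small round-robin iterated prisoner's dilemma tournament in the spirit of Axelrod's experiment. The payoff values, the number of rounds, and the particular set of entrants (ALL-D, ALL-C, TIT-FOR-TAT, and a simple 'grudger') are illustrative choices of mine, not Axelrod's actual setup; the point is only that TIT-FOR-TAT can lose its head-to-head encounter with ALL-D and still finish at or near the top of the overall standings when enough cooperatively inclined strategies are present.

# A minimal round-robin iterated prisoner's dilemma tournament in the spirit of
# Axelrod's experiment. Payoffs, round count, and the set of entrants are illustrative.
from itertools import combinations_with_replacement

PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0,    # (my move, their move) -> my payoff
          ('D', 'C'): 5, ('D', 'D'): 1}

def all_d(mine, theirs):        return 'D'                            # always defect
def all_c(mine, theirs):        return 'C'                            # always cooperate
def tit_for_tat(mine, theirs):  return theirs[-1] if theirs else 'C'  # copy opponent's last move
def grudger(mine, theirs):      return 'D' if 'D' in theirs else 'C'  # punish any defection forever

def play(strat_a, strat_b, rounds=200):
    """Play two strategies against each other; return their total scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_a, hist_b), strat_b(hist_b, hist_a)
        score_a, score_b = score_a + PAYOFF[(a, b)], score_b + PAYOFF[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

entrants = {'ALL-D': all_d, 'ALL-C': all_c,
            'TIT-FOR-TAT': tit_for_tat, 'GRUDGER': grudger}
totals = {name: 0 for name in entrants}
for name_a, name_b in combinations_with_replacement(entrants, 2):
    score_a, score_b = play(entrants[name_a], entrants[name_b])
    totals[name_a] += score_a
    if name_a != name_b:            # count self-play once only
        totals[name_b] += score_b

print(totals)   # TIT-FOR-TAT never beats ALL-D head to head, yet finishes at or near the top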
Axelrod attempted to characterize the reasons for the success of TIT-FOR-TAT,
and came up with the following four rules for success in the iterated prisoner's
dilemma.
(1) Do not be envious. In the prisoner's dilemma, it is not necessary for you to
'beat' your opponent in order for you to do well.
(2) Do not be the first to defect. Axelrod refers to a program as 'nice' if it starts
by cooperating. He found that whether or not a rule was nice was the single best
predictor of success in his tournaments. There is clearly a risk in starting with
cooperation. But the loss of utility associated with receiving the sucker's payoff
on the first round will be comparatively small compared with possible benefits
of mutual cooperation with another nice strategy.
(3) Reciprocate cooperation and defection. As Axelrod puts it, 'TIT-FOR-TAT
represents a balance between punishing and being forgiving' (Axelrod, 1984,
p. 119): the combination of punishing defection and rewarding cooperation
seems to encourage cooperation. Although TIT-FOR-TAT can be exploited on
the first round, it retaliates relentlessly for such non-cooperative behaviour.
Moreover, TIT-FOR-TAT punishes with exactly the same degree of violence that
it was the recipient of: in other words, it never 'overreacts' to defection. In addi-
tion, because TIT-FOR-TAT is forgiving (it rewards cooperation), it is possible
for cooperation to become established even following a poor start.
(4) Do not be too clever. As noted above, TIT-FOR-TAT was the simplest pro-
gram entered into Axelrod's competition. Either surprisingly or not, depending
on your point of view, it fared significantly better than other programs that
attempted to make use of comparatively advanced programming techniques in
order to decide what to do. Axelrod suggests three reasons for this:
(a) the most complex entries attempted to develop a model of the behaviour of
the other agent while ignoring the fact that this agent was in turn watching
the original agent - they lacked a model of the reciprocal learning that
actually takes place;
(b) most complex entries over-generalized when seeing their opponent defect,
and did not allow for the fact that cooperation was still possible in the
future - they were not forgiving;
(c) many complex entries exhibited behaviour that was too complex to be
understood - to their opponent, they may as well have been acting ran-
domly.
From the amount of space we have devoted to discussing it, you might assume
that the prisoner's dilemma was the only type of multiagent interaction there is.
This is not the case.
The prisoner's dilemma preference ordering is just one of the possible orderings of outcomes that agents may have. If we restrict our attention to interactions in which there are two agents, each agent has two possible actions (C or D), and the scenario is symmetric, then there are 4! = 24 possible orderings of preferences, which for completeness I have summarized in Table 6.1. (In the game-theory literature, these are referred to as symmetric 2 × 2 games.)
In many of these scenarios, what an agent should do is clear-cut. For example,
agent i should clearly cooperate in scenarios (1) and (2), as both of the outcomes in which i cooperates are preferred over both of the outcomes in which i defects. Similarly, in scenarios (23) and (24), agent i should clearly defect, as both outcomes in which it defects are preferred over both outcomes in which it cooperates. Scenario (14) is the prisoner's dilemma, which we have already discussed at
length, which leaves us with two other interesting cases to examine: the stag hunt
and the game of chicken.
Table 6.1 The possible preferences that agent i can have in symmetric interaction sce-
narios where there are two agents, each of which has two available actions, C (cooperate)
and D (defect); recall that X, Y means the outcome in which agent i plays X and agent j
plays Y.
The stag hunt
This game is closely related to the prisoner's dilemma. The difference is that, in the stag hunt, mutual cooperation is your most preferred outcome, rather than you defecting while your opponent cooperates. Expressing the game in a payoff matrix (picking rather arbitrary payoffs to give the preferences; the payoff to i is listed first in each cell):

                 i defects    i cooperates
  j defects        1, 1          0, 2
  j cooperates     2, 0          3, 3
It should be clear that there are two Nash equilibria in this game: mutual defec-
tion, or mutual cooperation. If you trust your opponent, and believe that he will
cooperate, then you can do no better than cooperate, and vice versa, your oppo-
nent can also do no better than cooperate. Conversely, if you believe your oppo-
nent will defect, then you can do no better than defect yourself, and vice versa.
Poundstone suggests that 'mutiny' scenarios are examples of the stag hunt:
'We'd all be better off if we got rid of Captain Bligh, but we'll be hung as mutineers
if not enough of us go along' (Poundstone, 1992, p. 220).
The game of chicken
As with the stag hunt, this game is also closely related to the prisoner's dilemma.
The difference here is that mutual defection is agent i's most feared outcome,
rather than i cooperating while j defects. The game of chicken gets its name
from a rather silly, macho 'game' that was supposedly popular amongst juvenile
delinquents in 1950s America; the game was immortalized by James Dean in the
film Rebel Without a Cause. The purpose of the game is to establish who is bravest
out of two young thugs. The game is played by both players driving their cars at
high speed towards a cliff. The idea is that the least brave of the two (the 'chicken')
will be the first to drop out of the game by steering away from the cliff. The winner
is the one who lasts longest in the car. Of course, if neither player steers away,
then both cars fly off the cliff, taking their foolish passengers to a fiery death on
the rocks that undoubtedly lie below.
So, how should agent i play this game? It depends on how brave (or foolish) i
believes j is. If i believes that j is braver than i, then i would do best to steer away
from the cliff (i.e. cooperate), since it is unlikely that j will steer away from the
cliff. However, if i believes that j is less brave than i, then i should stay in the car;
because j is less brave, he will steer away first, allowing i to win. The difficulty
arises when both agents mistakenly believe that the other is less brave; in this
case, both agents will stay in their car (i.e. defect), and the worst outcome arises.
Expressed as a payoff matrix (again picking rather arbitrary payoffs to realize these preferences; the payoff to i is listed first in each cell):

                 i defects    i cooperates
  j defects        0, 0          1, 3
  j cooperates     3, 1          2, 2
It should be clear that the game of chicken has two Nash equilibria, corresponding to the above-right and below-left cells. Thus if you believe that your opponent is going to drive straight (i.e. defect), then you can do no better than to steer away from the cliff, and vice versa. Similarly, if you believe your opponent is going to steer away, then you can do no better than to drive straight.
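As a small illustration of how such equilibria can be found mechanically, the following sketch enumerates the pure-strategy Nash equilibria of a two-player, two-action game. The payoff numbers are simply chosen to realize the chicken preference ordering discussed above; the function itself works for any 2 × 2 payoff table.

# Enumerate the pure-strategy Nash equilibria of a two-player, two-action game.
# payoff[(i_action, j_action)] gives (payoff to i, payoff to j); the numbers below
# are chosen to realize the chicken preference ordering.

ACTIONS = ('C', 'D')   # cooperate (steer away) or defect (drive straight)

chicken = {('D', 'D'): (0, 0), ('C', 'D'): (1, 3),
           ('D', 'C'): (3, 1), ('C', 'C'): (2, 2)}

def pure_nash_equilibria(payoff):
    """Return outcomes from which neither agent can profitably deviate on its own."""
    equilibria = []
    for ai in ACTIONS:
        for aj in ACTIONS:
            ui, uj = payoff[(ai, aj)]
            i_ok = all(payoff[(alt, aj)][0] <= ui for alt in ACTIONS)   # i cannot do better
            j_ok = all(payoff[(ai, alt)][1] <= uj for alt in ACTIONS)   # j cannot do better
            if i_ok and j_ok:
                equilibria.append((ai, aj))
    return equilibria

print(pure_nash_equilibria(chicken))   # [('C', 'D'), ('D', 'C')]: one swerves, one drives straight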
Dependence Relations in Multiagent Systems
One further way of analysing interactions between agents is in terms of the dependence relations that hold between them: the extent to which one agent relies on another for the achievement of its goals. Several kinds of dependence relation can be distinguished, including the following.
Unilateral. One agent depends on the other, but not vice versa.
Mutual. Both agents depend on each other with respect to the same goal.
Reciprocal dependence. The first agent depends on the other for some goal,
while the second also depends on the first for some goal (the two goals are not
necessarily the same). Note that mutual dependence implies reciprocal depen-
dence.
These relationships may be qualified by whether or not they are locally believed
or mutually believed. There is a locally believed dependence if one agent believes
the dependence exists, but does not believe that the other agent believes it exists.
A mutually believed dependence exists when the agent believes the dependence
exists, and also believes that the other agent is aware of it. Sichman and colleagues
implemented a social reasoning system called DepNet (Sichman et al., 1994). Given
a description of a multiagent system, DepNet was capable of computing the rela-
tionships that existed between agents in the system.
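The details of DepNet are beyond the scope of this chapter, but the following is a much-simplified sketch of the kind of computation involved. The representation of goals, actions, and agents, and the rule 'i depends on j for goal g if i wants g, lacks some action that g requires, and j has that action', are my own simplifications rather than Sichman et al.'s actual model; the names used are invented.

# A much-simplified, DepNet-style dependence check: agent i depends on agent j for
# goal g if i wants g, i cannot perform some action that achieving g requires, and
# j can perform it. Goals, actions, and agent names below are invented.
from itertools import permutations

requires = {'deliver_child': {'drive'}, 'fix_car': {'repair'}}   # goal -> actions needed
agents = {
    'alice': {'goals': {'deliver_child'}, 'actions': {'repair'}},
    'bob':   {'goals': {'fix_car'},       'actions': {'drive'}},
}

def depends_on(i, j, goal):
    """Does agent i depend on agent j for this goal?"""
    missing = requires[goal] - agents[i]['actions']          # actions i cannot perform itself
    return goal in agents[i]['goals'] and bool(missing & agents[j]['actions'])

for i, j in permutations(agents, 2):
    deps_ij = {g for g in agents[i]['goals'] if depends_on(i, j, g)}
    deps_ji = {g for g in agents[j]['goals'] if depends_on(j, i, g)}
    if deps_ij & deps_ji:
        kind = 'mutual'          # both depend on each other for the same goal
    elif deps_ij and deps_ji:
        kind = 'reciprocal'      # both depend on each other, but for different goals
    elif deps_ij:
        kind = 'unilateral'
    else:
        continue
    print(f'{i} depends on {j} ({kind}) for {sorted(deps_ij)}')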
Exercises
(1) [Level 1.]
Consider four sets of outcomes, Ω1, Ω2, Ω3 and Ω4, together with an agent's preferences over the outcomes they contain.
Which of these sets (if any) dominates the others? Where neither set dominates the other, indicate this.
(2) [Level 2.]
Consider the following interaction scenarios, each given as a payoff matrix in the format used earlier in the chapter (the payoff to i is listed first in each cell; a ? marks an entry that could not be recovered):

(i)
                 i defects    i cooperates
  j defects        3, 3          4, 2
  j cooperates     1, 1          2, 4

(ii)
                 i defects    i cooperates
  j defects       -1, -1         2, 1
  j cooperates    -1, 2          ?, -1

(iii)
                 i defects    i cooperates
  j defects        3, 3          4, 2
  j cooperates     ?, 1          2, 4
(3) [Class discussion.]
This is best done as a class exercise, in groups of three: play the prisoner's dilemma.
Use one of the three as 'umpire', to keep track of progress and scores, and to stop any
outbreaks of violence. First try playing the one-shot game a few times, and then try the
iterated version, first for an agreed, predetermined number of times, and then allowing
the umpire to choose how many times to iterate without telling the players.
Which strategies do best in the one-shot and iterated prisoner's dilemma?
Try playing people against strategies such as TIT-FOR-TAT, and ALL-D.
Try getting people to define their strategy precisely in advance (by writing it down),
and then see if you can determine their strategy while playing the game; distribute
their strategy, and see if it can be exploited.
When agents are negotiating on our behalf, it is not enough simply to have agents that get the best
outcome in theory - they must be able to obtain the best outcome in practice.
In the remainder of this chapter, I will discuss the process of reaching agree-
ments through negotiation and argumentation. I will start by considering the issue
of mechanism design - broadly, what properties we might want a negotiation or
argumentation protocol to have - and then go on to discuss auctions, negotiation
protocols and strategies, and finally argumentation.
Mechanism Design
As noted above, mechanism design is the design of protocols for governing multi-
agent interactions, such that these protocols have certain desirable properties.
When we design 'conventional' communication protocols, we typically aim to
design them so that (for example) they are provably free of deadlocks, live-
locks, and so on (Holzmann, 1991). In multiagent systems, we are still con-
cerned with such issues of course, but for negotiation protocols, the properties we
would like to prove are slightly different. Possible properties include, for example
(Sandholm, 1999, p. 204), the following.
Guaranteed success. A protocol guarantees success if it ensures that, eventually,
agreement is certain to be reached.
Maximizing social welfare. Intuitively, a protocol maximizes social welfare if it
ensures that any outcome maximizes the sum of the utilities of negotiation par-
ticipants. If the utility of an outcome for an agent was simply defined in terms
of the amount of money that agent received in the outcome, then a protocol that
maximized social welfare would maximize the total amount of money 'paid out'.
Pareto efficiency. A negotiation outcome is said to be pareto efficient if there is
no other outcome that will make at least one agent better off without making
at least one other agent worse off. Intuitively, if a negotiation outcome is not
pareto efficient, then there is another outcome that will make at least one agent
happier while keeping everyone else at least as happy.
Individual rationality. A protocol is said to be individually rational if following
the protocol - 'playing by the rules' - is in the best interests of negotiation
participants. Individually rational protocols are essential because without them,
there is no incentive for agents to engage in negotiations.
Stability. A protocol is stable if it provides all agents with an incentive to behave
in a particular way. The best-known kind of stability is Nash equilibrium, as
discussed in the preceding chapter.
Simplicity. A 'simple' protocol is one that makes the appropriate strategy for
a negotiation participant 'obvious'. That is, a protocol is simple if using it, a
participant can easily (tractably) determine the optimal strategy.
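Two of these properties, maximizing social welfare and pareto efficiency, are easy to check mechanically when the set of candidate outcomes is small and the agents' utilities are known. The following sketch does exactly that; the outcomes and utility numbers are invented purely for illustration.

# Checking two of the properties above over a finite set of candidate outcomes:
# which outcome maximizes social welfare, and which outcomes are pareto efficient?
# The outcomes and utility numbers are purely illustrative.

utilities = {              # outcome -> (utility to agent 1, utility to agent 2)
    'deal_a': (4, 1),
    'deal_b': (2, 2),
    'deal_c': (1, 1),      # dominated by deal_b, so not pareto efficient
}

def social_welfare(outcome):
    return sum(utilities[outcome])

def pareto_efficient(outcome):
    """True if no other outcome makes someone better off without making someone worse off."""
    u = utilities[outcome]
    return not any(
        all(v >= w for v, w in zip(utilities[other], u)) and
        any(v > w for v, w in zip(utilities[other], u))
        for other in utilities if other != outcome)

print('maximizes social welfare:', max(utilities, key=social_welfare))          # deal_a
print('pareto efficient:', [o for o in utilities if pareto_efficient(o)])       # deal_a, deal_b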
Auctions
Auctions used to be comparatively rare in everyday life; every now and then, one
would hear of astronomical sums paid at auction for a painting by Monet or Van
Gogh, but other than this, they did not enter the lives of the majority. The Internet
and Web fundamentally changed this. The Web made it possible for auctions with a
large, international audience to be carried out at very low cost. This in turn made
it possible for goods to be put up for auction which hitherto would have been
too uneconomical. Large businesses have sprung up around the idea of online
auctions, with eBay being perhaps the best-known example (eBay, 2001).
One of the reasons why online auctions have become so popular is that auctions
are extremely simple interaction scenarios. This means that it is easy to automate
auctions; this makes them a good first choice for consideration as a way for agents
to reach agreements. Despite their simplicity, auctions present both a rich collec-
tion of problems for researchers, and a powerful tool that automated agents can
use for allocating goods, tasks, and resources.
Abstractly, an auction takes place between an agent known as the auctioneer
and a collection of agents known as the bidders. The goal of the auction is for the
auctioneer to allocate the good to one of the bidders. In most settings - and cer-
tainly most traditional auction settings - the auctioneer desires to maximize the
price at which the good is allocated, while bidders desire to minimize price. The
auctioneer will attempt to achieve his desire through the design of an appropriate
auction mechanism - the rules of encounter - while bidders attempt to achieve
their desires by using a strategy that will conform to the rules of encounter, but
that will also deliver an optimal result.
There are several factors that can affect both the protocol and the strategy that
agents use. The most important of these is whether the good for auction has a
private or a public/common value. Consider an auction for a one dollar bill. How
much is this dollar bill worth to you? Assuming it is a 'typical' dollar bill, then
it should be worth exactly $1; if you paid $2 for it, you would be $1 worse off
than you were. The same goes for anyone else involved in this auction. A typical
dollar bill thus has a common value: it is worth exactly the same to all bidders in
the auction. However, suppose you were a big fan of the Beatles, and the dollar
bill happened to be the last dollar bill that John Lennon spent. Then it may well
be that, for sentimental reasons, this dollar bill was worth considerably more to
you - you might be willing to pay $100 for it. To a fan of the Rolling Stones, with
no interest in or liking for the Beatles, however, the bill might not have the same
value. Someone with no interest in the Beatles whatsoever might value the one
dollar bill at exactly $1. In this case, the good for auction - the dollar bill - is said
to have a private value: each agent values it differently.
A third type of valuation is correlated value: in such a setting, an agent's valuation of the good depends partly on private factors, and partly on other agents' valuations of it.
valuation of it. An example might be where an agent was bidding for a painting
that it liked, but wanted to keep open the option of later selling the painting. In
this case, the amount you would be willing to pay would depend partly on how
much you liked it, but also partly on how much you believed other agents might
be willing to pay for it if you put it up for auction later.
Let us turn now to consider some of the dimensions along which auction pro-
tocols may vary. The first is that of winner determination: who gets the good that
the bidders are bidding for. In the auctions with which we are most familiar, the
answer to this question is probably self-evident: the agent that bids the most is
allocated the good. Such protocols are known as first-price auctions. This is not
the only possibility, however. A second possibility is to allocate the good to the
agent that bid the highest, but this agent pays only the amount of the second
highest bid. Such auctions are known as second-price auctions.
At first sight, it may seem bizarre that there are any settings in which a second-
price auction is desirable, as this implies that the auctioneer does not get as much
for the good as it could do. However, we shall see below that there are indeed some
settings in which a second-price auction is desirable.
The second dimension along which auction protocols can vary is whether or
not the bids made by the agents are known to each other. If every agent can see
what every other agent is bidding (the terminology is that the bids are common
knowledge), then the auction is said to be open cry. If the agents are not able to
determine the bids made by other agents, then the auction is said to be a sealed-bid
auction.
A third dimension is the mechanism by which bidding proceeds. The simplest
possibility is to have a single round of bidding, after which the auctioneer allo-
cates the good to the winner. Such auctions are known as one shot. The second
possibility is that the price starts low (often at a reservation price) and successive
bids are for increasingly large amounts. Such auctions are known as ascending.
The alternative - descending - is for the auctioneer to start off with a high value,
and to decrease the price in successive rounds.
English auctions
English auctions are the most commonly known type of auction, made famous by such auction houses as Sotheby's. English auctions are first-price, open cry, ascending auctions:
the auctioneer starts off by suggesting a reservation price for the good (which
may be 0) - if no agent is willing to bid more than the reservation price, then
the good is allocated to the auctioneer for this amount;
bids are then invited from agents, who must bid more than the current high-
est bid - all agents can see the bids being made, and are able to participate
in the bidding process if they so desire;
when no agent is willing to raise the bid, then the good is allocated to the
agent that has made the current highest bid, and the price they pay for the
good is the amount of this bid.
What strategy should an agent use to bid in English auctions? It turns out that
the dominant strategy is for an agent to successively bid a small amount more
than the current highest bid until the bid price reaches their current valuation,
and then to withdraw.
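The following sketch simulates an English auction in which every bidder follows this dominant strategy; the valuations, the reservation price, and the bid increment are illustrative values of my own.

# An English auction in which every bidder follows the dominant strategy described
# above: bid a small increment above the current highest bid until the price would
# exceed your private valuation, then drop out. Valuations and increment are illustrative.

def english_auction(valuations, reservation_price=0, increment=1):
    """valuations maps bidder -> private valuation; returns (winner, price paid)."""
    price, winner = reservation_price, None
    someone_bid = True
    while someone_bid:
        someone_bid = False
        for bidder, valuation in valuations.items():
            if bidder != winner and price + increment <= valuation:
                price += increment       # outbid the current highest bid
                winner = bidder
                someone_bid = True
    return winner, price

print(english_auction({'i': 12, 'j': 9, 'k': 5}))   # ('i', 10)

Note that the bidder with the highest valuation wins, but pays a price close to the second-highest valuation, a point worth bearing in mind when expected revenue is discussed below.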
Simple though English auctions are, it turns out that they have some interesting
properties. One interesting feature of English auctions arises when there is uncer-
tainty about the true value of the good being auctioned. For example, suppose an
auctioneer is selling some land to agents that want to exploit it for its mineral
resources, and that there is limited geological information available about this
land. None of the agents thus knows exactly what the land is worth. Suppose now
that the agents engage in an English auction to obtain the land, each using the
dominant strategy described above. When the auction is over, should the winner
feel happy that they have obtained the land for less than or equal to their private
valuation? Or should they feel worried because no other agent valued the land
so highly? This situation, where the winner is the one who overvalues the good
on offer, is known as the winner's curse. Its occurrence is not limited to English
auctions, but occurs most frequently in these.
Dutch auctions
Dutch auctions are examples of open-cry descending auctions:
the auctioneer starts out offering the good at some artificially high value
(above the expected value of any bidder's valuation of it);
the auctioneer then continually lowers the offer price of the good by some
small value, until some agent makes a bid for the good which is equal to the
current offer price;
the good is then allocated to the agent that made the offer.
Notice that Dutch auctions are also susceptible to the winner's curse. There is no
dominant strategy for Dutch auctions in general.
Vickrey auctions
The next type of auction is the most unusual and perhaps most counterintuitive
of all the auction types we shall consider. Vickrey auctions are second-price sealed-
bid auctions. This means that there is a single negotiation round, during which
each bidder submits a single bid; bidders do not get to see the bids made by other
agents. The good is awarded to the agent that made the highest bid; however, the price this agent pays is not the price of the highest bid, but the price of the second highest bid. Thus if the highest bid was made by agent i, who bid $9, and the second highest bid was by agent j, who bid $8, then agent i would win the auction and be allocated the good, but agent i would only pay $8.
Why would one even consider using Vickrey auctions? The answer is that Vick-
rey auctions make truth telling the dominant strategy: a bidder's dominant strat-
egy in a private value Vickrey auction is to bid his true valuation. Consider why
this is.
Suppose that you bid more than your true valuation. In this case, you may
be awarded the good, but you run the risk of being awarded the good but
at more than the amount of your private valuation. If you win in such a
circumstance, then you make a loss (since you paid more than you believed
the good was worth).
Suppose you bid less than your true valuation. In this case, note that you
stand less chance of winning than if you had bid your true valuation. But,
even if you do win, the amount you pay will not have been affected by the
fact that you bid less than your true valuation, because you will pay the price
of the second highest bid.
Thus the best thing to do in a Vickrey auction is to bid truthfully: to bid your private valuation - no more and no less.
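The following sketch implements a Vickrey auction and checks the truth-telling argument numerically for one (invented) configuration of bids: given what the other agents bid, bidding your true valuation is never worse than over- or under-bidding.

# A Vickrey (second-price, sealed-bid) auction, plus a numerical check of the
# truth-telling argument: given the other agents' bids, bidding your true valuation
# is never worse than over- or under-bidding. The bid values are illustrative.

def vickrey(bids):
    """bids maps bidder -> bid; returns (winner, price = second-highest bid)."""
    ranked = sorted(bids, key=bids.get, reverse=True)
    return ranked[0], bids[ranked[1]]

def my_payoff(valuation, my_bid, other_bids):
    winner, price = vickrey({'me': my_bid, **other_bids})
    return valuation - price if winner == 'me' else 0

others = {'j': 8, 'k': 5}
for my_bid in (6, 9, 12):                  # under-bid, truthful bid ($9 valuation), over-bid
    print(my_bid, my_payoff(9, my_bid, others))
# 6 -> 0 (lose), 9 -> 1 (win, pay $8), 12 -> 1 (win, pay $8)

If agent j had instead bid $10, the truthful bid of $9 would simply lose (payoff 0), whereas the over-bid of $12 would win at a price of $10 and yield a negative payoff - exactly the risk of over-bidding described above.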
Because they make truth telling the dominant strategy, Vickrey auctions have
received a lot of attention in the multiagent systems literature (see Sandholm
(1999, p. 213) for references). However, they are not widely used in human auc-
tions. There are several reasons for this, but perhaps the most important is that
humans frequently find the Vickrey mechanism hard to understand, because at
first sight it seems so counterintuitive. In terms of the desirable attributes that
we discussed above, it is not simple for humans to understand.
Note that Vickrey auctions make a certain kind of antisocial behaviour possible. Suppose
you want some good and your private valuation is $90, but you know that some
other agent wants it and values it at $100. As truth telling is the dominant strategy,
you can do no better than bid $90; your opponent bids $100, is awarded the good,
but pays only $90. Well, maybe you are not too happy about this: maybe you would
like to 'punish' your successful opponent. How can you do this? Suppose you bid
$99 instead of $90. Then you still lose the good to your opponent - but he pays $9 more than he would have done if you had bid truthfully. To make this work, of course, you
have to be very confident about what your opponent will bid - you do not want
to bid $99 only to discover that your opponent bid $95, and you were left with a
good that cost $5 more than your private valuation. This kind of behaviour occurs
in commercial situations, where one company may not be able to compete directly
with another company, but uses their position to try to force the opposition into
bankruptcy.
Expected revenue
There are several issues that should be mentioned relating to the types of auctions
discussed above. The first is that of expected revenue. If you are an auctioneer,
then as mentioned above, your overriding consideration will in all likelihood be
to maximize your revenue: you want an auction protocol that will get you the
highest possible price for the good on offer. You may well not be concerned with
whether or not agents tell the truth, or whether they are afflicted by the winner's
curse. It may seem that some protocols - Vickrey's mechanism in particular - do
not encourage this. So, which should the auctioneer choose?
For private value auctions, the answer depends partly on the attitude to risk of
both auctioneers and bidders (Sandholm, 1999, p. 214).
For risk-neutral bidders, the expected revenue to the auctioneer is provably
identical in all four types of auctions discussed above (under certain simple
assumptions). That is, the auctioneer can expect on average to get the same
revenue for the good using all of these types of auction.
For risk-averse bidders (i.e. bidders that would prefer to get the good even
if they paid slightly more for it than their private valuation), Dutch and
first-price sealed-bid protocols lead to higher expected revenue for the auc-
tioneer. This is because in these protocols, a risk-averse agent can 'insure'
himself by bidding slightly more for the good than would be offered by a
risk-neutral bidder.
Risk-averse auctioneers, however, do better with Vickrey or English auctions.
Note that these results should be treated very carefully. For example, the first
result, relating to the revenue equivalence of auctions given risk-neutral bidders,
depends critically on the fact that bidders really do have private valuations. In
choosing an appropriate protocol, it is therefore critical to ensure that the prop-
erties of the auction scenario - and the bidders - are understood correctly.
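As an illustration of the first of these results, the following Monte Carlo sketch compares the auctioneer's average revenue under second-price and first-price sealed-bid rules, for risk-neutral bidders with independent private valuations drawn uniformly from [0, 1]. The equilibrium bidding rule used for the first-price auction - each bidder bids (n - 1)/n times its valuation in this setting - is a standard textbook result that is assumed here rather than derived.

# A Monte Carlo illustration of revenue equivalence for risk-neutral bidders with
# independent private valuations drawn uniformly from [0, 1]. Second-price bidders
# bid truthfully; first-price bidders use the standard equilibrium rule of bidding
# (n - 1)/n times their valuation (assumed, not derived here). Both averages should
# come out close to (n - 1)/(n + 1).
import random

def average_revenues(n_bidders=5, trials=100_000):
    first_total = second_total = 0.0
    for _ in range(trials):
        valuations = sorted(random.random() for _ in range(n_bidders))
        second_total += valuations[-2]                                # winner pays second-highest value
        first_total += (n_bidders - 1) / n_bidders * valuations[-1]   # winner pays its own shaded bid
    return first_total / trials, second_total / trials

print(average_revenues())    # both close to 4/6 = 0.667 for five bidders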
Counterspeculation
Before we leave auctions, there is at least one other issue worth mentioning: that of counterspeculation. This is the process of a bidder engaging in an activity in order to obtain information either about the true value of the good on offer, or about the valuations of other bidders. Clearly, if counterspeculation was free (i.e. it did not cost anything in terms of time or money) and accurate (i.e. counterspeculation
would accurately reduce an agent's uncertainty either about the true value of the
good or the value placed on it by other bidders), then every agent would engage in
it at every opportunity. However, in most settings, counterspeculation is not free:
it may have a time cost and a monetary cost. The time cost will matter in auction
settings (e.g. English or Dutch) that depend heavily on the time at which a bid is
made. Similarly, investing money in counterspeculation will only be worth it if, as a result, the bidder can expect to be no worse off than if it did not counterspeculate. In deciding whether to counterspeculate, there is clearly a trade-off to be made, balancing the potential gains of counterspeculation against the costs (money and time) that it will entail. (It is worth mentioning that counterspeculation can be thought of as a kind of meta-level reasoning, and the nature of these trade-offs is thus very similar to that of the trade-offs in the practical reasoning agents discussed in earlier chapters.)
Negotiation
Auctions are very useful techniques for allocating goods to agents. However,
they are too simple for many settings: they are only concerned with the allocation
of goods. For more general settings, where agents must reach agreements on mat-
ters of mutual interest, richer techniques for reaching agreements are required.
Negotiation is the generic name given to such techniques. In this section, we will
consider some negotiation techniques that have been proposed for use by artifi-
cial agents - we will focus on the work of Rosenschein and Zlotkin (1994). One
of the most important contributions of their work was to introduce a distinction
between different types of negotiation domain: in particular, they distinguished
between task-oriented domains and worth-oriented domains.
Before we start to discuss this work, however, it is worth saying a few words
about negotiation techniques in general. In general, any negotiation setting will
have four different components.
A negotiation set, which represents the space of possible proposals that
agents can make.
A protocol, which defines the legal proposals that agents can make, as a
function of prior negotiation history.
A collection of strategies, one for each agent, which determine what propos-
als the agents will make. Usually, the strategy that an agent plays is private:
the fact that an agent is using a particular strategy is not generally visible to
other negotiation participants (although most negotiation settings are 'open
cry', in the sense that the actual proposals that are made are seen by all par-
ticipants).
A rule that determines when a deal has been struck, and what this agreement deal is.
Negotiation usually proceeds in a series of rounds, with every agent making a
proposal at every round. The proposals that agents make are defined by their
strategy, must be drawn from the negotiation set, and must be legal, as defined
by the protocol. If agreement is reached, as defined by the agreement rule, then
negotiation terminates with the agreement deal.
These four parameters lead to an extremely rich and complex environment for
analysis.
The first attribute that may complicate negotiation is where multiple issues are
involved. An example of a single-issue negotiation scenario might be where two
agents were negotiating only the price of a particular good for sale. In such a
scenario, the preferences of the agents are symmetric, in that a deal which is more
preferred from one agent's point of view is guaranteed to be less preferred from
the other's point of view, and vice versa. Such symmetric scenarios are simple to
analyse because it is always obvious what represents a concession: in order for
the seller to concede, he must lower the price of his proposal, while for the buyer
to concede, he must raise the price of his proposal. In multiple-issue negotiation
scenarios, agents negotiate over not just the value of a single attribute, but over
the values of multiple attributes, which may be interrelated. For example, when
buying a car, price is not the only issue to be negotiated (although it may be
the dominant one). In addition, the buyer might be interested in the length of
the guarantee, the terms of after-sales service, the extras that might be included
such as air conditioning, stereos, and so on. In multiple-issue negotiations, it is
usually much less obvious what represents a true concession: it is not simply the
case that all attribute values must be either increased or decreased. (Salesmen in
general, and car salesmen in particular, often exploit this fact during negotiation
by making 'concessions' that are in fact no such thing.)
Multiple attributes also lead to an exponential growth in the space of possible deals. Let us take an example of a domain in which agents are negotiating over the value of n Boolean variables, v1, ..., vn. A deal in such a setting consists of an assignment of either true or false to each variable vi. Obviously, there are 2^n possible deals in such a domain. This means that, in attempting to decide what proposal to make next, it will be entirely unfeasible for an agent to explicitly consider every possible deal in domains of moderate size. Most negotiation domains are, of course, much more complex than this. For example, agents may need to negotiate about the value of n attributes, where each attribute can have m possible values, leading to a set of m^n possible deals. Worse, the objects of negotiation may themselves be arbitrarily complex.
For these reasons, most attempts to automate the negotiation process have
focused on rather simple settings. Single-issue, symmetric, one-to-one negotia-
tion is the most commonly analysed, and it is on such settings that I will mainly
focus.
Task-oriented domains
The first type of negotiation domains we shall consider in detail are the task-
oriented domains of Rosenschein and Zlotkin (1994, pp. 29-52). Consider the fol-
lowing example.
Imagine that you have three children, each of whom needs to be delivered to a different school each morning. Your neighbour has four children, and also needs to take them to school. Delivery of each child can be modelled as an indivisible task. You and your neighbour can discuss the situation, and come to an agreement that it is better for both of you (for example, by carrying the other's child to a shared destination, saving him the trip). There is no concern about being able to achieve your task by yourself. The worst that can happen is that you and your neighbour will not come to an agreement about setting up a car pool, in which case you are no worse off than if you were alone. You can only benefit (or do no worse) from your neighbour's tasks.
Assume, though, that one of my children and one of my neighbour's children both go to the same school (that is, the cost of carrying out these two deliveries, or two tasks, is the same as the cost of carrying out one of them). It obviously makes sense for both children to be taken together, and only my neighbour or I will need to make the trip to carry out both tasks.
What kinds of agreement might we reach? We might decide that I will take the children on even days each month, and my neighbour will take them on odd days; perhaps, if there are other children involved, we might have my neighbour always take those two specific children, while I am responsible for the rest of the children.
(Rosenschein and Zlotkin, 1994, p. 29)
To formalize this kind of situation, Rosenschein and Zlotkin defined the notion of a task-oriented domain (TOD). A task-oriented domain is a triple
⟨T, Ag, c⟩,
where
T is the (finite) set of all possible tasks;
Ag = {1, ..., n} is the (finite) set of negotiation participant agents;
c : ℘(T) → R+ is a function which defines the cost of executing each subset of tasks: the cost of executing any set of tasks is a positive real number.
The cost function must satisfy two constraints. First, it must be monotonic. Intuitively, this means that adding tasks never decreases the cost. Formally, this constraint is defined as follows:
If T1, T2 ⊆ T are sets of tasks such that T1 ⊆ T2, then c(T1) ≤ c(T2).
The second constraint is that the cost of doing nothing is zero, i.e. c(∅) = 0.
An encounter within a task-oriented domain ⟨T, Ag, c⟩ occurs when the agents
Ag are assigned tasks to perform from the set T. Intuitively, when an encounter
occurs, there is potential for the agents to reach a deal by reallocating the tasks
amongst themselves; as we saw in the informal car pool example above, by reallo-
cating the tasks, the agents can potentially do better than if they simply performed
their tasks themselves. Formally, an encounter in a TOD ⟨T, Ag, c⟩ is a collection of tasks
⟨T1, ..., Tn⟩,
where, for all i, we have that i ∈ Ag and Ti ⊆ T. Notice that a TOD together with an encounter in this TOD is a type of task environment, of the kind we saw in Chapter 2. It defines both the characteristics of the environment in which the agent must operate, together with a task (or rather, set of tasks), which the agent must carry out in the environment.
Hereafter, we will restrict our attention to one-to-one negotiation scenarios, as discussed above: we will assume the two agents in question are {1, 2}. Now, given an encounter ⟨T1, T2⟩, a deal will be very similar to an encounter: it will be an allocation of the tasks T1 ∪ T2 to the agents 1 and 2. Formally, a pure deal is a pair ⟨D1, D2⟩, where D1 ∪ D2 = T1 ∪ T2. The semantics of a deal ⟨D1, D2⟩ is that agent 1 is committed to performing tasks D1 and agent 2 is committed to performing tasks D2.
The cost to agent i of a deal δ = ⟨D1, D2⟩ is defined to be c(Di), and will be denoted cost_i(δ). The utility of a deal δ to an agent i is the difference between the cost of agent i doing the tasks Ti that it was originally assigned in the encounter, and the cost cost_i(δ) of the tasks it is assigned in δ:
utility_i(δ) = c(Ti) - cost_i(δ).
Thus the utility of a deal represents how much the agent has to gain from the deal; if the utility is negative, then the agent is worse off than if it simply performed the tasks it was originally allocated in the encounter.
What happens if the agents fail to reach agreement? In this case, they must perform the tasks ⟨T1, T2⟩ that they were originally allocated. This is the intuition behind the terminology that the conflict deal, denoted Θ, is the deal ⟨T1, T2⟩ consisting of the tasks originally allocated.
The notion of dominance, as discussed in the preceding chapter, can be easily extended to deals. A deal δ1 is said to dominate deal δ2 (written δ1 ≻ δ2) if and only if the following hold.
(1) Deal δ1 is at least as good for every agent as δ2: for all agents k ∈ {1, 2}, utility_k(δ1) ≥ utility_k(δ2).
(2) Deal δ1 is better for some agent than δ2: for some agent k ∈ {1, 2}, utility_k(δ1) > utility_k(δ2).
If deal δ1 dominates another deal δ2, then it should be clear to all participants that δ1 is better than δ2. That is, all 'reasonable' participants would prefer δ1 to δ2.
[Figure 7.1 The space of possible deals, plotted by utility to agent i (vertical axis) and utility to agent j (horizontal axis); the deals on the line from B to C form the negotiation set, and the conflict deal is marked at E.]
A deal δ1 is said to weakly dominate δ2 (written δ1 ≽ δ2) if at least the first condition holds.
A deal that is not dominated by any other deal is said to be pareto optimal. Formally, a deal δ is pareto optimal if there is no deal δ' such that δ' ≻ δ. If a deal is pareto optimal, then there is no alternative deal that will improve the lot of one agent except at some cost to another agent (who presumably would not be happy about it!). If a deal is not pareto optimal, however, then the agents could improve the lot of at least one agent, without making anyone else worse off.
A deal δ is said to be individual rational if it weakly dominates the conflict deal. If a deal is not individual rational, then at least one agent can do better by simply performing the tasks it was originally allocated - hence it will prefer the conflict deal. Formally, deal δ is individual rational if and only if δ ≽ Θ.
We are now in a position to define the space of possible proposals that agents can make. The negotiation set consists of the set of deals that are (i) individual rational, and (ii) pareto optimal. The intuition behind the first constraint is that there is no purpose in proposing a deal that is less preferable to some agent than the conflict deal (as this agent would prefer conflict); the intuition behind the second condition is that there is no point in making a proposal if an alternative proposal could make some agent better off at nobody's expense.
The intuition behind the negotiation set is illustrated in Figure 7.1. In this graph,
the space of all conceivable deals is plotted as points on a graph, with the utility to
i on the y-axis, and utility to j on the x-axis. The shaded space enclosed by points
A, B, C, and D contains the space of all possible deals. (For convenience, I have
illustrated this space as a circle, although of course it need not be.) The conflict
deal is marked at point E. It follows that all deals to the left of the line B-D will
not be individual rational for agent j (because j could do better with the conflict
deal). For the same reason, all deals below line A-C will not be individual rational
for agent i. This means that the negotiation set contains deals in the shaded area
B-C-E. However, not all deals in this space will be pareto optimal. In fact, the
only pareto optimal deals that are also individual rational for both agents will lie
on the line B-C. Thus the deals that lie on this line are those in the negotiation
set. Typically, agent i will start negotiation by proposing the deal at point B, and
agent j will start by proposing the deal at point C.
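The following sketch puts the machinery of this section together for a toy two-agent encounter loosely modelled on the car pool example: it enumerates the pure deals over the tasks in the encounter and keeps those that are individual rational and pareto optimal, i.e. the negotiation set. The tasks and the cost function are invented for illustration.

# The negotiation set for a toy two-agent task-oriented domain loosely modelled on
# the car pool example: 'school_a' is a delivery both agents were assigned, and
# 'school_b' was assigned to agent 2 only. The cost function counts one trip per school.
from itertools import combinations

encounter = ({'school_a'}, {'school_a', 'school_b'})     # (T1, T2)

def cost(task_set):
    return len(task_set)                                  # monotonic, and cost of nothing is 0

def utility(deal, agent):                                 # agent 0 stands for agent 1, etc.
    return cost(encounter[agent]) - cost(deal[agent])

def all_pure_deals():
    """Every way of splitting T1 u T2 between the two agents."""
    pool = sorted(encounter[0] | encounter[1])
    for r in range(len(pool) + 1):
        for part in combinations(pool, r):
            yield (set(part), set(pool) - set(part))

def dominates(d1, d2):
    u1 = [utility(d1, a) for a in (0, 1)]
    u2 = [utility(d2, a) for a in (0, 1)]
    return all(x >= y for x, y in zip(u1, u2)) and any(x > y for x, y in zip(u1, u2))

deals = list(all_pure_deals())
conflict = encounter                                      # the conflict deal
negotiation_set = [
    d for d in deals
    if all(utility(d, a) >= utility(conflict, a) for a in (0, 1))    # individual rational
    and not any(dominates(other, d) for other in deals)]             # pareto optimal

for d in negotiation_set:
    print(d, 'utilities:', [utility(d, a) for a in (0, 1)])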
In the monotonic concession protocol, negotiation proceeds in a series of rounds in which the agents' proposals may only stand still or become more favourable to the other party; if neither agent concedes, negotiation ends with the conflict deal. The key questions are therefore which agent should concede on a given round, and by how much. The Zeuthen strategy answers the first question by measuring each agent's willingness to risk conflict. Agent i's willingness to risk conflict on round t, written risk_i^t, is defined as follows (where δ_i^t is i's own proposal on round t, and δ_j^t is j's):
risk_i^t = 1, if utility_i(δ_i^t) = 0;
risk_i^t = (utility_i(δ_i^t) - utility_i(δ_j^t)) / utility_i(δ_i^t), otherwise.
The idea of assigning risk the value 1 if utility_i(δ_i^t) = 0 is that in this case the utility to i of its current proposal is the same as that of the conflict deal; in this case, i is completely willing to risk conflict by not conceding.
So, the Zeuthen strategy proposes that the agent to concede on round t of
negotiation should be the one with the smaller value of risk.
The next question to answer is how much should be conceded? The simple
answer to this question is just enough. If an agent does not concede enough, then
on the next round, the balance of risk will indicate that it still has most to lose
from conflict, and so should concede again. This is clearly inefficient. On the other
hand, if an agent concedes too much, then it 'wastes' some of its utility. Thus an
agent should make the smallest concession necessary to change the balance of
risk - so that on the next round, the other agent will concede.
There is one final refinement that must be made to the strategy. Suppose that,
on the final round of negotiation, both agents have equal risk. Hence, according
to the strategy, both should concede. But, knowing this, one agent can 'defect'
(cf. discussions in the preceding chapter) by not conceding, and so benefit from
the other. If both agents behave in this way, then conflict will arise, and no deal
will be struck. We extend the strategy by an agent 'flipping a coin' to decide who
should concede if ever an equal risk situation is reached on the last negotiation
step.
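A minimal sketch of the risk calculation and concession rule just described is given below; the utility numbers in the example are invented, and in a real implementation they would come from the task-oriented domain's cost function.

# The Zeuthen risk calculation and concession rule. The utility arguments would, in a
# task-oriented domain, be computed from the cost function; here they are plain numbers.

def risk(u_own_proposal, u_other_proposal):
    """Willingness to risk conflict: the fraction of current utility lost by conceding."""
    if u_own_proposal == 0:
        return 1.0                    # own proposal is no better than conflict: nothing to lose
    return (u_own_proposal - u_other_proposal) / u_own_proposal

def who_concedes(u_i_own, u_i_other, u_j_own, u_j_other):
    """Which agent should concede this round, according to the Zeuthen strategy?"""
    risk_i, risk_j = risk(u_i_own, u_i_other), risk(u_j_own, u_j_other)
    if risk_i == risk_j:
        return 'flip a coin'          # the equal-risk refinement discussed above
    return 'i' if risk_i < risk_j else 'j'

# Example: i values its own proposal at 10 and j's at 4; j values its own at 8 and i's at 5.
print(who_concedes(10, 4, 8, 5))      # risk_i = 0.6, risk_j = 0.375, so agent j concedes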
Now, given the protocol and the associated strategy, to what extent does it
satisfy the desirable criteria for mechanisms discussed at the opening of this
chapter? While the protocol does not guarantee success, it does guarantee termi-
nation; it does not guarantee to maximize social welfare, but it does guarantee
that if agreement is reached, then this agreement will be pareto optimal; it is indi-
vidual rational (if agreement is reached, then this agreement will be better for
both agents than the default, conflict deal); and clearly there is no single point of
failure - it does not require a central arbiter to monitor negotiation. With respect
to simplicity and stability, a few more words are necessary. As we noted above, the
space of possible deals may be exponential in the number of tasks allocated. For
example, in order to execute his strategy, an agent may need to carry out O(2^|Ti|)
computations of the cost function (Rosenschein and Zlotkin, 1994, p. 49). This is
clearly not going to be feasible in practice for any realistic number of tasks.
With respect to stability, we here note that the Zeuthen strategy (with the equal
risk rule) is in Nash equilibrium, as discussed in the previous chapter. Thus, under
the assumption that one agent is using the strategy the other can do no better
than use it himself.
This is of particular interest to the designer of automated agents. It
does away with any need for secrecy on the part of the programmer.
An agent's strategy can be publicly known, and no other agent designer
can exploit the information by choosing a different strategy. In fact, it
is desirable that the strategy be known, to avoid inadvertent conflicts.
(Rosenschein and Zlotkin, 1994, p. 46)
An interesting issue arises when one considers that agents need not necessarily
be truthful when declaring their tasks in an encounter. By so doing, they can
subvert the negotiation process. There are two obvious ways in which an agent
can be deceitful in such domains as follows.
Phantom and decoy tasks. Perhaps the most obvious way in which an agent can
deceive for personal advantage in task-oriented domains is by pretending to
have been allocated a task that it has not been allocated. These are called phan-
tom tasks. Returning to the car pool example, above, one might pretend that
some additional task was necessary by saying that one had to collect a relative
from a train station, or visit the doctor at the time when the children needed
to be delivered to school. In this way, the apparent structure of the encounter
is changed, so that outcome is in favour of the deceitful agent. The obvious
response to this is to ensure that the tasks an agent has been assigned to carry
out are verifiable by all negotiation participants. In some circumstances, it is
possible for an agent to produce an artificial task when asked for it. Detection
of such decoy tasks is essentially impossible, making it hard to be sure that
deception will not occur in such domains. Whether or not introducing artificial
tasks is beneficial to an agent will depend on the particular TOD in question.
Hidden tasks. Perhaps counterintuitively, it is possible for an agent to benefit
from deception by hiding tasks that it has to perform. Again with respect to
the car pool example, agent 1 might have two children to take to schools that
are close to one another. It takes one hour for the agent to visit both schools,
but only 45 minutes to visit just one. If the neighbour, agent 2, has to take a
child to one of these schools, then by hiding his task of going to one of these
schools, agent 1 can perhaps get agent 2 to take his child, thus improving his
overall utility slightly.
Before we leave task-oriented domains, there are some final comments worth
making. First, the attractiveness of the monotonic concession protocol and
Zeuthen strategy is obvious. They closely mirror the way in which human negoti-
ation seems to work - the assessment of risk in particular is appealing. The Nash
equilibrium status of the (extended) Zeuthen strategy is also attractive. However,
the computational complexity of the approach is a drawback. Moreover, exten-
sions to n > 2 agent negotiation scenarios are not obvious - for the reasons
discussed earlier, the technique works best with symmetric preferences. Never-
theless, variations of the monotonic concession protocol are in wide-scale use,
and the simplicity of the protocol means that many variations on it have been
developed.
Worth-oriented domains
We saw in earlier chapters that there are different ways of defining the task that an
agent has to achieve. In task-oriented domains, the task(s) are explicitly defined
in the encounter: each agent is given a set of tasks to accomplish, associated
with which there is a cost. An agent attempts to minimize the overall cost of its tasks. In worth-oriented domains (WODs), by contrast, an agent's objectives are specified indirectly, via a worth function over a set E of possible environment states. Roughly, an encounter in a worth-oriented domain specifies
e ∈ E, the initial state of the environment; and
W : E × Ag → R, a worth function, which assigns to each environment state e ∈ E and each agent i ∈ Ag a real number W(e, i) which represents the value, or worth, to agent i of state e.
I write plans using the notation j : e1 → e2; the intuitive reading of this is that the (joint) plan j can be executed in state e1, and when executed in this state, will lead to state e2.
Suppose for the sake of argument that agent i operates alone in an environment that is in initial state e0. What should this agent do? In this case, it does not need to negotiate - it should simply pick the plan j*_opt such that j*_opt can be executed in state e0 and, when executed, will bring about a state that maximizes the worth for agent i. Formally, j*_opt will satisfy the following equation (Rosenschein and Zlotkin, 1994, p. 156):
j*_opt = arg max over plans j : e0 → e of W(e, i) - c(j, i),
where c(j, i) denotes the cost of plan j to agent i.
Operating alone, the utility that i obtains by executing the plan j*_opt represents the best it can do. Turning to multiagent encounters, it may at first seem that an agent can do no better than executing j*_opt, but of course this is not true. An agent can benefit from the presence of other agents, by being able to execute joint plans - and hence bring about world states - that it would be unable to execute alone. If there is no joint plan that improves on j*_opt for agent i, and there is no interaction between different plans, then negotiation is not individual rational: i may as well work on its own, and execute j*_opt. How might plans interact? Suppose my individual optimal plan for tomorrow involves using the family car to drive to the golf course; my wife's individual optimal plan involves using the car to go elsewhere. In this case, our individual plans interact with one another because there is no way they can both be successfully executed. If plans interfere with one another, then agents have no choice but to negotiate.
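The following sketch illustrates the j*_opt calculation for a single agent acting alone; the states, plans, worth values, and costs are invented for illustration.

# Choosing the individually optimal plan j*_opt in a worth-oriented domain: among the
# plans executable in the initial state, maximize the worth of the resulting state minus
# the plan's cost. States, plans, worths, and costs below are invented for illustration.

worth = {'at_golf_course': 10, 'at_home': 2, 'at_office': 4}    # W(e, i) for our agent i

plans = [                                 # (name, start state, end state, cost to i)
    ('drive_to_golf',  'at_home', 'at_golf_course', 3),
    ('stay_put',       'at_home', 'at_home',        0),
    ('walk_to_office', 'at_home', 'at_office',      1),
]

def best_individual_plan(initial_state):
    executable = [p for p in plans if p[1] == initial_state]
    return max(executable, key=lambda p: worth[p[2]] - p[3])    # arg max of W(e, i) - c(j, i)

print(best_individual_plan('at_home'))    # drive_to_golf: worth 10 - cost 3 = 7 beats the rest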
It may be fruitful to consider in more detail exactly what agents are negotiating over in WODs. Unlike TODs, agents negotiating in WODs are not negotiating over a single issue: they are negotiating over both the state that they wish to bring about (which will have a different value for different agents), and over the means by which they will reach this state.
Argumentation
The game-theoretic approaches to reaching agreement that we have seen so far
in this chapter have a number of advantages, perhaps the most important of
which are that we can prove some desirable properties of the negotiation protocols
we have considered. However, there are several disadvantages to such styles of
negotiation (Jennings et al., 2001) as follows.
Positions cannot be justified. When humans negotiate, they justify their negoti-
ation stances. For example, if you attempt to sell a car to me, you may justify
the price with respect to a list of some of the features that the car has - for
example, a particularly powerful engine. In turn, I may justify my proposal for
a lower price by pointing out that I intend to use the car for short inner-city
journeys, rendering a powerful engine less useful. More generally, negotiating
using a particular game-theoretic technique may make it very hard to under-
stand how an agreement was reached. This issue is particularly important if we
intend to delegate tasks such as buying and selling goods to agents. To see why,
suppose you delegate the task of buying a car to your agent: after some time,
the agent returns, having purchased a car using your credit card. Reasonably
enough, you want to know how agreement was reached: Why did the agent pay
this much for this car? But if the agent cannot explain how the agreement was
reached in terms that you can easily understand and relate to, then you may find
the agreement rather hard to accept. Notice that simply pointing to a sequence
of complex equations will not count as an explanation for most people; nor will
the claim that 'the agreement was the best for you'. If agents are to act on our
behalf in such scenarios, then we will need to be able to trust and relate to the
decisions they make.
Positions cannot be changed. Game theory tends to assume that an agent's util-
ity function is fixed and immutable: it does not change as we negotiate. It could
be argued that from the point of view of an objective, external, omniscient
observer, this is in one sense true. However, from our subjective, personal point
of view, our preferences certainly do change when we negotiate. Returning to
the car-buying example, when I set out to buy a car, I may initially decide that I
want a car with an electric sun roof. However, if I subsequently read that elec-
tric sun roofs are unreliable and tend to leak, then this might well change my
preferences.
These limitations of game-theoretic negotiation have led to the emergence of
argumentation-based negotiation (Sycara, 1989b; Parsons et al., 1998). Put crudely,
argumentation in a multiagent context is a process by which one agent attempts
to convince another of the truth (or falsity) of some state of affairs. The process
involves agents putting forward arguments for and against propositions, together
with justifications for the acceptability of these arguments.
The philosopher Michael Gilbert suggests that if we consider argumentation
as it occurs between humans, we can identify at least four different modes of
argument (Gilbert, 1994) as follows.
(1) Logical mode. The logical mode of argumentation resembles mathematical
proof. It tends to be deductive in nature ('if you accept that A and that A implies B, then you must accept that B'). The logical mode is perhaps the paradigm
example of argumentation. It is the kind of argument that we generally expect
(or at least hope) to see in courts of law and scientific papers.
(2) Emotional mode. The emotional mode of argumentation occurs when appeals
are made to feelings, attitudes, and the like. An example is the 'how would you
feel if it happened to you' type of argument.
(3) Visceral mode. The visceral mode of argumentation is the physical, social
aspect of human argument. It occurs, for example, when one argumentation
participant stamps their feet to indicate the strength of their feeling.
(4) Kisceral mode. Finally, the kisceral mode of argumentation involves appeals
to the intuitive, mystical, or religious.
Of course, depending on the circumstances, we might not be inclined to accept
some of these modes of argument. In a court of law in most western societies, for
example, the emotional and kisceral modes of argumentation are not permitted.
Of course, this does not stop lawyers trying to use them: one of the roles of a
judge is to rule such arguments unacceptable when they occur. Other societies,
in contrast, explicitly allow for appeals to be made to religious beliefs in legal
settings. Similarly, while we might not expect to see arguments based on emotion
accepted in a court of law, we might be happy to permit them when arguing with
our children or spouse.
Logic-based argumentation
The logical mode of argumentation might be regarded as the 'purest' or 'most
rational' kind of argument. In this subsection, I introduce a system of argumenta-
tion based upon that proposed by Fox and colleagues (Fox et al., 1992; Krause et al., 1995). This system works by constructing a series of logical steps (arguments)
for and against propositions of interest. Because this closely mirrors the way
that human dialectic argumentation (Jowett, 1875) proceeds, this system forms
a promising basis for building a framework for dialectic argumentation by which
agents can negotiate (Parsons and Jennings, 1996).
In classical logic, an argument is a sequence of inferences leading to a conclusion: we write Δ ⊢ φ to mean that there is a sequence of inferences from premises Δ that will allow us to establish proposition φ. Consider the simple database Δ1, which expresses some very familiar information in a Prolog-like notation in which variables are capitalized and ground terms and predicate names start with small letters:

Δ1:  human(Socrates).
     human(X) ⊃ mortal(X).

The argument Δ1 ⊢ mortal(Socrates) may be correctly made from this database because mortal(Socrates) follows from Δ1 given the usual logical axioms and rules of inference of classical logic. Thus a correct argument simply yields a conclusion which in this case could be paraphrased 'mortal(Socrates) is true in the context of human(Socrates) and human(X) ⊃ mortal(X)'.
In the system of argumentation we adopt here, this traditional form of reason-
ing is extended by explicitly recording those propositions that are used in the
derivation. This makes it possible to assess the strength of a given argument by
examining the propositions on which it is based.
The basic form of arguments is as follows:
Database ⊢ (Sentence, Grounds),
where
Database is a (possibly inconsistent) set of logical formulae;
Sentence is a logical formula known as the conclusion; and
Grounds is a set of logical formulae such that:
(1) Grounds ⊆ Database; and
(2) Sentence can be proved from Grounds.
Defeat. Let (φ1, Γ1) and (φ2, Γ2) be arguments from some database Δ. The argument (φ2, Γ2) can be defeated in one of two ways. Firstly, (φ1, Γ1) rebuts (φ2, Γ2) if φ1 attacks φ2. Secondly, (φ1, Γ1) undercuts (φ2, Γ2) if φ1 attacks ψ for some ψ ∈ Γ2.
Here, attack is defined as follows.
Attack. For any two propositions φ and ψ, we say that φ attacks ψ if and only if φ = ¬ψ.
Consider the following set of formulae, which extend the example of Δ1 with information in common currency at the time of Plato:

human(Heracles)
father(Heracles, Zeus)
father(Apollo, Zeus)
divine(X) ⇒ ¬mortal(X)
father(X, Zeus) ⇒ divine(X)
¬(father(X, Zeus) ⇒ divine(X))

From this we can build the obvious argument, Arg1, about Heracles,

(mortal(Heracles),
 {human(Heracles), human(X) ⇒ mortal(X)}),

as well as a rebutting argument Arg2,

(¬mortal(Heracles),
 {father(Heracles, Zeus), father(X, Zeus) ⇒ divine(X),
  divine(X) ⇒ ¬mortal(X)}).

The second of these is undercut by Arg3:

(¬(father(X, Zeus) ⇒ divine(X)),
 {¬(father(X, Zeus) ⇒ divine(X))}).
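To make the rebut/undercut distinction concrete, here is a minimal Python sketch (an illustration of the definitions above, not something taken from the book): an argument is a (conclusion, grounds) pair over formula strings, and attack is modelled purely syntactically as negation, with 'not ' as an assumed prefix convention.

# Arguments as (conclusion, grounds) pairs over formula strings.
# Attack is purely syntactic here: phi attacks psi iff phi is the negation of psi.

def negate(formula):
    # Illustrative convention: negation is the prefix 'not '.
    return formula[4:] if formula.startswith("not ") else "not " + formula

def attacks(phi, psi):
    return phi == negate(psi)

def rebuts(arg1, arg2):
    # (phi1, grounds1) rebuts (phi2, grounds2) iff phi1 attacks the conclusion phi2.
    return attacks(arg1[0], arg2[0])

def undercuts(arg1, arg2):
    # (phi1, grounds1) undercuts (phi2, grounds2) iff phi1 attacks some member of grounds2.
    return any(attacks(arg1[0], g) for g in arg2[1])

# The Heracles example, with formulae written as plain strings.
arg1 = ("mortal(Heracles)",
        {"human(Heracles)", "human(X) => mortal(X)"})
arg2 = ("not mortal(Heracles)",
        {"father(Heracles, Zeus)", "father(X, Zeus) => divine(X)",
         "divine(X) => not mortal(X)"})
arg3 = ("not father(X, Zeus) => divine(X)",
        {"not father(X, Zeus) => divine(X)"})

print(rebuts(arg2, arg1))     # True: the two conclusions contradict one another
print(undercuts(arg3, arg2))  # True: arg3 attacks a formula in arg2's grounds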
The next step is to define an ordering over argument types, which approxi-
mately corresponds to increasing acceptability. The idea is that, when engaged in
argumentation, we intuitively recognize that some types of argument are more
'powerful' than others. For example, given database Δ = {p ⇒ q, p}, the arguments Arg1 = (p ∨ ¬p, ∅) and Arg2 = (q, {p ⇒ q, p}) are both acceptable members of A(Δ). However, it is generally accepted that Arg1 - a tautological argument - is stronger than Arg2, for the simple reason that it is not possible to construct a scenario in which the conclusion of Arg1 is false. Any agent that accepted classical propositional logic would have to accept Arg1 (but an agent
that only accepted intuitionistic propositional logic would not). In contrast, the
argument for the conclusion of Arg2 depends on two other propositions, both of
which could be questioned.
In fact, we can identify five classes of argument type, which we refer to as A1 to A5, respectively. In order of increasing acceptability, these are as follows.

A1 The class of all arguments that may be made from Δ.

A2 The class of all non-trivial arguments that may be made from Δ.

A3 The class of all arguments that may be made from Δ for which there are no rebutting arguments.

A4 The class of all arguments that may be made from Δ for which there are no undercutting arguments.

A5 The class of all tautological arguments that may be made from Δ.
There is an ordering over the acceptability classes: arguments in higher numbered classes are more acceptable than arguments in lower numbered classes. The intuition is that there is less reason
for thinking that there is something wrong with them - because, for instance,
there is no argument which rebuts them. The idea that an undercut attack is less
damaging than a rebutting attack is based on the notion that an undercut allows
for another, undefeated, supporting argument for the same conclusion. This is
common in the argumentation literature (see, for example, Krause et al., 1995).
In the previous example, any tautological argument - for instance (mortal(Heracles) ∨ ¬mortal(Heracles), ∅) - is in A5, while Arg1 and Arg2 are mutually rebutting and thus in A2, whereas Arg4,

(¬mortal(Apollo),
 {father(Apollo, Zeus), father(X, Zeus) ⇒ divine(X),
  divine(X) ⇒ ¬mortal(X)}),
is in A4. This logic-based model of argumentation has been used in argumentation-based negotiation systems (Parsons and Jennings, 1996; Parsons et al., 1998). The basic idea is as follows. You are attempting to negotiate with a peer over who will carry out a particular task. Then the idea is to argue for the other agent intending to carry this out, i.e. you attempt to convince the other agent of the acceptability of the argument that it should intend to carry out the task for you.
A dialogue has ended if there are no further moves possible. The winner of a
dialogue that has ended is the last agent to move. If agent 0 was the last agent to
move, then this means that agent 1 had no argument available to defeat 0's last
argument. If agent 1 was the last agent to move, then agent 0 had no argument
available to defeat 1's last argument. Viewed in this way, argument dialogues can
be seen as a game played between proposers and opponents of arguments.
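The 'last agent to move wins' rule can be sketched directly. In the following Python fragment (an illustration only), each agent is modelled as a function that, given the opponent's last argument, returns a defeating argument or None if it has no reply; the canned reply tables at the end are invented purely for demonstration.

# An argument dialogue as a two-player game: alternate moves until one agent
# cannot defeat the last argument; the last agent to move wins.

def run_dialogue(opening_argument, agent0, agent1):
    players = [agent0, agent1]
    last_argument = opening_argument
    mover = 0          # agent 0 made the opening move
    turn = 1           # agent 1 replies first
    while True:
        reply = players[turn](last_argument)
        if reply is None:      # no defeater available: the dialogue has ended
            return mover       # the last agent to move is the winner
        last_argument = reply
        mover, turn = turn, 1 - turn

# Toy usage with canned replies: agent 1 can defeat 'a' with 'b',
# agent 0 can defeat 'b' with 'c', and 'c' is left undefeated.
agent0 = lambda arg: {"b": "c"}.get(arg)
agent1 = lambda arg: {"a": "b"}.get(arg)
print(run_dialogue("a", agent0, agent1))   # 0: agent 0 wins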
Types of dialogue
Walton and Krabbe (1995, p. 66) suggest a typology of six different modes of dia-
logues, which are summarized in Table 7.1. The first (type 1) involves the 'canonical' form of argumentation, where one agent attempts to convince another of the truth of some statement.

Figure 7.2 (Determining the type of a dialogue) summarizes how a dialogue may be classified as persuasion, negotiation, eristics, inquiry, deliberation, or information seeking, according to whether there is a conflict, whether settlement is the goal, and whether the problem is theoretical.
Abstract argumentation
There is another, more abstract way of looking at arguments than the view we have
adopted so far. In this view, we are not concerned with the internal structure of
individual arguments, but rather with the overall structure of the argument. We
can model such an abstract argument system A as a pair (Dung, 1995)

A = ⟨X, →⟩,

where

X is a set of arguments (we are not concerned with exactly what members of X are); and

→ ⊆ X × X is a binary relation on the set of arguments, representing the notion of attack.

I write x → y as a shorthand for (x, y) ∈ →. The expression x → y may be read as

'argument x attacks argument y';

'x is a counter-example of y'; or

'x is an attacker of y'.
Notice that, for the purposes of abstract argument systems, we are not concerned
with the contents of the set X, nor are we concerned with 'where the attack relation
comes from'. Instead, we simply look at the overall structure of the argument.
Given an abstract argument system, the obvious question is when an argument
in it is considered 'safe' or 'acceptable'. Similarly important is the notion of a
set of arguments being a 'defendable position', where such a position intuitively
represents a set of arguments that are mutually defensive, and cannot be attacked.
Such a set of arguments is referred to as being admissible.
There are different ways of framing this notion, and I will present just one of them (from Vreeswijk and Prakken, 2000, p. 242). Given an abstract argument system A = ⟨X, →⟩, we have the following.

An argument x ∈ X is attacked by a set of arguments Y ⊆ X if at least one member of Y attacks x (i.e. if y → x for some y ∈ Y).
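Since the remainder of the definition falls outside the surviving text, the following Python sketch fills it in from the standard account in Dung (1995): a set of arguments is admissible if it is conflict-free and defends each of its members, and both properties can be checked by brute force on a small system. The three-argument example is invented for illustration.

# Brute-force admissibility check for an abstract argument system <X, attacks>.
from itertools import combinations

X = {"a", "b", "c"}
attacks = {("a", "b"), ("b", "c")}       # a attacks b, b attacks c (toy example)

def is_attacked_by(x, Y):
    # x is attacked by the set Y if at least one member of Y attacks x.
    return any((y, x) in attacks for y in Y)

def conflict_free(Y):
    # No member of Y attacks another member of Y.
    return not any((y1, y2) in attacks for y1 in Y for y2 in Y)

def defends(Y, x):
    # Y defends x: every attacker of x is itself attacked by Y.
    return all(is_attacked_by(z, Y) for (z, w) in attacks if w == x)

def admissible(Y):
    return conflict_free(Y) and all(defends(Y, x) for x in Y)

subsets = [set(c) for r in range(len(X) + 1) for c in combinations(X, r)]
print([Y for Y in subsets if admissible(Y)])
# set(), {'a'} and {'a', 'c'} are admissible; {'c'} alone is not, because its
# attacker b is not counter-attacked by {'c'}.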
(Figure: a goal graph for a labour-negotiation example, linking wages, fringe benefits, concessions, economic concessions, employment, satisfaction, automation, subcontracting, production cost, prices, plant efficiency and profits, with '+' and '-' labels indicating whether an increase in one goal helps or hinders another.)
Importance of wage-goal1 is 6 for union1
Searching company1 goal-graph...
Increase in wage-goal1 by company1 will result in
  increase in economic-concessions1, labour-cost1, production-cost1
Increase in wage-goal1 by company1 will result in
  decrease in profits1
To compensate, company1 can decrease fringe-benefits1,
  decrease employment1, increase plant-efficiency1,
  increase sales1
Only decrease fringe-benefits1, decrease employment1
  violate goals of union1
Importance of fringe-benefits1 is 4 for union1
Importance of employment1 is 8 for union1
Since importance of employment1 > importance of wage-goal1
One possible argument found
Exercises
(1) [Class discussion.]
Pick real-world examples of negotiation with which you are familiar (buying a second-
hand car or house, for example). For these, identify what represents a 'deal'. Is the deal
single attribute or multiple attribute? Is it a task-oriented domain or a worth-oriented
domain? Or neither? Is it two agent or n agent? What represents a concession in such a
domain? Is a particular protocol used when negotiating? What are the rules?
(2) [Level 1.]
Why are shills not a potential problem in Dutch, Vickrey, and first-price sealed-bid
auctions?
(3) [Level 2.]
With respect to the argument system in Figure 7.3, state with justification the status of the arguments that were not discussed in the text (i.e. a-q).
Communication
Communication has long been recognized as a topic of central importance in
computer science, and many formalisms have been developed for representing
the properties of communicating concurrent systems (Hoare, 1978; Milner, 1989).
Such formalisms have tended to focus on a number of key issues that arise when
dealing with systems that can interact with one another.
Perhaps the characteristic problem in communicating concurrent systems
research is that of synchronizing multiple processes, which was widely stud-
ied throughout the 1970s and 1980s (Ben-Ari, 1990). Essentially, two processes
(cf. agents) need to be synchronized if there is a possibility that they can interfere
with one another in a destructive way. The classic example of such interference is
the 'lost update' scenario. In this scenario, we have two processes, p1 and p2, both of which have access to some shared variable v. Process p1 begins to update the value of v, by first reading it, then modifying it (perhaps by simply incrementing the value that it obtained), and finally saving this updated value in v. But between p1 reading and again saving the value of v, process p2 updates v, by saving some value in it. When p1 saves its modified value of v, the update performed by p2 is thus lost, which is almost certainly not what was intended. The lost update problem is a very real issue in the design of programs that communicate through shared data structures.
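A minimal Python sketch of the lost update (not from the book): two threads each read, increment and write back a shared counter; without mutual exclusion one increment can overwrite the other, while guarding the read-modify-write with a lock removes the interference. The sleep call exists only to make the bad interleaving easy to reproduce.

# The lost update: two threads both read v, increment it, and write it back.
import threading
import time

v = 0
lock = threading.Lock()

def update(use_lock):
    global v
    if use_lock:
        with lock:                 # read-modify-write as one critical section
            local = v
            time.sleep(0.01)
            v = local + 1
    else:
        local = v                  # read
        time.sleep(0.01)           # the other process may update v here
        v = local + 1              # write back: may overwrite the other update

for use_lock in (False, True):
    v = 0
    threads = [threading.Thread(target=update, args=(use_lock,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("with lock" if use_lock else "without lock", "-> v =", v)
# Typical output: without lock -> v = 1 (one update was lost); with lock -> v = 2.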
So, if we do not treat communication in such a 'low-level' way, then how is com-
munication treated by the agent community? In order to understand the answer,
it is helpful to first consider the way that communication is treated in the object-
oriented programming community, that is, communication as method invocation.
Suppose we have a Java system containing two objects, o1 and o2, and that o1 has a publicly available method m1. Object o2 can communicate with o1 by invoking method m1. In Java, this would mean o2 executing an instruction that looks something like o1.m1(arg), where arg is the argument that o2 wants to communicate to o1. But consider: which object makes the decision about the execution of method m1? Is it object o1 or object o2? In this scenario, object o1 has no control over the execution of m1: the decision about whether to execute m1 lies entirely with o2.
Now consider a similar scenario, but in an agent-oriented setting. We have two
agents i and j, where i has the capability to perform action a, which corresponds
loosely to a method. But there is no concept in the agent-oriented world of agent j
'invoking a method' on i. This is because i is an autonomous agent: it has control
over both its state and its behaviour. It cannot be taken for granted that agent i
will execute action a just because another agent j wants it to. Performing the
action a may not be in the best interests of agent i. The locus of control with
respect to the decision about whether to execute an action is thus very different
in agent and object systems.
In general, agents can neither force other agents to perform some action, nor
write data onto the internal state of other agents. This does not mean they can-
not communicate, however. What they can do is perform actions - communicative actions - in an attempt to influence other agents appropriately. For example,
suppose I say to you 'It is raining in London', in a sincere way. Under normal cir-
cumstances, such a communication action is an attempt by me to modify your
beliefs. Of course, simply uttering the sentence 'It is raining in London' is not
usually enough to bring about this state of affairs, for all the reasons that were
discussed above. You have control over your own beliefs (desires, intentions).
You may believe that I am notoriously unreliable on the subject of the weather,
or even that I am a pathological liar. But in performing the communication action
of uttering 'It is raining in London', I am attempting to change your internal state.
Furthermore, since this utterance is an action that I perform, I am performing it
for some purpose - presumably because I intend that you believe it is raining.
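The difference in the locus of control can be made concrete with a small Python sketch; the class and method names here are invented for illustration. An object exposes a method that any caller can execute, whereas an agent merely receives a request and applies its own policy to decide whether to act on it.

# Object-style communication: the caller decides; the callee has no say.
class PrinterObject:
    def print_document(self, doc):
        print("printing:", doc)            # executes whenever invoked

# Agent-style communication: the receiver decides whether to act on a request.
class PrinterAgent:
    def __init__(self, accepts_jobs_from):
        self.accepts_jobs_from = accepts_jobs_from   # the agent's own policy

    def receive_request(self, sender, doc):
        # Perform the requested action only if it is in this agent's interests.
        if sender in self.accepts_jobs_from:
            print("printing:", doc)
            return True
        return False                        # otherwise the request is declined

PrinterObject().print_document("report.txt")          # always happens
agent = PrinterAgent(accepts_jobs_from={"alice"})
print(agent.receive_request("bob", "report.txt"))     # False: declined
print(agent.receive_request("alice", "report.txt"))   # printed, then True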
8.1.1 Austin
The theory of speech acts is generally recognized to have begun with the work
of the philosopher John Austin (Austin, 1962). He noted that a certain class of
natural language utterances - hereafter referred to as speech acts - had the char-
acteristics of actions, in the sense that they change the state of the world in a way analogous to the way that physical actions do.
8.1.2 Searle
Austin's work was extended by John Searle in his 1969 book Speech Acts
(Searle, 1969). Searle identified several properties that must hold for a speech
act performed between a hearer and a speaker to succeed. For example, consider
a request by SPEAKER to HEARER to perform ACTION.
(1) Normal I/O conditions. Normal I/O conditions state that HEARER is able to
hear the request (thus must not be deaf, etc.), the act was performed in normal
circumstances (not in a film or play, etc.), etc.
(2) Preparatory conditions. The preparatory conditions state what must be true
of the world in order that SPEAKER correctly choose the speech act. In this
case, HEARER must be able to perform ACTION, and SPEAKER must believe that
HEARER is able to perform ACTION. Also, it must not be obvious that HEARER
will do ACTION anyway.
The formalism chosen by Cohen and Perrault was the STRIPS notation, in which
the properties of an action are characterized via preconditions and postconditions
(Fikes and Nilsson, 1971). The idea is very similar to Hoare logic (Hoare, 1969).
Cohen and Perrault demonstrated how the preconditions and postconditions of
speech acts such as request could be represented in a multimodal logic containing
operators for describing the beliefs, abilities, and wants of the participants in the
speech act.
Consider the Request act. The aim of the Request act will be for a speaker
to get a hearer to perform some action. Figure 8.1 defines the Request act.
Two preconditions are stated: the 'cando.pr' (can-do preconditions), and 'want.pr'
(want preconditions). The cando.pr states that for the successful completion of
the Request, two conditions must hold. First, the speaker must believe that the
hearer of the Request is able to perform the action. Second, the speaker must
believe that the hearer also believes it has the ability to perform the action. The
want.pr states that in order for the Request to be successful, the speaker must
also believe it actually wants the Request to be performed. If the preconditions of
the Request are fulfilled, then the Request will be successful: the result (defined
by the 'effect' part of the definition) will be that the hearer believes the speaker
believes it wants some action to be performed.
While the successful completion of the Request ensures that the hearer is aware
of the speaker's desires, it is not enough in itself to guarantee that the desired
action is actually performed. This is because the definition of Request only mod-
els the illocutionary force of the act. It says nothing of the perlocutionary force.
What is required is a mediating act. Figure 8.1 gives a definition of CauseToWant,
which is an example of such an act. By this definition, an agent will come to believe it wants to do something if it believes that another agent believes it wants to do it.
This definition could clearly be extended by adding more preconditions, perhaps
to do with beliefs about social relationships, power structures, etc.
The Inform act is as basic as Request. The aim of performing an Inform will be for a speaker to get a hearer to believe some statement. Like Request, the definition of Inform requires an associated mediating act to model the perlocutionary force of the act. The cando.pr of Inform states that the speaker must believe φ is true. The effect of the act will simply be to make the hearer believe that the speaker believes φ. The cando.pr of Convince simply states that the hearer must believe that the speaker believes φ. The effect is simply to make the hearer believe φ.
Figure 8.1 Definitions from Cohen and Perrault's plan-based theory of speech acts, giving preconditions and effects for the acts Request(Speaker, Hearer, Action), CauseToWant(A1, A2, Action), Inform(Speaker, Hearer, φ) and Convince(A1, A2, φ).
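Since the detailed definitions in Figure 8.1 are not reproduced here, the following Python sketch reconstructs their general shape from the prose above: each act is a STRIPS-style record with cando and want preconditions and an effect, stated over nested believe/want propositions. The tuple encoding is my own illustrative convention, not Cohen and Perrault's notation.

# STRIPS-style sketch of Request and CauseToWant, following the prose description.
# Propositions are nested tuples, e.g. ('believe', S, ('cando', H, act)).

def request(speaker, hearer, act):
    return {
        "cando.pr": [
            ("believe", speaker, ("cando", hearer, act)),
            ("believe", speaker, ("believe", hearer, ("cando", hearer, act))),
        ],
        "want.pr": [
            ("believe", speaker, ("want", speaker, ("request", speaker, hearer, act))),
        ],
        "effect": [
            # The hearer believes the speaker believes the speaker wants the action done.
            ("believe", hearer, ("believe", speaker, ("want", speaker, act))),
        ],
    }

def cause_to_want(agent1, agent2, act):
    # The mediating act: agent1 comes to believe it wants act if it believes
    # that agent2 believes agent2 wants it (social preconditions could be added).
    return {
        "cando.pr": [("believe", agent1, ("believe", agent2, ("want", agent2, act)))],
        "want.pr": [],
        "effect": [("believe", agent1, ("want", agent1, act))],
    }

# Chaining Request's effect into CauseToWant's precondition models the
# perlocutionary force that Request alone does not guarantee.
print(request("speaker", "hearer", "close_door")["effect"])
print(cause_to_want("hearer", "speaker", "close_door")["cando.pr"])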
8.2.1 KIF
I will begin by describing the Knowledge Interchange Format - KIF (Genesereth and Fikes, 1992). This language was originally developed with the intent of being a common language for expressing properties of a particular domain. It was not intended to be a language in which messages themselves would be expressed, but rather it was envisaged that KIF would be used to express message content. KIF is closely based on first-order logic (Enderton, 1972; Genesereth and Nilsson, 1987). (In fact, KIF looks very like first-order logic recast in a LISP-like notation; to fully understand the details of this section, some understanding of first-order logic is therefore helpful.) Thus, for example, by using KIF, it is possible for agents to express

properties of things in a domain (e.g. 'Michael is a vegetarian' - Michael has the property of being a vegetarian);

relationships between things in a domain (e.g. 'Michael and Janine are married' - the relationship of marriage exists between Michael and Janine);

general properties of a domain (e.g. 'everybody has a mother').

In order to express these things, KIF assumes a basic, fixed logical apparatus, which contains the usual connectives that one finds in first-order logic: the binary Boolean connectives and, or, not, and so on, and the universal and existential quantifiers forall and exists. In addition, KIF provides a basic vocabulary of objects - in particular, numbers, characters, and strings. Some standard functions and relations for these objects are also provided, for example the 'less than' relationship between numbers, and the 'addition' function. A LISP-like notation is also provided for handling lists of objects. Using this basic apparatus, it is possible to define new objects, and the functional and other relationships between these objects. At this point, some examples seem appropriate. The following KIF expression asserts that the temperature of m1 is 83 Celsius:

(= (temperature m1) (scalar 83 Celsius))

In this expression, = is equality: a relation between two objects in the domain; temperature is a function that takes a single argument, an object in the domain (in this case, m1), and scalar is a function that takes two arguments. The = relation is provided as standard in KIF, but both the temperature and scalar functions must be defined.

The second example shows how definitions can be used to introduce new concepts for the domain, in terms of existing concepts. It says that an object is a bachelor if this object is a man and is not married:

(defrelation bachelor (?x) :=
  (and (man ?x)
       (not (married ?x))))

In this example, ?x is a variable, rather like a parameter in a programming language. There are two relations: man and married, each of which takes a single argument. The := symbol means 'is, by definition'.

The next example shows how relationships between individuals in the domain can be stated - it says that any individual with the property of being a person also has the property of being a mammal:

(defrelation person (?x) :=> (mammal ?x))

Here, both person and mammal are relations that take a single argument.
8.2.2 KQML
KQML is a message-based language for agent communication. Thus KQML defines
a common format for messages. A KQML message may crudely be thought of
as an object (in the sense of object-oriented programming): each message has
a performative (which may be thought of as the class of the message), and a number of parameters (attribute/value pairs, which may be thought of as instance
variables).
Here is an example KQML message:
(ask-one
: content (PRICE I B M ?pri ce)
:receiver stock-server
:1anguage LPROLOG
:onto1 ogy NYSE-TICKS
1
The intuitive interpretation of t h s message is that the sender is aslung about
the price of IBM stock. The performative is ask-one, which an agent will use to
ask a question of another agent where exactly one reply is needed. The various other components of this message represent its attributes. The most important of these is the :content field, which specifies the message content. In this case, the content simply asks for the price of IBM shares. The :receiver attribute specifies the intended recipient of the message, the :language attribute specifies that the language in which the content is expressed is called LPROLOG (the recipient is assumed to 'understand' LPROLOG), and the final :ontology attribute defines the terminology used in the message - we will hear more about ontologies later in this chapter. The main parameters used in KQML messages are summarized in Table 8.1; note that different performatives require different sets of parameters.
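Because a KQML message is just a performative plus a set of attribute/value pairs, it is easy to sketch one as a small data structure. The Python fragment below is illustrative only (the class name and rendering are mine, not part of any KQML implementation); it reproduces the message above in the usual parenthesized surface syntax.

# A KQML message as a performative plus attribute/value pairs.
from dataclasses import dataclass, field

@dataclass
class KQMLMessage:
    performative: str
    parameters: dict = field(default_factory=dict)

    def render(self):
        # Serialize in the parenthesized KQML surface syntax.
        lines = ["(" + self.performative]
        lines += ["  " + key + " " + value for key, value in self.parameters.items()]
        return "\n".join(lines) + ")"

msg = KQMLMessage("ask-one", {
    ":content": "(PRICE IBM ?price)",
    ":receiver": "stock-server",
    ":language": "LPROLOG",
    ":ontology": "NYSE-TICKS",
})
print(msg.render())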
Several different versions of KQML were proposed during the 1990s, with different collections of performatives in each. In Table 8.2, I summarize the version of KQML performatives that appeared in Finin et al. (1993); this version contains a total of 41 performatives. In this table, S denotes the :sender of the message, R denotes the :receiver, and C denotes the content of the message.
To more fully understand these performatives, it is necessary to understand the notion of a virtual knowledge base (VKB) as it was used in KQML. The idea was that agents using KQML to communicate may be implemented using different programming languages and paradigms - and, in particular, any information that agents have may be internally represented in many different ways. No agent can assume that another agent will use the same internal representation; indeed, no actual 'representation' may be present in an agent at all. Nevertheless, for the purposes of communication, it makes sense for agents to treat other agents as if they had some internal representation of knowledge. Thus agents attribute knowledge to other agents; this attributed knowledge is known as the virtual knowledge base.
Table 8.2 KQML performatives (after Finin et al., 1993).
Dialogue (a)

(evaluate
  :sender A :receiver B
  :language KIF :ontology motors
  :reply-with q1 :content (val (torque m1)))

(reply
  :sender B :receiver A
  :language KIF :ontology motors
  :in-reply-to q1 :content (= (torque m1) (scalar 12 kgf)))

Dialogue (b)

(stream-about
  :sender A :receiver B
  :language KIF :ontology motors
  :reply-with q1 :content m1)

(tell
  :sender B :receiver A
  :in-reply-to q1 :content (= (torque m1) (scalar 12 kgf)))

(tell
  :sender B :receiver A
  :in-reply-to q1 :content (= (status m1) normal))

(eos
  :sender B :receiver A
  :in-reply-to q1)

Dialogue (c)

(advertise
  :sender A
  :language KQML :ontology K10
  :content
    (subscribe
      :language KQML :ontology K10
      :content
        (stream-about
          :language KIF :ontology motors
          :content m1)))

(subscribe
  :sender B :receiver A
  :reply-with s1
  :content
    (stream-about
      :language KIF :ontology motors
      :content m1))

(tell
  :sender A :receiver B
  :in-reply-to s1 :content (= (torque m1) (scalar 12 kgf)))

(tell
  :sender A :receiver B
  :in-reply-to s1 :content (= (status m1) normal))

(untell
  :sender A :receiver B
  :in-reply-to s1 :content (= (torque m1) (scalar 12 kgf)))

(tell
  :sender A :receiver B
  :in-reply-to s1 :content (= (torque m1) (scalar 15 kgf)))

(eos
  :sender A :receiver B
  :in-reply-to s1)
The third (and most complex) dialogue, shown in Figure 8.3, shows how KQML messages themselves can be the content of KQML messages. The dialogue begins when agent A advertises to agent B that it is willing to accept subscriptions relating to m1. Agent B responds by subscribing to agent A with respect to m1. Agent A then responds with a sequence of messages about m1; as well as including tell messages, as we have already seen, the sequence includes an untell message, to the effect that the torque of m1 is no longer 12 kgf, followed by a tell message indicating the new value of torque. The sequence ends with an end of stream message.
The take-up of KQML by the multiagent systems community was significant, and
several KQML-based implementations were developed and distributed. Despite
this success, KQML was subsequently criticized on a number of grounds as follows.
The basic KQML performative set was rather fluid - it was never tightly con-
strained, and so different implementations of KQML were developed that
could not, in fact, interoperate.
Transport mechanisms for KQML messages (i.e. ways of getting a message
from agent A to agent B) were never precisely defined, again making it hard
for different KQML-talking agents to interoperate.
The semantics of KQML were never rigorously defined, in such a way that it
was possible to tell whether two agents claiming to be talking KQML were in
fact using the language 'properly'. The 'meaning' of KQML performatives was
only defined using informal, English language descriptions, open to different
interpretations. (I discuss this issue in more detail later on in this chapter.)
The language was missing an entire class of performatives - commissives, by which one agent makes a commitment to another. As Cohen and Levesque point out, it is difficult to see how many multiagent scenarios could be implemented without commissives, which appear to be important if agents are to
coordinate their actions with one another.
The performative set for KQML was overly large and, it could be argued,
rather ad hoc.
These criticisms - amongst others - led to the development of a new, but rather
closely related language by the FIPA consortium.
8.2.3 The FIPA agent communication languages

The syntax for FIPA ACL messages closely resembles that of KQML. Here is an example of a FIPA ACL message (from FIPA, 1999, p. 10):

(inform
  :sender agent1
  :receiver agent2
  :content (price good2 150)
  :language sl
  :ontology hpl-auction
)
Table 8.3 Performatives provided by the FIPA communication language: accept-proposal, agree, cancel, cfp, confirm, disconfirm, failure, inform, inform-if, inform-ref, not-understood, propagate, propose, proxy, query-if, query-ref, refuse, reject-proposal, request, request-when, request-whenever, subscribe.
As should be clear from this example, the FIPA communication language is similar
to KQML: the structure of messages is the same, and the message attribute fields
are also very similar. The relationship between the FIPA ACL and KQML is dis-
cussed in FIPA (1999, pp. 68, 69). The most important difference between the two
languages is in the collection of performatives they provide. The performatives
provided by the FIPA communication language are categorized in Table 8.3.
Informally, these performatives have the following meaning.
accept-proposal The accept-proposal performative allows an agent to state
that it accepts a proposal made by another agent.
agree An agree performative is used by one agent to indicate that it has acquiesced to a request made by another agent. It indicates that the sender of the agree message intends to carry out the requested action.
cancel A cancel performative is used by an agent to follow up to a previous request message, and indicates that it no longer desires a particular action to be carried out.
cfp A cfp (call for proposals) performative is used to initiate negotiation
between agents. The content attribute of a cfp message contains both an
action (e.g. 'sell me a car') and a condition (e.g. 'the price of the car is less
than US$10 000'). Essentially, it says 'here is an action that I wish to be car-
ried out, and here are the terms under which I want it to be carried out - send
me your proposals'. (We will see in the next chapter that the cfp message is
a central component of task-sharing systems such as the Contract Net.)
confirm The confirm performative allows the sender of the message to confirm
the truth of the content to the recipient, where, before sending the message,
the sender believes that the recipient is unsure about the truth or otherwise
of the content.
disconfirm Similar to confirm, but this performative indicates to a recipient that is unsure about the truth of the content that the content is in fact false.
failure This allows an agent to indicate to another agent that an attempt to perform some action (typically, one that it was previously requested to perform) failed.
inform Along with request, the inform performative is one of the two most important performatives in the FIPA ACL. It is the basic mechanism for communicating information. The content of an inform performative is a statement, and the idea is that the sender of the inform wants the recipient to believe this content. Intuitively, the sender is also implicitly stating that it believes the content of the message.
inform-if An inform-if implicitly says either that a particular statement is true or that it is false. Typically, an inform-if performative forms the content part of a message. An agent will send a request message to another agent, with the content part being an inform-if message. The idea is that the sender of the request is saying 'tell me if the content of the inform-if is either true or false'.
inform-ref The idea of inform-ref is somewhat similar to that of inform-if: the difference is that rather than asking whether or not an expression is true or false, the agent asks for the value of an expression.
not-understood This performative is used by one agent to indicate to another agent that it recognized that it performed some action, but did not understand it.
The semantics of the FIPA ACL were based on Cohen and Levesque's theory of speech acts as rational action (Cohen and Levesque, 1990b), but in particular on Sadek's enhancements to this work (Bretier and Sadek, 1997). The semantics were given with respect to a formal language called SL. This language allows one to represent beliefs, desires, and uncertain beliefs of agents, as well as the actions that agents perform. The semantics of the FIPA ACL map each ACL message to a formula of SL, which defines a constraint that the sender of the message must satisfy if it is to be considered as conforming to the FIPA ACL standard. FIPA refers to this constraint as the feasibility condition. The semantics also map each message to an SL-formula that defines the rational effect of the action - the 'purpose' of the message: what an agent will be attempting to achieve in sending the message (cf. perlocutionary act). However, in a society of autonomous agents, the rational effect of a message cannot (and should not) be guaranteed. Hence conformance does not require the recipient of a message to respect the rational effect part of the ACL semantics - only the feasibility condition.
As I noted above, the two most important communication primitives in the
FIPA languages are inform and request. In fact, all other performatives in FIPA
are defined in terms of these performatives. Here is the semantics for inform
(FIPA, 1999, p. 25):

⟨i, inform(j, φ)⟩
feasibility precondition: Bi φ ∧ ¬Bi(Bifj φ ∨ Uifj φ)
rational effect: Bj φ.     (8.1)

Here, Bi φ means 'agent i believes φ'; Bifi φ means that 'agent i has a definite opinion one way or the other about the truth or falsity of φ'; and Uifi φ means that agent i is 'uncertain' about φ. Thus an agent i sending an inform message with content φ to agent j will be respecting the semantics of the FIPA ACL if it believes φ, and it is not the case that it believes of j either that j believes whether φ is true or false, or that j is uncertain of the truth or falsity of φ. If the agent is successful in performing the inform, then the recipient of the message - agent j - will believe φ.
The semantics of request are as follows (in the interests of comprehension, I have simplified the semantics a little):

⟨i, request(j, a)⟩
feasibility precondition: Bi Agent(a, j) ∧ ¬Bi Ij Done(a)
rational effect: Done(a).     (8.2)

The SL expression Agent(a, j) means that the agent of action a is j (i.e. j is the agent who performs a); and Done(a) means that the action a has been done. Thus agent i requesting agent j to perform action a means that agent i believes that the agent of a is j (and so it is sending the message to the right agent), and agent i believes that agent j does not currently intend that a is done. The rational effect - what i wants to achieve by this - is that the action is done.
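To see what 'respecting the feasibility condition' might amount to operationally, here is a small Python sketch (my own illustration, not part of FIPA): the sender's beliefs and its picture of the recipient are approximated by plain sets, and an inform is only sent when the condition mirroring formula (8.1) holds.

# Approximate check of the inform feasibility precondition over simple belief sets.
def may_send_inform(phi, believes, recipient_opinionated, recipient_uncertain):
    # Sender believes phi, and does not believe the recipient already has a
    # definite opinion about phi, or is uncertain about phi.
    return (phi in believes
            and phi not in recipient_opinionated
            and phi not in recipient_uncertain)

# The sender believes it is raining in London and has no view about the recipient:
print(may_send_inform("raining(London)", {"raining(London)"}, set(), set()))   # True

# The sender thinks the recipient already knows one way or the other:
print(may_send_inform("raining(London)", {"raining(London)"},
                      {"raining(London)"}, set()))                             # False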
One key issue for this work is that of semantic conformance testing. The con-
formance testing problem can be summarized as follows (Wooldridge, 1998). We
are given an agent, and an agent communication language with some well-defined
semantics. The aim is to determine whether or not the agent respects the seman-
tics of the language whenever it communicates. Syntactic conformance testing
is of course easy - the difficult part is to see whether or not a particular agent
program respects the semantics of the language.
The importance of conformance testing has been recognized by the ACL com-
munity (FIPA, 1999, p. 1). However, to date, little research has been carried out
either on how verifiable communication languages might be developed, or on how
existing ACLs might be verified. One exception is (my) Wooldridge (1998), where
the issue of conformance testing is discussed from a formal point of view: I point
out that ACL semantics are generally developed in such a way as to express con-
straints on the senders of messages. For example, the constraint imposed by the
semantics of an 'inform' message might state that the sender believes the mes-
sage content. This constraint can be viewed as a specification. Verifying that an
agent respects the semantics of the agent communication language then reduces
to a conventional program verification problem: show that the agent sending the
message satisfies the specification given by the communication language seman-
tics. But to solve this verification problem, we would have to be able to talk about
the mental states of agents - what they believed, intended and so on. Given an
agent implemented in (say) Java, it is not clear how this might be done.
Figure 8.4 Architecture of the Ontolingua server: a library and server, accessed by remote applications, stand-alone applications, and translators.
The Web is based on the Hypertext Transfer Protocol (HTTP), which provides a common set of rules for enabling Web servers and clients to communicate with one another, and a format for documents called (as I am sure you
know!) the Hypertext Markup Language (HTML). Now HTML essentially defines
a grammar for interspersing documents with markup commands. Most of these
markup commands relate to document layout, and thus give indications to a Web
browser of how to display a document: which parts of the document should be
treated as section headers, emphasized text, and so on. Of course, markup is not
restricted to layout information: programs, for example in the form of JavaScript
code, can also be attached. The grammar of HTML is defined by a Document
Type Declaration (DTD). A DTD can be thought of as being analogous to the for-
mal grammars used to define the syntax of programming languages. The HTML
DTD thus defines what constitutes a syntactically acceptable HTML document. A
DTD is in fact itself expressed in a formal language - the Standard Generalized
Markup Language (SGML, 2001). SGML is essentially a language for defining other
languages.
Now, to all intents and purposes, the HTML standard is fixed, in the sense that
you cannot arbitrarily introduce tags and attributes into HTML documents that
were not defined in the HTML DTD. But this severely limits the usefulness of the
Web. To see what I mean by this, consider the following example. An e-commerce
company selling CDs wishes to put details of its prices on its Web page. Using
conventional HTML techniques, a Web page designer can only markup the docu-
ment with layout information (see, for example, Figure 8.5(a)). But this means that
a Web browser - or indeed any program that looks at documents on the Web - has
no way of knowing which parts of the document refer to the titles of CDs, which
refer to their prices, and so on. Using XML it is possible to define new markup
tags - and so, in essence, to extend HTML. To see the value of this, consider
Figure 8.5(b), which shows the same information as Figure 8.5(a), expressed using
new tags (catalogue, product, and so on) that were defined using XML. Note that
new tags such as these cannot be arbitrarily introduced into HTML documents:
they must be defined. The way they are defined is by writing an XML DTD: thus
XML, like SGML, is a language for defining languages. (In fact, XML is a subset of
SGML.)
I hope it is clear that a computer program would have a much easier time under-
standing the meaning of Figure 8.5(b) than Figure 8.5(a). In Figure 8.5(a), there is
nothing to help a program understand which part of the document refers to the
price of the product, which refers to the title of the product, and so on. In contrast,
Figure 8.5(b) makes all this explicit.
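Figures 8.5(a) and (b) are not reproduced here, but the contrast is easy to sketch. The Python fragment below builds a small catalogue using domain-specific tags like those the text mentions (catalogue, product); the title and price tag names, and the sample data, are my own illustrative choices rather than the book's figure.

# Structural rather than presentational markup, in the spirit of Figure 8.5(b):
# a program can find titles and prices by tag name instead of guessing from layout.
import xml.etree.ElementTree as ET

catalogue = ET.Element("catalogue")
for name, price in [("A Sample CD", "9.99"), ("Another CD", "10.99")]:
    product = ET.SubElement(catalogue, "product")
    ET.SubElement(product, "title").text = name     # 'title'/'price' are illustrative tags
    ET.SubElement(product, "price").text = price

print(ET.tostring(catalogue, encoding="unicode"))
print([p.findtext("price") for p in catalogue.findall("product")])   # ['9.99', '10.99']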
XML was developed to answer one of the longest standing critiques of the Web:
the lack of semantic markup. Using languages like XML, it becomes possible to
add information to Web pages in such a way that it becomes easy for computers
not simply to display it, but to process it in meaningful ways. This idea led Tim
Berners-Lee, widely credited as the inventor of the Web, to develop the idea of the
semantic Web.
I have a dream for the Web [in which computers] become capable of
analysing all the data on the Web - the content, links, and transac-
tions between people and computers. A 'Semantic Web', which should
make this possible, has yet to emerge, but when it does, the day-to-day
mechanisms of trade, bureaucracy and our daily lives will be handled
by machines talking to machines. The 'intelligent agents' people have
touted for ages will finally materialise.
(Berners-Lee, 1999, pp. 169, 170)
In an attempt to realize this vision, work has begun on several languages and
tools - notably the Darpa Agent Markup Language (DAML, 2001), which is based
on XML. A fragment of a DAML ontology and knowledge base (from the DAML
version of the CIA world fact book (DAML, 2001)) is shown in Figure 8.6.
8.4 Coordination Languages
One of the most important precursors to the development of multiagent systems
was the blackboard model (Engelmore and Morgan, 1988). Initially developed as
<rdf:Description rdf:ID="UNITED-KINGDOM">
  <rdf:type rdf:resource="GEOREF"/>
  <HAS-TOTAL-AREA>
    (* 244820 Square-Kilometer)
  </HAS-TOTAL-AREA>
  <HAS-LAND-AREA>
    (* 241590 Square-Kilometer)
  </HAS-LAND-AREA>
  <HAS-COMPARATIVE-AREA-DOC>
    slightly smaller than Oregon
  </HAS-COMPARATIVE-AREA-DOC>
  <HAS-BIRTH-RATE>
    13.18
  </HAS-BIRTH-RATE>
  <HAS-TOTAL-BORDER-LENGTH>
    (* 360 Kilometer)
  </HAS-TOTAL-BORDER-LENGTH>
  <HAS-BUDGET-REVENUES>
    (* 3.255E11 Us-Dollars)
  </HAS-BUDGET-REVENUES>
  <HAS-BUDGET-EXPENDITURES>
    (* 4.009E11 Us-Dollars)
  </HAS-BUDGET-EXPENDITURES>
  <HAS-BUDGET-CAPITAL-EXPENDITURES>
    (* 3.3E10 Us-Dollars)
  </HAS-BUDGET-CAPITAL-EXPENDITURES>
  <HAS-CLIMATE-DOC>
    more than half of the days are overcast
  </HAS-CLIMATE-DOC>
  <HAS-COASTLINE-LENGTH>
    (* 12429 Kilometer)
  </HAS-COASTLINE-LENGTH>
  <HAS-CONSTITUTION-DOC>
    unwritten; partly statutes, partly common law
  </HAS-CONSTITUTION-DOC>
</rdf:Description>

Figure 8.6 Some facts about the UK, expressed in DAML.
part of the Hearsay speech understanding project, the blackboard model proposes
that group problem solving proceeds by a group of 'knowledge sources' (agents)
observing a shared data structure known as a blackboard: problem solving pro-
ceeds as these knowledge sources contribute partial solutions to the problem.
In the 1980s, an interesting variation on the blackboard model was proposed
within the programming language community. This variation was called Linda
(Gelernter, 1985; Carriero and Gelernter, 1989).
Strictly speaking, Linda is not a programming language. It is the generic name given to a collection of programming language constructs, which can be added to a host programming language to provide coordination via a shared tuple space.
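The Linda operations themselves are not listed in the surviving text, so the following Python sketch is based on the standard account of the model: processes coordinate by depositing tuples into a shared tuple space with out, reading matching tuples with rd, and removing them with in (the eval primitive is omitted, and these toy operations do not block as real Linda operations do).

# A toy tuple space: out() adds a tuple, rd() reads a matching tuple without
# removing it, in_() reads and removes one. None acts as a wildcard in templates.
class TupleSpace:
    def __init__(self):
        self.tuples = []

    def out(self, tup):
        self.tuples.append(tup)

    def _match(self, template, tup):
        return len(template) == len(tup) and all(
            t is None or t == v for t, v in zip(template, tup))

    def rd(self, template):
        return next((t for t in self.tuples if self._match(template, t)), None)

    def in_(self, template):
        t = self.rd(template)
        if t is not None:
            self.tuples.remove(t)
        return t

ts = TupleSpace()
ts.out(("temperature", "m1", 83))           # one process publishes a result
print(ts.rd(("temperature", "m1", None)))   # another reads it without consuming it
print(ts.in_(("temperature", "m1", None)))  # ...or consumes it
print(ts.rd(("temperature", "m1", None)))   # None: the tuple has been removed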
1980s. Two of the best-known formalisms developed in this period are Tony Hoare's Communicating Sequential Processes (CSPs) (Hoare, 1978), and Robin Milner's Calculus of Communicating Systems (CCS) (Milner, 1989). Temporal logic has also been widely used for reasoning about concurrent systems - see, for example, Pnueli (1986) for an overview. A good reference, which describes the key problems in concurrent and distributed systems, is Ben-Ari (1990).
The plan-based theory of speech acts developed by Cohen and Perrault made
speech act theory accessible and directly usable to the artificial intelligence com-
munity (Cohen and Perrault, 1979). In the multiagent systems community, this
work is arguably the most influential single publication on the topic of speech
act-like communication. Many authors have built on its basic ideas. For example,
borrowing a formalism for representing the mental state of agents that was developed by Moore (1990), Douglas Appelt was able to implement a system that was capable of planning to perform speech acts (Appelt, 1982, 1985).
Many other approaches to speech act semantics have appeared in the literature.
For example, Perrault (1990) described how Reiter's default logic (Reiter, 1980)
could be used to reason about speech acts. Appelt gave a critique of Perrault's
work (Appelt and Konolige, 1988, pp. 167, 168), and Konolige proposed a related
technique using hierarchic auto-epistemic logic (HAEL) (Konolige, 1988) for rea-
soning about speech acts. Galliers emphasized the links between speech acts and AGM belief revision (Gardenfors, 1988): she noted that the changes in a hearer's state caused by a speech act could be understood as analogous to an agent revising its beliefs in the presence of new information (Galliers, 1991). Singh developed
a theory of speech acts (Singh, 1991c, 1993) using his formal framework for rep-
resenting rational agents (Singh, 1990a,b, 1991a,b, 1994, 1998b; Singh and Asher, 1991). He introduced a predicate comm(i, j, m) to represent the fact that agent i communicates message m to agent j, and then used this predicate to define the
semantics of assertive, directive, commissive, and permissive speech acts.
Dignum and Greaves (2000) is a collection of papers on agent communica-
tion languages. As I mentioned in the main text of the chapter, a number of
KQML implementations have been developed: well-known examples are InfoS-
leuth (Nodine and Unruh, 1998), KAoS (Bradshaw et al., 1997) and JATLite (Jeon et al., 2000). Several FIPA implementations have also been developed, of which the Java-based Jade system is probably the best known (Poggi and Rimassa, 2001).
A critique of KIF was published as Ginsberg (1991), while a critique of KQML
appears in Cohen and Levesque (1995). A good general survey of work on ontolo-
gies (up to 1996) is Uschold and Gruninger (1996). There are many good online
references to XML, DAML and the like: a readable published reference is Decker et al. (2000). The March/April 2001 issue of IEEE Intelligent Systems magazine con-
tained a useful collection of articles on the semantic web (Fensel and Musen, 2001),
agents in the semantic Web (Hendler, 2001), and the OIL language for ontologies
on the semantic Web (Fensel et al., 2001).
Recently, a number of proposals have appeared for communication languages
with a verifiable semantics (Singh, 1998a; Pitt and Mamdani, 1999; Wooldridge,
1999). See Labrou et al. (1999) for a discussion of the state of the art in agent
communication languages as of early 1999.
Coordination languages have been the subject of much interest by the theoretical computer science community: a regular conference is now held on the subject, the proceedings of which were published as Ciancarini and Hankin (1996). Interestingly, the Linda model has been implemented in the JavaSpaces package (Freeman et al., 1999), making it possible to use the model with Java/JINI systems (Oaks and Wong, 2000).
Exercises
(1) [Class discussion.]
What are the potential advantages and disadvantages of the use of agent communication
languages such as KQML or FIPA, as compared with (say) method invocation in object-
oriented languages? If you are familiar with distributed object systems like the Java RMI
paradigm, then compare the benefits of the two.
(2) [Level 2.]
Using the ideas of Cohen and Perrault's plan-based theory of speech acts, as well as the semantics of FIPA's request and inform performatives, try to give a semantics to other FIPA performatives.
Working Together
In the three preceding chapters, we have looked at the basic theoretical principles
of multiagent encounters and the properties of such encounters. We have also
seen how agents might reach agreements in encounters with other agents, and
looked at languages that agents might use to communicate with one another.
So far, however, we have seen nothing of how agents can work together. In this
chapter, we rectify this. We will see how agents can be designed so that they can
work together effectively. As I noted in Chapter 1, the idea of computer systems
working together may not initially appear to be very novel: the term 'cooperation'
is frequently used in the concurrent systems literature, to describe systems that
must interact with one another in order to carry out their assigned tasks. There are
two main distinctions between multiagent systems and 'traditional' distributed
systems as follows.
Agents in a multiagent system may have been designed and implemented
by different individuals, with different goals. They therefore may not share
common goals, and so the encounters between agents in a multiagent system
more closely resemble games, where agents must act strategically in order
to achieve the outcome they most prefer.
Because agents are assumed to be acting autonomously (and so making deci-
sions about what to do at run time, rather than having all decisions hard-
wired in at design time), they must be capable of dynamically coordinating
their activities and cooperating with others. In traditional distributed and
concurrent systems, coordination and cooperation are typically hardwired
in at design time.
Working together involves several different kinds of activities that we will investigate in much more detail throughout this chapter, in particular, the sharing both
of tasks and of information, and the dynamic (i.e. run-time) coordination of multi-
agent activities.
Historically, most work on cooperative problem solving has made the benev-
olence assumption: that the agents in a system implicitly share a common goal,
and thus that there is no potential for conflict between them. This assumption
implies that agents can be designed so as to help out whenever needed, even if
it means that one or more agents must suffer in order to do so: intuitively, all
that matters is the overall system objectives, not those of the individual agents
within it. The benevolence assumption is generally acceptable if all the agents in
a system are designed or 'owned' by the same organization or individual. It is
important to emphasize that the ability to assume benevolence greatly simplifies
the designer's task. If we can assume that all the agents need to worry about is
the overall utility of the system, then we can design the overall system so as to
optimize this.
In contrast to work on distributed problem solving, the more general area
of multiagent systems has focused on the issues associated with societies of
self-interested agents. Thus agents in a multiagent system (unlike those in typ-
ical distributed problem-solving systems), cannot be assumed to share a com-
mon goal, as they will often be designed by different individuals or organiza-
tions in order to represent their interests. One agent's interests may therefore
conflict with those of others, just as in human societies. Despite the potential
for conflicts of interest, the agents in a multiagent system will ultimately need
to cooperate in order to achieve their goals; again, just as in human societies.
1987; Durfee, 1988; Gasser and Hill, 1990; Goldman and Rosenschein, 1993; Jennings, 1993a; Weiß, 1993).
The main issues to be addressed in CDPS include the following.
How can a problem be divided into smaller tasks for distribution among
agents?
How can a problem solution be effectively synthesized from sub-problem
results?
How can the overall problem-solving activities of the agents be optimized
so as to produce a solution that maximizes the coherence metric?
What techniques can be used to coordinate the activity of the agents, so
avoiding destructive (and thus unhelpful) interactions, and maximizing
effectiveness (by exploiting any positive interactions)?
In the remainder of this chapter, we shall see some techniques developed by the
multiagent systems community for addressing these concerns.
Figure 9.2 (a) Task sharing and (b) result sharing. In task sharing, a task (such as Task 1) is decomposed into sub-problems (Task 1.1, Task 1.2, Task 1.3) that are allocated to agents, while in result sharing, agents supply each other with relevant information, either proactively or on demand.
Given this general framework for CDPS, there are two specific cooperative
problem-solving activities that are likely to be present: task sharing and result
sharing (Smith and Davis, 1980) (see Figure 9.2).
Task sharing. Task sharing takes place when a problem is decomposed into smaller sub-problems and allocated to different agents. Perhaps the key problem to be solved in a task-sharing system is that of how tasks are to be allocated to individual agents. If all agents are homogeneous in terms of their capabilities (cf. the discussion on parallel problem solving, above), then task sharing is straightforward: any task can be allocated to any agent. However, in all but the most trivial of cases, agents have very different capabilities. In cases where the agents are really autonomous - and can hence decline to carry out tasks, as in systems that do not enjoy the benevolence assumption described above - task allocation will involve agents reaching agreements with others, perhaps by using the techniques described in Chapter 7.
Result sharing. Result sharing involves agents sharing information relevant to their sub-problems. This information may be shared proactively (one agent sends another agent some information because it believes the other will be interested in it), or reactively (an agent sends another information in response to a request that was previously sent - cf. the subscribe performatives in the agent communication languages discussed earlier).
In the sections that follow, I shall discuss task sharing and result sharing in more
detail.
(Figure: the first stages of the Contract Net - (a) an agent recognizes it has a problem ('I have a problem'); (b) task announcement.)
[A] node that generates a task advertises existence of that task to other
nodes in the net with a task announcement, then acts as the manager
of that task for its duration. In the absence of any information about
the specific capabilities of the other nodes in the net, the manager is
forced to issue a general broadcast to all other nodes. If, however, the
manager possesses some knowledge about whch of the other nodes
in the net are likely candidates, then it can issue a limited broadcast to
just those candidates. Finally, if the manager knows exactly which of
the other nodes in the net is appropriate, then it can issue a point-to-
point announcement. As work on the problem progresses, many such
task announcements will be made by various managers.
Nodes in the net listen to the task announcements and evaluate
them with respect to their own specialized hardware and software
resources. When a task to which a node is suited is found, it submits
a bid. A bid indicates the capabilities of the bidder that are relevant to
the execution of the announced task. A manager may receive several
such bids in response to a single task announcement; based on the
information in the bids, it selects the most appropriate nodes to exe-
cute the task. The selection is communicated to the successful bidders
through an award message. These selected nodes assume responsibil-
ity for execution of the task, and each is called a contractor for that
task.
After the task has been completed, the contractor sends a report to
the manager. (Smith, 1980b, pp. 60, 61)
[This] normal contract negotiation process can be simplified in
some instances, with a resulting enhancement in the efficiency of the
protocol. If a manager knows exactly which node is appropriate for
the execution of a task, a directed contract can be awarded. This dif-
fers from the announced contract in that no announcement is made
and no bids are submitted. Instead, an award is made directly. In such
cases, nodes awarded contracts must acknowledge receipt, and have
the option of refusal.
In addition to describing the various messages that agents may send, Smith
describes the procedures to be carried out on receipt of a message. Briefly, these
procedures are as follows (see Smith (1980b, pp. 96-102) for more details).
(1) Task announcement processing. On receipt of a task announcement, an
agent decides if it is eligible for the task. It does this by looking at the eligi-
bility specification contained in the announcement. If it is eligible, then details
of the task are stored, and the agent will subsequently bid for the task.
(2) Bid processing. Details of bids from would-be contractors are stored by (would-be) managers until some deadline is reached. The manager then awards the task to a single bidder.
(3) Award processing. Agents that bid for a task, but fail to be awarded it, simply delete details of the task. The successful bidder must attempt to expedite the task (which may mean generating new sub-tasks).
(4) Request and inform processing. These messages are the simplest to handle. A request simply causes an inform message to be sent to the requestor, containing the required information, but only if that information is immediately available. (Otherwise, the requestee informs the requestor that the information is unknown.) An inform message causes its content to be added to the recipient's database. It is assumed that at the conclusion of a task, a contractor will send an information message to the manager, detailing the results of the expedited task.
Despite (or perhaps because of) its simplicity, the Contract Net has become the
most implemented and best-studied framework for distributed problem solving.
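The announcement-bid-award cycle described above is straightforward to sketch. The Python fragment below is a minimal, synchronous illustration of the Contract Net idea; the agent names, eligibility test and bid-scoring rule are invented for the example and are not part of Smith's protocol specification.

# A minimal, synchronous Contract Net round: announce a task, collect bids, award.
class ContractorAgent:
    def __init__(self, name, skills):
        self.name = name
        self.skills = skills                          # capabilities this node advertises

    def bid(self, task):
        # Return a bid (a capability score) if eligible for the task, else None.
        if task["required_skill"] in self.skills:
            return {"bidder": self, "score": self.skills[task["required_skill"]]}
        return None

class ManagerAgent:
    def announce_and_award(self, task, contractors):
        bids = [b for b in (c.bid(task) for c in contractors) if b is not None]
        if not bids:
            return None                               # no eligible node bid for the task
        return max(bids, key=lambda b: b["score"])["bidder"]   # award to the best bid

contractors = [ContractorAgent("node1", {"sense": 3}),
               ContractorAgent("node2", {"sense": 7, "plan": 2}),
               ContractorAgent("node3", {"plan": 9})]
task = {"name": "monitor-area-5", "required_skill": "sense"}
winner = ManagerAgent().announce_and_award(task, contractors)
print(winner.name if winner else "no bids")           # node2: highest-scoring eligible bidder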
In FELINE, each agent maintained a data structure known as its environment model. This model contained an entry for the modelling agent and each agent that the modelling agent might communicate with (its acquaintances). Each entry contained two important attributes as follows.

Skills. This attribute is a set of identifiers denoting hypotheses which the agent has the expertise to establish or deny. The skills of an agent will correspond roughly to root nodes of the inference networks representing the agent's domain expertise.

Interests. This attribute is a set of identifiers denoting hypotheses for which the agent requires the truth value. It may be that an agent actually has the expertise to establish the truth value of its interests, but is nevertheless 'interested' in them. The interests of an agent will correspond roughly to leaf nodes of the inference networks representing the agent's domain expertise.
Messages in FELINE were triples, consisting of a sender, receiver, and contents.
The contents field was also a triple, containing message type, attribute, and value.
Agents in FELINE communicated using three message types as follows (the system
predated the KQML and FIPA languages discussed in Chapter 8).
Request. If an agent sends a request, then the attribute field will contain an iden-
tifier denoting a hypothesis. It is assumed that the hypothesis is one which lies
within the domain of the intended recipient. A request is assumed to mean that
the sender wants the receiver to derive a truth value for the hypothesis.
Response. If an agent receives a request and manages to successfully derive a
truth value for the hypothesis, then it will send a response to the originator of
the request. The attribute field will contain the identifier denoting the hypoth-
esis: the value field will contain the associated truth value.
Inform. The attribute field of an inform message will contain an identifier denot-
ing a hypothesis. The value field will contain an associated truth value. An
inform message will be unsolicited; an agent sends one if it thinks the recipient
will be 'interested' in the hypothesis.
To understand how problem solving in FELINE worked, consider goal-driven
problem solving in a conventional rule-based system. Typically, goal-driven rea-
soning proceeds by attempting to establish the truth value of some hypothesis.
If the truth value is not known, then a recursive descent of the inference network
associated with the hypothesis is performed. Leaf nodes in the inference network
typically correspond to questions which are asked of the user, or data that is
acquired in some other way. Within FELINE, this scheme was augmented by the
following principle. When evaluating a leaf node, if it is not a question, then the
environment model was checked to see if any other agent has the node as a 'skill'.
If there was some agent that listed the node as a skill, then a request was sent
to that agent, requesting the hypothesis. The sender of the request then waited
until a response was received; the response indicates the truth value of the node.
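A sketch of this augmented evaluation scheme, in Java, is given below. The names used (evaluateLeaf, sendRequest, and so on) are assumptions made purely for illustration; FELINE itself was not, of course, implemented in this way.

import java.util.*;

// Illustrative sketch of FELINE-style leaf-node evaluation. All names here are
// assumptions introduced for this example.
class FelineAgent {
    // environment model: for each acquaintance, the hypotheses it lists as skills
    private final Map<String, Set<String>> acquaintanceSkills = new HashMap<>();
    private final Map<String, Boolean> localFacts = new HashMap<>();

    void recordAcquaintance(String agent, Set<String> skills) {
        acquaintanceSkills.put(agent, skills);
    }

    // Evaluate a leaf node of the inference network.
    Boolean evaluateLeaf(String hypothesis, boolean isQuestion) {
        if (isQuestion) {
            return askUser(hypothesis);
        }
        if (localFacts.containsKey(hypothesis)) {
            return localFacts.get(hypothesis);
        }
        // Check whether some other agent lists this hypothesis as a skill.
        for (Map.Entry<String, Set<String>> e : acquaintanceSkills.entrySet()) {
            if (e.getValue().contains(hypothesis)) {
                sendRequest(e.getKey(), hypothesis);      // send a request message
                return awaitResponse(hypothesis);         // wait for the response
            }
        }
        return null;                                      // truth value unknown
    }

    private Boolean askUser(String hypothesis) { return Boolean.TRUE; }       // placeholder
    private void sendRequest(String agent, String hypothesis) { /* ... */ }
    private Boolean awaitResponse(String hypothesis) { return Boolean.TRUE; } // placeholder
}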
Handling Inconsistency
One of the major problems that arises in cooperative activity is that of inconsis-
tencies between different agents in the system. Agents may have inconsistencies
with respect to both their beliefs (the information they hold about the world),
and their goals/intentions (the things they want to achieve). As I indicated ear-
lier, inconsistencies between goals generally arise because agents are assumed to
be autonomous, and thus do not share common objectives. Inconsistencies between
the beliefs that agents have can arise from several sources. First, the viewpoint
that agents have will typically be limited - no agent will ever be able to obtain
a complete picture of their environment. Also, the sensors that agents have may
be faulty, or the information sources that the agent has access to may in turn be
faulty.
In a system of moderate size, inconsistencies are inevitable: the question is how
to deal with them. Durfee et al. (1989a) suggest a number of possible approaches
to the problem as follows.
Do not allow it to occur - or at least ignore it. This is essentially the approach
of the Contract Net: task sharing is always driven by a manager agent, who
has the only view of the problem that matters.
Resolve inconsistencies through negotiation (see Chapter 7). While this may
be desirable in theory, the communication and computational overheads
incurred suggest that it will rarely be possible in practice.
Build systems that degrade gracefully in the presence of inconsistency.
The third approach is clearly the most desirable. Lesser and Corkill (1981) refer to
systems that can behave robustly in the presence of inconsistency as functionally
accurate/cooperative (FA/C).
Coordination
Perhaps the defining problem in cooperative working is that of coordination. The
coordination problem is that of managing inter-dependencies between the activ-
ities of agents: some coordination mechanism is essential if the activities that
agents can engage in can interact in any way. How might two activities interact?
Consider the following real-world examples.
You and I both want to leave the room, and so we independently walk
towards the door, which can only fit one of us. I graciously permit you to
leave first.
In this example, our activities need to be coordinated because there is a
resource (the door) which we both wish to use, but which can only be used
by one person at a time.
Figure 9.4 Von Martial's typology of coordination relationships. (Multiagent plan relationships divide into negative relationships, namely relationships over consumable and non-consumable resources and incompatibility, and positive relationships, which are either requested (explicit) or non-requested (implicit).)
the preceding chapters). von Martial (1990, p. 112) distinguishes three types of
non-requested relationship as follows.
The action equality relationship. We both plan to perform an identical action,
and by recognizing this, one of us can perform the action alone and so save the
other effort.
The consequence relationship. The actions in my plan have the side-effect of
achieving one of your goals, thus relieving you of the need to explicitly achieve
it.
The favour relationship. Some part of my plan has the side effect of contributing
to the achievement of one of your goals, perhaps by making it easier (e.g. by
achieving a precondition of one of the actions in it).
Coordination in multiagent systems is assumed to happen at run time, that is,
the agents themselves must be capable of recognizing these relationships and,
where necessary, managing them as part of their activities (von Martial, 1992).
This contrasts with the more conventional situation in computer science, where
a designer explicitly attempts to anticipate possible interactions in advance, and
designs the system so as to avoid negative interactions and exploit potential pos-
itive interactions.
In the sections that follow, I present some of the main approaches that have
been developed for dynamically coordinating activities.
generate a plan for the entire problem. It is global because agents form non-local
plans by exchanging local plans and cooperating to achieve a non-local view of
problem solving.
Partial global planning involves three iterated stages.
(1) Each agent decides what its own goals are, and generates short-term plans
in order to achieve them.
(2) Agents exchange information to determine where plans and goals interact.
(3) Agents alter local plans in order to better coordinate their own activities.
In order to prevent incoherence during these processes, Durfee proposed the use
of a meta-level structure, which guided the cooperation process within the sys-
tem. The meta-level structure dictated which agents an agent should exchange
information with, and under what conditions it ought to do so.
The actions and interactions of a group of agents were incorporated into a data
structure known as a partial global plan. This data structure was generated
cooperatively by agents exchanging information. It contained the following prin-
cipal attributes.
Objective. The objective is the larger goal that the system is working towards.
Activity maps. An activity map is a representation of what agents are actually
doing, and what results will be generated by their activities.
Solution construction graph. A solution construction graph is a representation
of how agents ought to interact, what information ought to be exchanged, and
when, in order for the system to successfully generate a result.
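One way of rendering these three attributes as a concrete data structure is sketched below in Java; the class and field names are assumptions that simply mirror the attributes listed above, not Durfee's own representation.

import java.util.*;

// Illustrative sketch: a partial global plan as a simple record-like class.
class PartialGlobalPlan {
    String objective;                               // the larger goal the system works towards

    // activity map: which agent is doing what, and what results it will produce
    static class Activity {
        String agent;
        String task;
        List<String> expectedResults = new ArrayList<>();
    }
    List<Activity> activityMap = new ArrayList<>();

    // solution construction graph: which information should flow between which
    // agents, and when, for the system to generate a result
    static class Exchange {
        String fromAgent;
        String toAgent;
        String information;
        int round;                                  // when the exchange should take place
    }
    List<Exchange> solutionConstructionGraph = new ArrayList<>();
}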
Keith Decker extended and refined the PGP coordination mechanisms in his TÆMS
testbed (Decker, 1996); this led to what he called generalized partial global plan-
ning (GPGP - pronounced 'gee pee gee pee') (Decker and Lesser, 1995). GPGP makes
use of five techniques for coordinating activities as follows.
Updating non-local viewpoints. Agents have only local views of activity, and so
sharing information can help them achieve broader views. In his TÆMS system,
Decker uses three variations of this policy: communicate no local information,
communicate all information, or an intermediate level.
Communicate results. Agents may communicate results in three different ways.
A minimal approach is where agents only communicate results that are essential
to satisfy obligations. Another approach involves sending all results. A third is
to send results to those with an interest in them.
Handling simple redundancy. Redundancy occurs when efforts are duplicated.
This may be deliberate - an agent may get more than one agent to work on a task
because it wants to ensure the task gets done. However, in general, redundan-
cies indicate wasted resources, and are therefore to be avoided. The solution
adopted in GPGP is as follows. When redundancy is detected, in the form of
on a common point (the tree). In this case, the individuals are perform-
ing exactly the same actions as before, but because they each have the
aim of meeting at the central point as a consequence of the overall aim
of executing the dance, this is cooperative action.
How does having an individual intention towards a particular goal differ from
being part of a team, with some sort of collective intention towards the goal?
The distinction was first studied in Levesque et al. (1990), where it was observed
that being part of a team implies some sort of responsibility towards the other
members of the team. To illustrate this, suppose that you and I are together lifting
a heavy object as part of a team activity. Then clearly we both individually have
the intention to lift the object - but is there more to teamwork than this? Well,
suppose I come to believe that it is not going to be possible to lift it for some
reason. If I just have an individual goal to lift the object, then the rational thing
for me to do is simply drop the intention (and thus perhaps also the object).
But you would hardly be inclined to say I was cooperating with you if I did so.
Being part of a team implies that I show some responsibility towards you: that if
I discover the team effort is not going to work, then I should at least attempt to
make you aware of this.
Building on the work of Levesque et al. (1990), Jennings distinguished between
the commitment that underpins an intention and the associated convention
(Jennings, 1993a). A commitment is a pledge or a promise (for example, to have
lifted the object); a convention in contrast is a means of monitoring a commit-
ment - it specifies under what circumstances a commitment can be abandoned
and how an agent should behave both locally and towards others when one of
these conditions arises.
In more detail, one may commit either to a particular course of action, or, more
generally, to a state of affairs. Here, we are concerned only with commitments
that are future directed towards a state of affairs. Commitments have a number
of important properties (see Jennings (1993a) and Cohen and Levesque (1990a,
pp. 217-219) for a discussion), but the most important is that commitments per-
sist: having adopted a commitment, we do not expect an agent to drop it until,
for some reason, it becomes redundant. The conditions under which a commit-
ment can become redundant are specified in the associated convention - exam-
ples include the motivation for the goal no longer being present, the goal being
achieved, and the realization that the goal will never be achieved (Cohen and
Levesque, 1990a).
When a group of agents are engaged in a cooperative activity they must have a
joint commitment to the overall aim, as well as their individual commitments to
the specific tasks that they have been assigned. This joint commitment shares the
persistence property of the individual commitment; however, it differs in that its
state is distributed amongst the team members. An appropriate social convention
must also be in place. This social convention identifies the conditions under which
the joint commitment can be dropped, and also describes how an agent should
behave towards its fellow team members. For example, if an agent drops its joint
commitment because it believes that the goal will never be attained, then it is part
of the notion of 'cooperativeness' that is inherent in joint action that it informs
all of its fellow team members of its change of state. In this context, social con-
ventions provide general guidelines, and a common frame of reference in which
agents can work. By adopting a convention, every agent knows what is expected
both of it, and of every other agent, as part of the collective working towards the
goal, and knows that every other agent has a similar set of expectations.
We can begin to define this kind of cooperation in terms of the notion of a joint
persistent goal (JPG), as defined in Levesque et al. (1990). In a JPG, a group of agents
have a collective commitment to bringing about some goal φ; the motivation for this
goal, i.e. the reason that the group has the commitment, is represented by ψ.
Thus φ might be 'move the heavy object', while ψ might be 'Michael wants the
heavy object moved'. The mental state of the team of agents with this JPG might
be described as follows:
initially, every agent does not believe that the goal φ is satisfied, but believes
φ is possible;
every agent i then has a goal of φ until the termination condition is satisfied
(see below);
until the termination condition is satisfied, then
- if any agent i believes that the goal is achieved, then it will have a goal
  that this becomes a mutual belief, and will retain this goal until the
  termination condition is satisfied;
- if any agent i believes that the goal is impossible, then it will have a
  goal that this becomes a mutual belief, and will retain this goal until
  the termination condition is satisfied;
- if any agent i believes that the motivation ψ for the goal is no longer
  present, then it will have a goal that this becomes a mutual belief, and
  will retain this goal until the termination condition is satisfied;
the termination condition is that it is mutually believed that either
- the goal φ is satisfied;
- the goal φ is impossible to achieve;
- the motivation/justification ψ for the goal is no longer present.
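The conditions above can be read almost directly as a monitoring loop that each team member runs alongside its ordinary activity. The following Java sketch is one much-simplified reading of them; the method names and the way beliefs are tested are assumptions introduced for illustration only.

// Illustrative sketch of the convention underlying a joint persistent goal.
// Each team member repeatedly checks its beliefs about the goal (phi) and the
// motivation (psi), and tries to make any change mutually believed.
class JointCommitmentMonitor {
    enum Status { ACTIVE, GOAL_ACHIEVED, GOAL_IMPOSSIBLE, MOTIVATION_GONE }

    private Status status = Status.ACTIVE;

    void step() {
        if (status != Status.ACTIVE) {
            return;                                   // termination condition already reached
        }
        if (believesGoalAchieved()) {
            status = Status.GOAL_ACHIEVED;
        } else if (believesGoalImpossible()) {
            status = Status.GOAL_IMPOSSIBLE;
        } else if (!motivationStillPresent()) {
            status = Status.MOTIVATION_GONE;
        } else {
            pursueGoalLocally();                      // individual commitment to the goal
            return;
        }
        // The convention: whichever change was detected must become mutual belief,
        // so the agent informs the rest of the team before dropping the commitment.
        informTeam(status);
    }

    private boolean believesGoalAchieved() { return false; }     // placeholder
    private boolean believesGoalImpossible() { return false; }   // placeholder
    private boolean motivationStillPresent() { return true; }    // placeholder
    private void pursueGoalLocally() { /* domain-specific activity */ }
    private void informTeam(Status s) { /* broadcast to the other team members */ }
}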
Commitments and conventions in ARCHON
Jennings (1993a, 1995) investigated the use of commitments and conventions such
as JPGs in the coordination of an industrial control system called ARCHON (Wittig,
1992; Jennings et al., 1996a; Perriolat et al., 1996). He noted that commitments and
conventions could be encoded as rules in a rule-based system. This makes it possible to
explicitly encode coordination structures in the reasoning mechanism of an agent.
Figure 9.5 ARCHON agent architecture. (The figure shows an agent built from a cooperation module, a situation assessment module, an information store, acquaintance and self models, and an underlying control module, with an inter-agent communication manager handling communication with other agents.)
Match rules:

if task t has finished executing and
   t has produced desired outcome of joint action
then joint goal is satisfied.

if receive information i and
   i is related to triggering conditions
     for joint goal G and
   i invalidates beliefs for wanting G
then motivation for G is no longer present.

if delay task t1 and
   t1 is a component of common recipe R and
   t1 must be synchronized with t2 in R
then R is violated.

if finished executing common recipe R and
   expected results of R not produced and
   alternative recipe exists
then R is invalid.

Select rules:

if joint goal is satisfied
then abandon all associated local activities and
     inform cooperation module

if motivation for joint goal no longer present
then abandon all associated local activities and
     inform cooperation module

if common recipe R is violated and
   R can be rescheduled
then suspend local activities associated with R and
     reset timings and descriptions associated with R and
     inform cooperation module

if common recipe R1 is invalid and
   alternative recipe R2 exists
then abandon all local activities with R1 and
     inform cooperation module that R1 is invalid and
     propose R2 to cooperation module
Chapter 7). At the conclusion of this stage, the team will have agreed to the ends
to be achieved (i.e. to the principle of joint action), but not to the means (i.e. the
way in which this end will be achieved). Note that the agents are assumed to
be rational, in the sense that they will not form a team unless they implicitly
believe that the goal is achievable.
(3) Plan formation. We saw above that a group will not form a collective unless
they believe they can actually achieve the desired goal. This, in turn, implies
there is at least one action known to the group that will take them 'closer' to
the goal. However, it is possible that there are many agents that know of actions
the group can perform in order to take them closer to the goal. Moreover, some
members of the collective may have objections to one or more of these actions.
It is therefore necessary for the collective to come to some agreement about
exactly which course of action they will follow. Such an agreement is reached
via negotiation or argumentation, of exactly the kind discussed in Chapter 7.
(4) Team action. During this stage, the newly agreed plan of joint action is exe-
cuted by the agents, which maintain a close-knit relationship throughout. This
relationship is defined by a convention, which every agent follows. The JPG
described above might be one possible convention.
Note that - as the name of the approach suggests - explicit communication is not
necessary in this scenario.
MACE
Les Gasser's MACE system, developed in the mid-1980s, can, with some justifi-
cation, claim to be the first general experimental testbed for multiagent systems
(Gasser et al., 1987a,b). MACE is noteworthy for several reasons, but perhaps most
importantly because it brought together most of the components that have sub-
sequently become common in testbeds for developing multiagent systems. I men-
tion it in this section because of one critical component: the acquaintance models,
which are discussed in more detail below. Acquaintance models are representa-
tions of other agents: their abilities, interests, capabilities, and the like.
A MACE system contains five components:
a collection of application agents, which are the basic computational units
in a MACE system (see below);
a collection of predefined system agents, which provide service to users
(e.g. user interfaces);
a collection of facilities, available to all agents (e.g. a pattern matcher);
a description database, which maintains agent descriptions, and produces
executable agents from those descriptions; and
a set of kernels, one per physical machine, which handle communication and
message routing, etc.
Gasser et al. identified three aspects of agents: they contain knowledge, they
sense their environment, and they perform actions (Gasser et al., 1987b, p. 124).
Agents have two kinds of knowledge: specialized, local, domain knowledge, and
acquaintance knowledge - knowledge about other agents. An agent maintains the
following information about its acquaintances (Gasser et al., 1987b, pp. 126, 127).
Class. Agents are organized in structured groups called classes, which are iden-
tified by a class name.
Name. Each agent is assigned a name, unique to its class - an agent's address is
a (class, name) pair.
Roles. A role describes the part an agent plays in a class.
Skills. Skills are what an agent knows are the capabilities of the modelled agent.
Goals. Goals are what the agent knows the modelled agent wants to achieve.
Plans. Plans are an agent's view of the way a modelled agent will achieve its goals.
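By way of illustration, this acquaintance information might be rendered as the following Java class. MACE itself was a LISP system, so this is no more than the attribute list above written down as a data structure; the field names simply mirror those attributes.

import java.util.*;

// Illustrative sketch of a MACE-style acquaintance model.
class AcquaintanceModel {
    String className;                         // the class (structured group) the agent belongs to
    String name;                              // unique within its class
    List<String> roles = new ArrayList<>();   // parts the agent plays in its class
    List<String> skills = new ArrayList<>();  // capabilities of the modelled agent
    List<String> goals = new ArrayList<>();   // what the modelled agent wants to achieve
    List<String> plans = new ArrayList<>();   // how it is believed to achieve those goals

    // An agent's address is a (class, name) pair.
    String address() {
        return "(" + className + ", " + name + ")";
    }
}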
Agents sense their environment primarily through receiving messages. An agent's
ability to act is encoded in its engine. An engine is a LISP function, evaluated
by default once on every scheduling cycle. The only externally visible signs of
((NAME plus-ks)
 (IMPORT ENGINE FROM dbb-def)
 (ACQUAINTANCES
  (plus-ks
   ... model for plus-ks ...
  )
  (de-exp
   [ROLE (ORG-MEMBER)]
   [GOALS ( ... goal list ... )]
   [SKILLS ( ... skill list ... )]
   [PLANS ( ... plan list ... )]
  )
  (simple-plus
   ... acquaintance model for simple-plus ...
  )
 )
 (INIT-CODE ( ... LISP code ... ))
) ; end of plus-ks
an agent's activity are the messages it sends to other agents. Messages may be
directed to a single agent, a group of agents, or all agents; the interpretation of
messages is left to the programmer to define.
An example MACE agent is shown in Figure 9.7. The agent modelled in this
example is part of a simple calculator system implemented using the black-
board model. The agent being modelled here is called PLUS-KS. It is a knowledge
source which knows about how to perform the addition operation. The PLUS-KS
knowledge source is the 'parent' of two other agents: DE-EXP, an agent which
knows how to decompose simple expressions into their primitive components,
and SIMPLE-PLUS, an agent which knows how to add two numbers.
The definition frame for the PLUS-KS agent consists of a name for the agent - in
this case PLUS-KS - the engine, which defines what actions the agent may perform
(in this case the engine is imported, or inherited, from an agent called DBB-DEF),
and the acquaintances of the agent.
The acquaintances slot for PLUS-KS defines models for three agents. Firstly, the
agent models itself. This defines how the rest of the world will see PLUS-KS. Next,
the agents DE-EXP and SIMPLE-PLUS are modelled. Consider the model for the
agent DE-EXP. The role slot defines the relationship of the modelled agent to the
modeller. In this case, both DE-EXP and SIMPLE-PLUS are members of the class
defined by PLUS-KS. The GOALS slot defines what the modelling agent believes
the modelled agent wants to achieve. The SKILLS slot defines what resources the
modeller believes the modelled agent can provide. The PLANS slot defines how
the modeller believes the modelled agent will achieve its goals. The PLANS slot
consists of a list of skills, or operations, which the modelled agent will perform
in order to achieve its goals.
Gasser et al. described how MACE was used to construct blackboard systems, a
Contract Net system, and a number of other experimental systems (see Gasser et
al., 1987b, 1989, pp. 138-140).
there are a number of disadvantages with this approach. First, it is not always
the case that all the characteristics of a system are known at design time. (This
is most obviously true of open systems such as the Internet.) In such systems,
the ability of agents to organize themselves would be advantageous. Secondly, in
complex systems, the goals of agents (or groups of agents) might be constantly
changing. To keep reprogramming agents in such circumstances would be costly
and inefficient. Finally, the more complex a system becomes, the less likely it is
that system designers will be able to design effective norms or social laws: the
dynamics of the system - the possible 'trajectories' that it can take - will be too
hard to predict. Here, flexibility within the agent society might result in greater
coherence.
to develop a strategy update function such that, when it is used by every agent in
the society, it will bring the society to a global agreement as efficiently as possible.
In Shoham and Tennenholtz (1992b, 1997) and Walker and Wooldridge (1995),
a number of different strategy update functions were evaluated as follows.
Simple majority. This is the simplest form of update function. Agents will change
to an alternative strategy if so far they have observed more instances of it in
other agents than their present strategy. If more than one strategy has been
observed more often than that currently adopted, the agent will choose the strategy
observed most often.
Simple majority with agent types. As simple majority, except that agents are
divided into two types. As well as observing each other's strategies, agents in
these experiments can communicate with others whom they can 'see', and who
are of the same type. When they communicate, they exchange memories, and
each agent treats the other agent's memory as if it were his own, thus being
able to take advantage of another agent's experiences. In other words, agents
are particular about whom they confide in.
Simple majority with communication on success. This strategy update function uses a form
of communication based on a success threshold. When an individual agent has
reached a certain level of success with a particular strategy, he communicates
his memory of experiences with this successful strategy to all other agents that
he can 'see'. Note, only the memory relating to the successful strategy is broad-
cast, not the whole memory. The intuition behind this update function is that an
agent will only communicate with another agent when it has something mean-
ingful to say. This prevents 'noise' communication.
Highest cumulative reward. For this update to work, an agent must be able to
see that using a particular strategy gives a particular payoff (cf. the discussion in
Chapter 6). The highest cumulative reward update rule then says that an agent
uses the strategy that it sees has resulted in the highest cumulative payoff to
date.
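A minimal sketch of the highest cumulative reward rule, in Java, is given below; the way strategies and payoffs are represented here is an assumption, and the only point being illustrated is the update rule itself. An agent would record a payoff for every use of a strategy it observes (including its own), and apply the update at the end of each round.

import java.util.*;

// Illustrative sketch of the 'highest cumulative reward' update rule: adopt
// whichever strategy has produced the highest total observed payoff so far.
class HighestCumulativeReward {
    private final Map<String, Double> cumulativePayoff = new HashMap<>();
    private String currentStrategy;

    HighestCumulativeReward(String initialStrategy) {
        this.currentStrategy = initialStrategy;
    }

    // Record an observed payoff for a strategy (our own, or one seen in use by another agent).
    void observe(String strategy, double payoff) {
        cumulativePayoff.merge(strategy, payoff, Double::sum);
    }

    // Update rule: switch to the strategy with the highest cumulative payoff to date.
    String update() {
        currentStrategy = cumulativePayoff.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse(currentStrategy);
        return currentStrategy;
    }
}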
In addition, the impact of memory restarts on these strategies was investigated.
Intuitively, a memory restart means that an agent periodically 'forgets' everything
it has seen to date - its memory is emptied, and it starts as if from scratch again.
The intuition behind memory restarts is that it allows an agent to avoid being
over-committed to a particular strategy as a result of history: memory restarts
thus make an agent more 'open to new ideas'.
The efficiency of convergence was measured by Shoham and Tennenholtz
(1992b) primarily by the time taken to convergence: how many rounds of the tee
shirt game need to be played before all agents converge on a particular strategy.
However, it was noted in Walker and Wooldridge (1995) that changing from one
strategy to another can be expensive. Consider a strategy such as using a partic-
ular kind of computer operating system. Changing from one to another has an
associated cost, in terms of the time spent to learn it, and so we do not wish to
change too frequently. Another issue is that of stability. We do not usually want
our society to reach agreement on a particular strategy, only for it then to imme-
diately fall apart, with agents reverting to different strategies.
When evaluated in a series of experiments, all of the strategy update functions
described above led to the emergence of particular conventions within an agent
society. However, the most important results were associated with the highest
cumulative reward update function (Shoham and Tennenholtz, 1997, pp. 150,
151). It was shown that, for any value ε such that 0 < ε < 1, there exists some
bounded value n such that a collection of agents using the highest cumulative
reward update function will reach agreement on a strategy in n rounds with proba-
bility 1 - ε. Furthermore, it was shown that this strategy update function is stable
in the sense that, once reached, the agents would not diverge from the norm.
Finally, it was shown that the strategy on which agents reached agreement was
'efficient', in the sense that it guarantees agents a payoff no worse than that they
would have received had they stuck with the strategy they initially chose.
The next question is to define what is meant by a useful social law. The answer
is to define a set F ⊆ E of focal states. The intuition here is that these are the
states that are always legal, in that an agent should always be able to 'visit' the
focal states. To put it another way, whenever the environment is in some focal
state e ∈ F, it should be possible for the agent to act so as to be able to guar-
antee that any other state e' ∈ F is brought about. A useful social law is then
one that does not constrain the actions of agents so as to make this impossi-
ble.
The useful social law problem can then be understood as follows.
Given an environment Env = (E, τ, e0) and a set of focal states F ⊆ E,
find a useful social law if one exists, or else announce that none exists.
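Checking that a given candidate law is useful in this sense amounts, on the simplest reading, to checking that every focal state can still reach every other focal state using only the transitions the law permits. The Java sketch below does this by breadth-first search, treating the constrained environment as a graph in which the agent chooses transitions; the representation of states and of the law as an adjacency map is an assumption made for illustration. As the following text makes clear, however, the hard problem is finding such a law, not checking one.

import java.util.*;

// Illustrative sketch: check that a candidate social law is 'useful', i.e. that
// every focal state can reach every other focal state using only the transitions
// the law still allows. States are labelled by strings.
class SocialLawCheck {

    static boolean isUseful(Map<String, Set<String>> allowedTransitions, Set<String> focal) {
        for (String from : focal) {
            Set<String> reachable = reachableFrom(from, allowedTransitions);
            for (String to : focal) {
                if (!to.equals(from) && !reachable.contains(to)) {
                    return false;                    // some focal state is cut off by the law
                }
            }
        }
        return true;
    }

    // Breadth-first search over the transitions the law permits.
    private static Set<String> reachableFrom(String start, Map<String, Set<String>> edges) {
        Set<String> seen = new HashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        seen.add(start);
        while (!frontier.isEmpty()) {
            String s = frontier.remove();
            for (String next : edges.getOrDefault(s, Set.of())) {
                if (seen.add(next)) {
                    frontier.add(next);
                }
            }
        }
        return seen;
    }
}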
In Shoham and Tennenholtz (1992b, 1996), it is proved that this problem is NP-
complete, and so is unlikely to be soluble by 'normal' computing techniques in
reasonable time. Some variations of the problem are discussed in Shoham and
Tennenholtz (1992b, 1996), and some cases where the problem becomes tractable
are examined. However, these tractable instances do not appear to correspond to
useful real-world cases.
Plan merging
Georgeff (1983) proposed an algorithm which allows a planner to take a set of plans
generated by single agents, and from them generate a conflict free (but not neces-
sarily optimal) multiagent plan. Actions are specified by using a generalization of
the STRIPS notation (Chapter 4). In addition to the usual precondition-delete-add
lists for actions, Georgeff proposes using a during list. This list contains a set of
conditions which must hold while the action is being carried out. A plan is seen as
a set of states; an action is seen as a function which maps the set onto itself. The
precondition of an action specifies the domain of the action; the add and delete
lists specify the range.
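Such an action description might be represented as sketched below, in Java; the field names simply mirror the precondition, add, delete, and during lists just described, and are otherwise an assumption.

import java.util.*;

// Illustrative sketch of an action in Georgeff's extended STRIPS notation:
// the usual precondition/add/delete lists plus a 'during' list of conditions
// that must hold while the action is being carried out.
class ExtendedStripsAction {
    String name;
    Set<String> precondition = new HashSet<>();   // must hold before execution (domain)
    Set<String> addList = new HashSet<>();        // made true by execution (range)
    Set<String> deleteList = new HashSet<>();     // made false by execution (range)
    Set<String> duringList = new HashSet<>();     // must hold throughout execution

    // Apply the action to a state (a set of true propositions), if it is applicable.
    Set<String> apply(Set<String> state) {
        if (!state.containsAll(precondition)) {
            throw new IllegalStateException("precondition not satisfied");
        }
        Set<String> next = new HashSet<>(state);
        next.removeAll(deleteList);
        next.addAll(addList);
        return next;
    }
}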
Given a set of single agent plans specified using the modified STRIPS notation,
generating a synchronized multiagent plan consists of three stages.
(1) Interaction analysis. Interaction analysis involves generating a description of
how single agent plans interact with one another. Some of these interactions will
be harmless; others will not. Georgeff used the notions of satisfiability, commu-
tativity, and precedence to describe goal interactions. Two actions are said to
be satisfiable if there is some sequence in which they may be executed without
invalidating the preconditions of one or both. Commutativity is a restricted case
of satisfiability: if two actions may be executed in parallel, then they are said
to be commutative. It follows that if two actions are commutative, then either
they do not interact, or any interactions are harmless. Precedence describes the
sequence in which actions may be executed; if action a1 has precedence over
action a2, then the preconditions of a2 are met by the postconditions of a1.
That is not to say that a1 must be executed before a2; it is possible for two
actions to have precedence over each other.
Interaction analysis involves searching the plans of the agents to detect any
interactions between them.
(2) Safety analysis. Having determined the possible interactions between plans,
it now remains to see which of these interactions are unsafe. Georgeff defines
safeness for pairs of actions in terms of the precedence and commutativity of
the pair. Safety analysis involves two stages. First, all actions which are harmless
(i.e. where there is no interaction, or the actions commute) are removed from
the plan. This is known as simplification. Georgeff shows that the validity of
the final plan is not affected by this process, as it is only boundary regions that
need to be considered. Secondly, the set of all harmful interactions is generated.
This stage also involves searching; a rule known as the commutativity theorem
is applied to reduce the search space. All harmful interactions have then been
identified.
(2) The formulae are conjoined and fed into an LTL theorem prover. If the con-
joined formula is satisfiable, then the theorem prover will generate a set
of sequences of actions which satisfy these formulae. These sequences are
encoded in a graph structure. If the formula is not satisfiable, then the the-
orem prover will report this.
(3) The graph generated as output encodes all the possible synchronized exe-
cutions of the plans. A synchronized plan is then 'read off' from the graph
structure.
One issue that I have been forced to omit from this chapter due to space and
time limitations is the use of normative specifications in multiagent systems, and,
in particular, the use of deontic logic (Meyer and Wieringa, 1993). Deontic logic
is the logic of obligations and permissions. Originally developed within formal
philosophy, deontic logic has been taken up by researchers in computer science
in order to express the desirable properties of computer systems. Dignum (1999)
gives an overview of the use of deontic logic in multiagent systems, and also
discusses the general issue of norms and social laws.
Exercises
(1) [Level 2.]
Using the FIPA or KQML languages (see preceding chapter), describe how you would
implement the Contract Net protocol.
(2) [Level 3.]
Implement the Contract Net protocol using Java (or your programming language of
choice). You might implement agents as threads, and have tasks as (for example) factoring
numbers. Have an agent that continually generates new tasks and allocates them to an
agent, which must distribute them to others.
(3) [Level 3.]
Download an FIPA or KQML system (such as Jade or JATLite - see preceding chapter),
and use it to re-implement your Contract Net system.
Methodologies
as societies of agents, either cooperating with each other to solve complex prob-
lems, or else competing with one another. Sometimes, as in intelligent inter-
faces, the idea of an agent is seen as a natural metaphor: Maes (1994a) dis-
cusses agents as 'expert assistants', cooperating with the user to work on some
problem.
Distribution of data, control or expertise. In some environments, the distribu-
tion of either data, control, or expertise means that a centralized solution is
at best extremely difficult or at worst impossible. For example, distributed
database systems in which each database is under separate control do not
generally lend themselves to centralized solutions. Such systems may often
be conveniently modelled as multiagent systems, in which each database is a
semi-autonomous component.
Legacy systems. A problem increasingly faced by software developers is that
of legacy: software that is technologically obsolete but functionally essential
to an organization. Such software cannot generally be discarded, because of
the short-term cost of rewriting. And yet it is often required to interact with
other software components, which were never imagined by the original design-
ers. One solution to this problem is to wrap the legacy components, pro-
viding them with an 'agent layer' functionality, enabling them to communi-
cate and cooperate with other software components (Genesereth and Ketch-
pel, 1994).
(Figure: the attributes associated with a role - protocols, activities, acquaintances, and responsibilities, the latter divided into liveness properties and safety properties.)
key attribute associated with a role. An example responsibility associated with the
role of company president might be calling the shareholders meeting every year.
Responsibilities are divided into two types: liveness properties and safety prop-
erties (Pnueli, 1986). Liveness properties intuitively state that 'something good
happens'. They describe those states of affairs that an agent must bring about,
given certain environmental conditions. In contrast, safety properties are invari-
ants. Intuitively, a safety property states that 'nothing bad happens' (i.e. that an
acceptable state of affairs is maintained across all states of execution). An example
might be 'ensure the reactor temperature always remains in the range 0-100'.
In order to realize responsibilities, a role has a set of permissions. Permissions
are the 'rights' associated with a role. The permissions of a role thus identify
the resources that are available to that role in order to realize its responsibili-
ties. Permissions tend to be information resources. For example, a role might have
associated with it the ability to read a particular item of information, or to modify
another piece of information. A role can also have the ability to generate infor-
mation.
The activities of a role are computations associated with the role that may be
carried out by the agent without interacting with other agents. Activities are thus
'private' actions, in the sense of Shoham (1993).
Finally, a role is also identified with a number of protocols, which define the
way that it can interact with other roles. For example, a 'seller' role might have
the protocols 'Dutch auction' and 'English auction' associated with it; the Contract
Net protocol is associated with the roles 'manager' and 'contractor' (Smith, 1980b).
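Gathering these attributes together, a role might be represented by a structure along the following lines; this Java sketch is illustrative only, and the field names simply follow the text above.

import java.util.*;

// Illustrative sketch: the attributes associated with a role, as described above.
class Role {
    String name;

    // responsibilities, divided into liveness and safety properties
    List<String> livenessProperties = new ArrayList<>();   // 'something good happens'
    List<String> safetyProperties = new ArrayList<>();     // invariants: 'nothing bad happens'

    // permissions: the information resources the role may read, modify, or generate
    List<String> permissions = new ArrayList<>();

    // activities: 'private' computations carried out without interacting with other agents
    List<String> activities = new ArrayList<>();

    // protocols: the ways in which the role interacts with other roles
    List<String> protocols = new ArrayList<>();            // e.g. "English auction"
}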
enable the modelling of agent systems (Odell et al., 2001; Bauer et al., 2001). The
proposed modifications include:
support for expressing concurrent threads of interaction (e.g. broadcast
messages), thus enabling UML to model such well-known agent protocols
as the Contract Net (Chapter 9);
a notion of 'role' that extends that provided in UML, and, in particular, allows
the modelling of an agent playing many roles.
Both the Object Management Group (OMG, 2001), and FIPA (see Chapter 8) are cur-
rently supporting the development of UML-based notations for modelling agent
systems, and there is therefore likely to be considerable work in this area.
four-tiered hierarchy of the entities that can exist in an agent-based system. They
start with entities, which are inanimate objects - they have attributes (colour,
weight, position) but nothing else. They then define objects to be entities that
have capabilities (e.g. tables are entities that are capable of supporting things).
Agents are then defined to be objects that have goals, and are thus in some sense
active; finally, autonomous agents are defined to be agents with motivations. The
idea is that a chair could be viewed as taking on my goal of supporting me when
I am using it, and can hence be viewed as an agent for me. But we would not view
a chair as an autonomous agent, since it has no motivations (and cannot easily be
attributed them). Starting from this basic framework, Luck and d'Inverno go on
to examine the various relationships that might exist between agents of different
types. In Luck et al. (1997), they examine how an agent-based system specified
in their framework might be implemented. They found that there was a natural
relationship between their hierarchical agent specification framework and object-
oriented systems.
The formal definitions of agents and autonomous agents rely on inher-
iting the properties of lower-level components. In the Z notation, this
is achieved through schema inclusion.... This is easily modelled in C++
by deriving one class from another. . . . Thus we move from a principled
but abstract theoretical framework through a more detailed, yet still
formal, model of the system, down to an object-oriented implementa-
tion, preserving the hierarchical structure at each stage.
(Luck et al., 1997)
Discussion
The predominant approach to developing methodologies for multiagent systems
is to adapt those developed for object-oriented analysis and design (Booch, 1994).
There are several disadvantages with such approaches. First, the kind of decom-
position that object-oriented methods encourage is at odds with the kind of
learned from the concurrent and distributed systems community - the problems
inherent in multi-threaded systems do not go away, just because you adopt an
agent-based approach.
Your design does not exploit concurrency. One of the most obvious features of
a poor multiagent design is that the amount of concurrent problem solving is
comparatively small or even in extreme cases non-existent. If there is only ever
a need for a single thread of control in a system, then the appropriateness of
an agent-based solution must seriously be questioned.
You decide you want your own agent architecture. Agent architectures are
essentially templates for building agents. When first attempting an agent
project, there is a great temptation to imagine that no existing agent archi-
tecture meets the requirements of your problem, and that it is therefore nec-
essary to design one from first principles. But designing an agent architecture
from scratch in this way is often a mistake: my recommendation is therefore
to study the various architectures described in the literature, and either license
one or else implement an 'off-the-shelf' design.
Your agents use too much AI. When one builds an agent application, there is an
understandable temptation to focus exclusively on the agent-specific, 'intelli-
gence' aspects of the application. The result is often an agent framework that
is too overburdened with experimental techniques (natural language interfaces,
planners, theorem provers, reason maintenance systems, etc.) to be usable.
You see agents everywhere. When one learns about multiagent systems for the
first time, there is a tendency to view everything as an agent. This is perceived
to be in some way conceptually pure. But if one adopts this viewpoint, then
one ends up with agents for everything, including agents for addition and sub-
traction. It is not difficult to see that naively viewing everything as an agent in
this way will be extremely inefficient: the overheads of managing agents and
inter-agent communication will rapidly outweigh the benefits of an agent-based
solution. Moreover, we do not believe it is useful to refer to very fine-grained
computational entities as agents.
You have too few agents. While some designers imagine a separate agent for
every possible task, others appear not to recognize the value of a multiagent
approach at all. They create a system that completely fails to exploit the power
offered by the agent paradigm, and develop a solution with a very small number
of agents doing all the work. Such solutions tend to fail the standard software
engineering test of cohesion, which requires that a software module should
have a single, coherent function. The result is rather as if one were to write an
object-oriented program by bundling all the functionality into a single class. It
can be done, but the result is not pretty.
You spend all your time implementing infrastructure. One of the greatest obs-
tacles to the wider use of agent technology is that there are no widely used
Mobile Agents
So far in this book I have avoided mention of an entire species of agent, which
has aroused much interest, particularly in the programming-language and object-
oriented-development community. Mobile agents are agents that are capable of
transmitting themselves - their program and their state - across a computer net-
work, and recommencing execution at a remote site. Mobile agents became known
largely through the pioneering work of General Magic, Inc., on their Telescript pro-
gramming language, although there are now mobile agent platforms available for
many languages and platforms (see Appendix A for some notes on the history of
mobile agents).
The original motivation behind mobile agents is simple enough. The idea was
that mobile agents would replace remote procedure calls as a way for processes
to communicate over a network - see Figure 10.1. With remote procedure calls,
the idea is that one process can invoke a procedure (method) on another process
which is remotely located. Suppose one process A invokes a method m on pro-
cess B with arguments args; the value returned by process B is to be assigned to
a variable v. Using a Java-like notation, A executes an instruction somewhat like
the following:
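v = B.m(args);   // invoke method m on the remote process B, assigning the result to v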
Figure 10.1 Remote procedure calls (a) versus mobile agents (b). (The figure shows a client computer and a server computer connected by a network: in (a) the client process communicates with the server process by remote procedure calls across the network; in (b) a mobile agent travels across the network to the server computer.)
Many different answers have been developed to address these issues. With respect
to the first issue - that of how to serialize and transmit an agent - there are several
possibilities.
Both the agent and its state are transmitted, and the state includes the pro-
gram counter, i.e. the agent 'remembers' where it was before it was transmit-
ted across the network, and when it reaches its destination, it recommences
execution at the program instruction following that which caused it to be
transmitted. This is the kind of mobility employed in the Telescript language
(White, 1994, 1997).
The agent contains both a program and the values of variables, but not the
'program counter', so the agent can remember the values of all variables,
but not where it was when it transmitted itself across the network. This is
how Danny Lange's Java-based Aglets framework works (Lange and Oshima,
1999).
Telescript
Telescript was a language-based environment for constructing multiagent sys-
tems developed in the early 1990s by General Magic, Inc. It was a commercial
product, developed with the then very new palm-top computing market in mind
(White, 1994, 1997).
There are two key concepts in Telescript technology: places and agents. Places
are virtual locations that are occupied by agents - a place may correspond to a sin-
gle machine, or a family of machines. Agents are the providers and consumers of
goods in the electronic marketplace applications that Telescript was developed to
support. Agents in Telescript are interpreted programs; the idea is rather similar
to the way that Java bytecodes are interpreted by the Java virtual machine.
Telescript agents are able to move from one place to another, in which case their
program and state are encoded and transmitted across a network to another place,
where execution recommences. In order to travel across the network, an agent
uses a ticket, which specifies the parameters of its journey:
the agent's destination;
the time at which the journey will be completed.
Telescript agents communicate with one another in several different ways:
if they occupy different places, then they can connect across a network;
if they occupy the same location, then they can meet one another.
Telescript agents have an associated permit, which specifies what the agent can
do (e.g. limitations on travel), and what resources the agent can use. The most
important resources are
The core of an Aglet - the bit that does the work - is the run() method. This
defines the behaviour of the Aglet. Inside a run() method, an Aglet can execute
the dispatch() method, in order to transmit itself to a remote destination.
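A minimal example in the spirit of the Aglets framework is sketched below. The run() and dispatch() methods are as described in the text; the package name, the exception handling, and the destination URL are assumptions made for illustration. Because an Aglet's program counter is not transmitted, run() is re-entered from the beginning on arrival, so the agent must record in its own variables (here, the moved flag) whether it has already travelled.

import com.ibm.aglet.Aglet;
import java.net.URL;

// Illustrative sketch only: a trivial mobile agent that hops to a remote context
// and prints a message on arrival. Details beyond run() and dispatch() are assumptions.
public class HelloAglet extends Aglet {
    private boolean moved = false;   // part of the agent's state, so it survives the move

    public void run() {
        try {
            if (!moved) {
                moved = true;                                          // remember that we have hopped
                dispatch(new URL("atp://remote.host:4434/context"));   // transmit program and state
            } else {
                System.out.println("Hello from the remote host");      // executed after arrival
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}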
Class reading: Kinny and Georgeff (1997). This article describes arguably the
first agent-specific methodology. For a class familiar with OO methodologies, it
may be worth discussing the similarities, differences, and what changes might be
required to make this methodology really usable in practice.
Exercises
(1) [Class discussion.]
For classes with some familiarity with object-oriented development: decomposition is
perhaps the critical issue in an analysis and design methodology. Discuss the differences
in the decomposition achieved with OO techniques to those of an agent system. What is
the right 'grain size' for an agent? When we do analysis and design for agent systems,
what are the key attributes that we need to characterize an agent in terms of?
(2) [Class discussion.]
With respect to mobile agent systems, discuss those circumstances where a mobile
agent solution is essential - where you cannot imagine how it could be done without
mobility.
(3) [Level 2/3.]
Use GAIA or the AAII methodology to do an analysis and design of a system with which
you are familiar (if you are stuck for one, read about the ADEPT system described in
Chapter 11). Compare it with an OO analysis and design approach.
(4) [Level 4.]
Extend the UML notation to incorporate agent facilities (such as communication in an
agent communication language). How might you capture the fact that agents are self-
interested?
Applications
Human factors
The most obvious difficulty from the point of view of human users of the World-
Wide Web is the 'information overload' problem (Maes, 1994a). People get over-
whelmed by the sheer amount of information available, making it hard for them
to filter out the junk and irrelevancies and focus on what is important, and also
to actively search for the right information. Search engines such as Google and
Yahoo attempt to alleviate this problem by indexing largely unstructured and
unmanaged information on the Web. While these tools are useful, they tend to
lack functionality: most search engines provide only simple search features, not
tailored to a user's particular demands. In addition, current search engine func-
tionality is directed at textual (typically HTML) content - despite the fact that
one of the main selling features of the Web is its support for heterogeneous,
multi-media content. Finally, it is not at all certain that the brute-force indexing
techniques used by current search engines will scale to the size of the Internet in
the next century. So finding and managing information on the Internet is, despite
tools such as Google, still a problem.
In addition, people easily get bored or confused while browsing the Web. The
hypertext nature of the Web, while making it easy to link related documents
together, can also be disorienting - the 'back' and 'forward' buttons provided
by most browsers are better suited to linear structures than the highly connected
graph-like structures that underpin the Web. This can make it hard to understand
the topology of a collection of linked Web pages; indeed, such structures are inher-
ently difficult for humans to visualize and comprehend. In short, it is all too easy
to become lost in cyberspace. When searching for a particular item of information,
it is also easy for people to either miss or misunderstand things.
Finally, the Web was not really designed to be used in a methodical way. Most
Web pages attempt to be attractive and highly animated, in the hope that people
will find them interesting. But there is some tension between the goal of mak-
ing a Web page animated and diverting and the goal of conveying information.
Of course, it is possible for a well-designed Web page to effectively convey infor-
mation, but, sadly, most Web pages emphasize appearance, rather than content.
It is telling that the process of using the Web is known as 'browsing' rather than
'reading'. Browsing is a useful activity in many circumstances, but is not generally
appropriate when attempting to answer a complex, important query.
Organizational factors
In addition, there are many organizational factors that make the Web difficult to
use. Perhaps most importantly, apart from the (very broad) HTML standard, there
are no standards for how a Web page should look.
Another problem is the cost of providing online content. Unless significant
information owners can see that they are making money from the provision of
their content, they will simply cease to provide it. How this money is to be made
is probably the dominant issue in the development of the Web today. I stress
that these are not criticisms of the Web - its designers could hardly have antici-
pated the uses to which it would be put, nor that they were developing one of the
most important computer systems to date. But these are all obstacles that need
to be overcome if the potential of the Internet/Web is to be realized. The obvious
question is then: what more do we need?
In order to realize the potential of the Internet, and overcome the limitations
discussed above, it has been argued that we need tools that (Durfee et al., 1997)
give a single coherent view of distributed, heterogeneous information
resources;
give rich, personalized, user-oriented services, in order to overcome the
'information overload' problem - they must enable users to find informa-
tion they really want to find, and shield them from information they do not
want;
are scalable, distributed, and modular, to support the expected growth of
the Internet and Web;
are adaptive and self-optimizing, to ensure that services are flexible and
efficient.
MAXIMS works by 'looking over the shoulder' of a user, and learning about how
they deal with email. Each time a new event occurs (e.g. email arrives), MAXIMS
records the event in the form of
situation - action
pairs. A situation is characterized by the following attributes of an event:
sender of email;
recipients;
subject line;
keywords in message body and so on.
When a new situation occurs, MAXIMS matches it against previously recorded
rules. Using these rules, it then tries to predict what the user will do, and generates
a confidence level: a real number indicating how confident the agent is in its deci-
sion. The confidence level is matched against two preset real number thresholds:
a 'tell me' threshold and a 'do it' threshold. If the confidence of the agent in its
decision is less than the 'tell me' threshold, then the agent gets feedback from the
user on what to do. If the confidence of the agent in its decision is between the 'tell
me' and 'do it' thresholds, then the agent makes a suggestion to the user about
what to do. Finally, if the agent's confidence is greater than the 'do it' threshold,
then the agent takes the initiative, and acts.
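The threshold logic just described can be summarized in a few lines of Java; the class, method, and enumeration names are assumptions, and the learning component that actually produces the prediction and its confidence level is not shown.

// Illustrative sketch of MAXIMS-style 'tell me'/'do it' threshold logic.
class ConfidenceThresholds {
    private final double tellMeThreshold;   // below this: ask the user what to do
    private final double doItThreshold;     // above this: act autonomously

    ConfidenceThresholds(double tellMe, double doIt) {
        this.tellMeThreshold = tellMe;
        this.doItThreshold = doIt;
    }

    enum Decision { ASK_USER_FOR_FEEDBACK, SUGGEST_ACTION, ACT_AUTONOMOUSLY }

    Decision decide(double confidence) {
        if (confidence < tellMeThreshold) {
            return Decision.ASK_USER_FOR_FEEDBACK;      // get feedback from the user
        } else if (confidence < doItThreshold) {
            return Decision.SUGGEST_ACTION;             // make a suggestion to the user
        } else {
            return Decision.ACT_AUTONOMOUSLY;           // take the initiative and act
        }
    }
}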
Rules can also be hard coded by users (e.g. 'always delete mails from person
X'). MAXIMS has a simple 'personality' (an animated face on the user's GUI), which
communicates its 'mental state' to the user: thus the icon smiles when it has made
a correct guess, frowns when it has made a mistake, and so on.
The NewT system is a Usenet news filter (Maes, 1994a, pp. 38, 39). A NewT agent
is trained by giving it a series of examples, illustrating articles that the user would
and would not choose to read. The agent then begins to make suggestions to the
user, and is given feedback on its suggestions. NewT agents are not intended to
remove human choice, but to represent an extension of the human's wishes: the
aim is for the agent to be able to bring to the attention of the user articles of
the type that the user has shown a consistent interest in. Similar ideas have been
proposed by McGregor, who imagines prescient agents - intelligent administrative
assistants, that predict our actions, and carry out routine or repetitive adminis-
trative procedures on our behalf (McGregor, 1992).
Web agents
Etzioni and Weld (1995) identify the following specific types of Web-based agent
they believe are likely to emerge in the near future.
Tour guides. The idea here is to have agents that help to answer the question
'where do I go next' when browsing the Web. Such agents can learn about the
user's preferences in the same way that MAXIMS does, and, rather than just
providing a single, uniform type of hyperlink, they actually indicate the likely
interest of a link.
Indexing agents. Indexing agents will provide an extra layer of abstraction on top
of the services provided by search/indexing agents such as Google and InfoS-
eek. The idea is to use the raw information provided by such engines, together
(Figure: Web information services arranged by degree of personalization and added value, from search engines and crawlers (e.g. AltaVista) and indices and directories (e.g. Yahoo) at the low end, to personalized services such as Ahoy at the high end.)
done in one of two ways. The simplest is to have humans search for pages and
classify them manually. This has the advantage that the classifications obtained
in this way are likely to be meaningful and useful. But it has the very obvious
disadvantage that it is not necessarily thorough, and is costly in terms of human
resources. The second approach is to use simple software agents, often called
spiders, to systematically search the Web, following all links, and automatically
classifying content. The classification of content is typically done by removing
'noise' words from the page ('the', 'and', etc.), and then attempting to find those
words that have the most meaning.
All current search engines, however, suffer from the disadvantage that their
coverage is partial. Etzioni (1996) suggested that one way around this is to use
a meta search engine. This search engine works not by directly maintaining a
database of pages, but by querying a number of search engines in parallel. The
results from these search engines can then be collated and presented to the user.
The meta search engine thus 'feeds' off the other search engines. By allowing the
engine to run on the user's machine, it becomes possible to personalize services -
to tailor them to the needs of individual users.
(Figure: an information agent connected to a number of information repositories.)
Brokered systems are able to cope more quickly with a rapidly fluctuating agent
population. Middle agents allow a system to operate robustly in the face of inter-
mittent communications and agent appearance and disappearance.
The overall behaviour of a system such as that in Figure 11.2 is that a user issues
a query to an agent on their local machine. This agent may then contact informa-
tion agents directly, or it may go to a broker, which is skilled at the appropriate
type of request. The broker may then contact a number of information agents,
asking first whether they have the correct skills, and then issuing specific queries.
This kind of approach has been successfully used in digital library applications
(Wellman et al., 1996).
[Table 11.1: the stages of the purchasing model - need identification, product brokering, merchant brokering, negotiation, purchase and delivery, service and evaluation - marked against the agent systems that can assist with each stage.]
Agents have been widely promoted as being able to automate (or at least partially
automate) some of these stages, and hence assist the consumer to reach the best
deal possible (Noriega and Sierra, 1999). Table 11.1 (from Guttman et al., 1998)
summarizes the extent to which currently developed agents can help in each stage.
Navigation regularity. Web sites are designed by vendors so that products are easy to find.
Corporate regularity. Web sites are usually designed so that pages have a similar 'look and feel'.
Vertical separation. Merchants use white space to separate products.
Internally, Jango has two key components:
a component to learn vendor descriptions (i.e. learn about the structure of
vendor Web pages); and
a comparison shopping component, capable of comparing products across
different vendor sites.
In 'second-generation' agent mediated electronic commerce systems, it is pro-
posed that agents will be able to assist with the fourth stage of the purchasing
model set out above: negotiation. The idea is that a would-be consumer delegates
the authority to negotiate terms to a software agent. This agent then negotiates
with another agent (which may be a software agent or a person) in order to reach
an agreement.
There are many obvious hurdles to overcome with respect to this model. The most important of these is trust. Consumers will not delegate the authority to
negotiate transactions to a software agent unless they trust the agent. In partic-
ular, they will need to trust that the agent (i) really understands what they want,
and (ii) that the agent is not going to be exploited ('ripped off') by another agent,
and end up with a poor agreement.
Comparison shopping agents are particularly interesting because it would seem
that, if the user is able to search the entire marketplace for goods at the best price,
then the overall effect is to force vendors to push prices as low as possible. Their
profit margins are inevitably squeezed, because otherwise potential purchasers
would go elsewhere to find their goods.
Auction bots
A highly active related area of work is auction bots: agents that can run, and
participate in, online auctions for goods. Auction bots make use of the kinds of
auction techniques discussed in Chapter 7. A well-known example is the Kasbah
system (Chavez and Maes, 1996). The aim of Kasbah was to develop a Web-based
system in which users could create agents to buy and sell goods on their behalf.
In Kasbah, a user can set three parameters for selling agents:
- desired date to sell the good by;
- desired price to sell at; and
- minimum price to sell at.
Selling agents in Kasbah start by offering the good at the desired price, and as the
deadline approaches, this price is systematically reduced to the minimum price
fixed by the seller. The user can specify the 'decay' function used to determine the
current offer price. Initially, three choices of decay function were offered: linear,
quadratic, and cubic decay. The user was always asked to confirm sales, giving
them the ultimate right of veto over the behaviour of the agent.
As with selling agents, various parameters could be fixed for buying agents: the
date to buy the item by, the desired price, and the maximum price. Again, the user
could specify the 'growth' function of price over time.
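The sketch below shows one plausible form such decay functions might take, falling from the desired price to the minimum price as the deadline approaches; the function name and the exact curve shapes are assumptions for illustration, not Kasbah's actual implementation.

```python
# Illustrative price 'decay' for a Kasbah-style selling agent: the offer starts
# at the desired price and falls to the minimum price by the deadline.

def offer_price(desired, minimum, elapsed, total, decay="linear"):
    """Price offered after `elapsed` of `total` time units have passed."""
    fraction = min(max(elapsed / total, 0.0), 1.0)        # how far to the deadline
    exponent = {"linear": 1, "quadratic": 2, "cubic": 3}[decay]
    return desired - (desired - minimum) * fraction ** exponent

# A seller asking 100 with a floor of 60, halfway to the deadline:
for shape in ("linear", "quadratic", "cubic"):
    print(shape, offer_price(100, 60, elapsed=5, total=10, decay=shape))
```

A buying agent's 'growth' function can be sketched the same way, rising from a starting offer towards the maximum price.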
Agents in Kasbah operate in a marketplace. The marketplace manages a num-
ber of ongoing auctions. When a buyer or seller enters the marketplace, Kasbah
matches up requests for goods against goods on sale, and puts buyers and sellers
in touch with one another.
The Spanish Fishmarket is another example of an online auction system
(Rodriguez et al., 1997). Based on a real fishmarket that takes place in the town
of Blanes in northern Spain, the FM system provides similar facilities to Kasbah,
but is specifically modelled on the auction protocol used in Blanes.
The agent answers the phone, recognizes the callers, disturbs you when
appropriate, and may even tell a white lie on your behalf. The same
agent is well trained in timing, versed in finding opportune moments,
and respectful of idiosyncrasies. . . . If you have somebody who knows
you well and shares much of your information, that person can act on
your behalf very effectively. If your secretary falls ill, it would make no
difference if the temping agency could send you Albert Einstein. This
issue is not about IQ. It is shared knowledge and the practice of using
it in your best interests. . . . Like an army commander sending a scout
ahead . . . you will dispatch agents to collect information on your behalf.
Agents will dispatch agents. The process multiplies. But [this process]
started at the interface where you delegated your desires.
In order to construct such simulated worlds, one must first develop believable
agents: agents that 'provide the illusion of life, thus permitting the audience's
suspension of disbelief' (Bates, 1994, p. 122). A key component of such agents
is emotion: agents should not be represented in a computer game or animated
film as the flat, featureless characters that appear in current computer games.
They need to show emotions; to act and react in a way that resonates in tune with
our empathy and understanding of human behaviour. The OZ group investigated
various architectures for emotion (Bates et al., 1992a), and have developed at least
one prototype implementation of their ideas (Bates, 1994).
Conte and Gilbert (1995, p. 4) suggest that multiagent simulation of social pro-
cesses can have the following benefits:
computer simulation allows the observation of properties of a model that
may in principle be analytically derivable but have not yet been established;
possible alternatives to a phenomenon observed in nature may be found;
properties that are difficult/awkward to observe in nature may be studied
at leisure in isolation, recorded, and then 'replayed' if necessary;
'sociality' can be modelled explicitly - agents can be built that have rep-
resentations of other agents, and the properties and implications of these
representations can be investigated.
Moss and Davidsson (2001, p. 1) succinctly state a case for multiagent simulation:
[For many systems,] behaviour cannot be predicted by statistical
or qualitative analysis. . . . Analysing and designing.. .such systems
requires a different approach to software engineering and mechanism
design.
Moss goes on to give a general critique of approaches that focus on formal analy-
sis at the expense of accepting and attempting to deal with the 'messiness' that
is inherent in most multiagent systems of any complexity. It is probably fair to
say that his critique might be applied to many of the techniques described in
Chapter 7, particularly those that depend upon a 'pure' logical or game-theoretic
foundation. There is undoubtedly some strength to these arguments, which echo
cautionary comments made by some of the most vocal proponents of game theory
(Binmore, 1992, p. 196). In the remainder of this section, I will review one major
project in the area of social simulation, and point to some others.
[Figure: the architecture of an EOS agent - an input/communication buffer, the perceived environment, a cognitive component, and communication/action outputs - situated in its environment.]
Resources come in different types, and only agents of certain types are able to obtain certain resources. Agents have a number of 'energy stores', and for each of these a 'hunger level'. If the energy store associated with a particular hunger level falls below the value of the hunger level, then the agent will attempt to replenish it by consuming appropriate resources. Agents travel about the EOS world in order to obtain resources, which are scattered about the world. Recall that the Mellars model suggested that the availability of resources at predictable locations and times was a key factor in the growth of social complexity in the Palaeolithic period. To reflect this, resources (intuitively corresponding to things like a reindeer herd or a fruit tree) were clustered, and the rules governing the emergence and disappearance of resources reflect this.
The basic form of social structure that emerges in EOS does so because certain
resources have associated with them a skill profile. This profile defines, for every
type of skill or capability that agents may possess, how many agents with this
skill are required to obtain the resource. For example, a 'fish' resource might
require two 'boat' capabilities; and a 'deer' resource might require a single 'spear'
capability.
In each experiment, a user may specify a number of parameters:
- the number of resource locations of each type and their distribution;
- the number of resource instances that each resource location comprises;
- the type of energy that each resource location can supply;
- the quantity of energy an instance of a particular resource can supply;
- the skill profiles for each resource; and
- the 'renewal' period, which elapses between a resource being consumed and being replaced.
To form collaborations in order to obtain resources, agents use a variation of
Smith's Contract Net protocol (see Chapter 9): thus, when an agent finds a
resource, it can advertise this fact by sending out a broadcast announcement.
Agents can then bid to collaborate on obtaining a resource, and the successful
bidders then work together to obtain the resource.
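The following sketch conveys the flavour of this announce-bid-award exchange; the classes and the skill-matching rule are illustrative assumptions rather than the EOS implementation.

```python
# Toy Contract-Net-style team formation of the kind used in EOS: an agent that
# finds a resource announces it, others bid with their skills, and the team is
# formed once the resource's skill profile is covered. Illustrative only.
from collections import Counter

class Agent:
    def __init__(self, name, skills):
        self.name = name
        self.skills = Counter(skills)          # e.g. Counter({'boat': 1})

def form_team(announcer, profile, bidders):
    """profile maps each skill to the number of agents with it that are required."""
    needed = Counter(profile)
    pledged = announcer.skills & needed        # the announcer contributes first
    team = [announcer]
    if pledged == needed:
        return team
    for agent in bidders:
        offer = agent.skills & (needed - pledged)   # only the skills still missing
        if offer:
            team.append(agent)
            pledged += offer
        if pledged == needed:
            return team                        # skill profile covered
    return None                                # resource too complex to exploit

# A 'fish' resource requiring two 'boat' capabilities:
team = form_team(Agent("a1", ["boat"]), {"boat": 2},
                 [Agent("a2", ["spear"]), Agent("a3", ["boat"])])
print([member.name for member in team])        # ['a1', 'a3']
```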
A number of social phenomena were observed in running the EOS testbed, for
example: 'overcrowding', when too many agents attempt to obtain resources in
some locale; 'clobbering', when agents accidentally interfere with each other's
goals; and semi-permanent groups arising. With respect to the emergence of deep
hierarchies of agents, it was determined that the growth of hierarchies depended
to a great extent on the perceptual capabilities of the group. If the group is not
equipped with adequate perceptual ability, then there is insufficient information
to cause a group to form. A second key aspect in the emergence of social structure
is the complexity of resources - how many skills it requires in order to obtain and
exploit a resource. If resources are too complex, then groups will not be able to
form to exploit them before they expire.
An interesting aspect of the EOS project was that it highlighted the cognitive
aspects of multiagent social simulation. That is, by using EOS, it was possible
to see how the beliefs and aspirations of individuals in a society can influence
the possible trajectories of this society. One of the arguments in favour of this
style of multiagent societal simulation is that this kind of property is very hard
to model or understand using analytical techniques such as game or economic
theory (cf. the quote from Moss, above).
Agents for X
Agents have been proposed for many more application areas than I have the space
to discuss here. In this section, I will give a flavour of some of these.
Agents for industrial systems management. Perhaps the largest and proba-
bly best-known European multiagent system development project to date was
ARCHON (Wittig, 1992; Jennings and Wittig, 1992; Jennings et al., 1995). This
project developed and deployed multiagent technology in several industrial
domains. The most significant of these domains was a power distribution sys-
tem, which was installed and is currently operational in northern Spain. Agents in
ARCHON have two main parts: a domain component, which realizes the domain-
specific functionality of the agent; and a wrapper component, which provides
the agent functionality, enabling the system to plan its actions, and to repre-
sent and communicate with other agents. The ARCHON technology has sub-
sequently been deployed in several other domains, including particle accelera-
tor control. (ARCHON was the platform through which Jennings's joint inten-
tion model of cooperation (Jennings, 1995), discussed in Chapter 9, was devel-
oped.)
Agents for Air-Traffic Control. Air-traffic control systems are among the old-
est application areas in multiagent systems (Steeb et al., 1988; Findler and Lo,
1986). A recent example is OASIS (Optimal Aircraft Sequencing using Intelligent
Scheduling), a system that is currently undergoing field trials at Sydney airport
in Australia (Ljunberg and Lucas, 1992). The specific aim of OASIS is to assist
an air-traffic controller in managing the flow of aircraft at an airport: it offers
estimates of aircraft arrival times, monitors aircraft progress against previously
derived estimates, informs the air-traffic controller of any errors, and perhaps
most importantly finds the optimal sequence in which to land aircraft. OASIS
contains two types of agents: global agents, which perform generic domain func-
tions (for example, there is a 'sequencer agent', which is responsible for arranging
aircraft into a least-cost sequence); and aircraft agents, one for each aircraft in
the system airspace. The OASIS system was implemented using the PRS agent
architecture.
Hayzelden and Bigham (1999) is a collection of articles loosely based around the
theme of agents for computer network applications; Klusch (1999) is a similar
collection centred around the topic of information agents.
Van Dyke Parunak (1987) describes the use of the Contract Net protocol (Chapter 9) for manufacturing control in the YAMS (Yet Another Manufacturing
System). Mori et al. have used a multiagent approach to controlling a steel coil
processing plant (Mori et al., 1988), and Wooldridge et al. have described how
the process of determining an optimal production sequence for some factory can
naturally be viewed as a problem of negotiation between the various production
cells within the factory (Wooldridge et al., 1996).
A number of studies have been made of information agents, including a the-
oretical study of how agents are able to incorporate information from different
sources (Levy et al., 1994; Gruber, 1991), as well as a prototype system called IRA
(information retrieval agent) that is able to search for loosely specified articles
from a range of document repositories (Voorhees, 1994). Another important sys-
tem in this area was Carnot (Huhns et al., 1992), which allows preexisting and
heterogeneous database systems to work together to answer queries that are out-
side the scope of any of the individual databases.
There is much related work being done by the computer supported coopera-
tive work (CSCW) community. CSCW is informally defined by Baecker to be 'com-
puter assisted coordinated activity such as problem solving and communication
carried out by a group of collaborating individuals' (Baecker, 1993, p. 1). The pri-
mary emphasis of CSCW is on the development of (hardware and) software tools
to support collaborative human work - the term groupware has been coined to
describe such tools. Various authors have proposed the use of agent technology
in groupware. For example, in his participant systems proposal, Chang suggests
systems in which humans collaborate with not only other humans, but also with
artificial agents (Chang, 1987). We refer the interested reader to the collection of
papers edited by Baecker (1993) and the article by Greif (1994) for more details
on CSCW.
Noriega and Sierra (1999) is a collection of papers on agent-mediated electronic
commerce. Kephart and Greenwald (1999) investigates the dynamics of systems
in which buyers and sellers are agents.
Gilbert and Doran (1994), Gilbert and Conte (1995) and Moss and Davidsson
(2001) are collections of papers on the subject of simulating societies by means of
multiagent systems. Davidsson (2001) discusses the relationship between multi-
agent simulation and other types of simulation (e.g. object-oriented simulation
and discrete event models).
Class reading: Parunak (1999). This paper gives an overview of the use of agents
in industry from one of the pioneers of agent applications.
Exercises
(1) [Level 1/Class Discussion.]
Many of the systems discussed in this chapter (e.g. MAXIMS, NewT, Jango) do not perhaps match up too well to the notion of an agent as I discussed it in Chapter 2 (i.e. reactive,
proactive, social). Does this matter? Do they still deserve to be called agents?
(2) [Level 4.]
Take an agent programming environment off the shelf (e.g. Jam (Huber, 1999), Jack
(Busetta et al., 2000), Jade (Poggi and Rimassa, 2001) or JATLite (Jeon et al., 2000)) and,
using one of the methodologies described in the preceding chapter, use it to implement a
major multiagent system. Document your experiences, and contrast them with the expe-
riences you would expect with conventional approaches to system development. Weigh
up the pros and cons, and use them to feed back into the multiagent research and devel-
opment literature.
Logics for Multiagent Systems
Computer science is, as much as it is about anything, about developing formal
theories to specify and reason about computer systems. Many formalisms have
been developed in mainstream computer science to do this, and it comes as no
surprise to discover that the agents community has also developed many such
formalisms. In this chapter, I give an overview of some of the logics that have been
developed for reasoning about multiagent systems. The predominant approach
has been to use what are called modal logics to do this. The idea is to develop
logics that can be used to characterize the mental states of agents as they act and
interact. (See Chapter 2 for a discussion on the use of mental states for reasoning
about agents.)
Following an introduction to the need for modal logics for reasoning about
agents, I introduce the paradigm of normal modal logics with Kripke semantics,
as this approach is almost universally used. I then go on to discuss how these
logics can be used to reason about the knowledge that agents possess, and then
integrated theories of agency. I conclude by speculating on the way that these
formalisms might be used in the development of agent systems.
Please note: this chapter presupposes some understanding of the use
of logic and formal methods for specification and verification. It is
probably best avoided by those without such a background.
since the time of Frege. He suggested a distinction between sense and reference. In
ordinary formulae, the 'reference' of a term/formula (i.e. its denotation) is needed,
whereas in opaque contexts, the 'sense' of a formula is needed (see also Seel, 1989,
p. 3).
Clearly, classical logics are not suitable in their standard form for reasoning
about intentional notions: alternative formalisms are required. A vast enterprise
has sprung up devoted to developing such formalisms.
The field of formal methods for reasoning about intentional notions is widely
reckoned to have begun with the publication, in 1962, of Jaakko Hintikka's
book Knowledge and Belief: An Introduction to the Logic of the Two Notions
(Hintikka, 1962). At that time, the subject was considered fairly esoteric, of inter-
est to comparatively few researchers in logic and the philosophy of mind. Since
then, however, it has become an important research area in its own right, with con-
tributions from researchers in AI, formal philosophy, linguistics and economics.
There is now an enormous literature on the subject, and with a major biannual
international conference devoted solely to theoretical aspects of reasoning about
knowledge, as well as the input from numerous other, less specialized confer-
ences, this literature is growing ever larger.
Despite the diversity of interests and applications, the number of basic tech-
niques in use is quite small. Recall, from the discussion above, that there are
two problems to be addressed in developing a logical formalism for intentional
notions: a syntactic one, and a semantic one. It follows that any formalism can be
characterized in terms of two independent attributes: its language of formulation,
and semantic model (Konolige, 1986, p. 83).
There are two fundamental approaches to the syntactic problem. The first is
to use a modal language, which contains non-truth-functional modal operators,
which are applied to formulae. An alternative approach involves the use of a
meta-language: a many-sorted first-order language containing terms which denote
formulae of some other object-language. Intentional notions can be represented
using a meta-language predicate, and given whatever axiomatization is deemed
appropriate. Both of these approaches have their advantages and disadvantages,
and will be discussed at length in the sequel.
As with the syntactic problem, there are two basic approaches to the seman-
tic problem. The first, best-known, and probably most widely used approach is to
adopt a possible-worlds semantics, where an agent's beliefs, knowledge, goals, etc.,
are characterized as a set of so-called possible worlds, with an accessibility relation
holding between them. Possible-worlds semantics have an associated correspon-
dence theory which makes them an attractive mathematical tool to work with
(Chellas, 1980). However, they also have many associated difficulties, notably the
well-known logical omniscience problem, which implies that agents are perfect
reasoners. A number of minor variations on the possible-worlds theme have been
proposed, in an attempt to retain the correspondence theory, but without logical
omniscience.
The most common alternative to the possible-worlds model for belief is to use
a sentential or interpreted symbolic structures approach. In this scheme, beliefs
are viewed as symbolic formulae explicitly represented in a data structure associ-
ated with an agent. An agent then believes p if p is present in the agent's belief
structure. Despite its simplicity, the sentential model works well under certain
circumstances (Konolige, 1986).
The next part of this chapter contains detailed reviews of some of these for-
malisms. First, the idea of possible-worlds semantics is discussed, and then a
detailed analysis of normal modal logics is presented, along with some variants
on the possible-worlds theme.
of the cognitive structure of agents. It certainly does not posit any internalized
collection of possible worlds. It is just a convenient way of characterizing belief.
Second, the mathematical theory associated with the formalization of possible
worlds is extremely appealing (see below).
The next step is to show how possible worlds may be incorporated into the
semantic framework of a logic. This is the subject of the next section.
(M, w) ⊨ true
(M, w) ⊨ p, where p ∈ Prop, if and only if p ∈ π(w)
(M, w) ⊨ ¬φ if and only if (M, w) ⊭ φ
(M, w) ⊨ φ ∨ ψ if and only if (M, w) ⊨ φ or (M, w) ⊨ ψ
(M, w) ⊨ □φ if and only if ∀w' ∈ W, if (w, w') ∈ R then (M, w') ⊨ φ
(M, w) ⊨ ◇φ if and only if ∃w' ∈ W such that (w, w') ∈ R and (M, w') ⊨ φ

Figure 12.1 The semantics of normal modal logic.
of propositional logic can be defined as abbreviations in the usual way. The formula □p is read 'necessarily p', and the formula ◇p is read 'possibly p'. Now to the semantics of the language.
Normal modal logics are concerned with truth at worlds; models for such logics therefore contain a set of worlds, W, and a binary relation, R, on W, saying which worlds are considered possible relative to other worlds. Additionally, a valuation function π is required, saying what propositions are true at each world.
A model for a normal propositional modal logic is a triple (W, R, π), where W is a non-empty set of worlds, R ⊆ W × W, and

π : W → ℘(Prop)

is a valuation function, which says for each world w ∈ W which atomic propositions are true in w. An alternative, equivalent technique would have been to define π as follows:

π : W × Prop → {true, false},

though the rules defining the semantics of the language would then have to be changed slightly.
The semantics of the language are given via the satisfaction relation, '⊨', which holds between pairs of the form (M, w) (where M is a model, and w is a reference world), and formulae of the language. The semantic rules defining this relation are given in Figure 12.1.
The definition of satisfaction for atomic propositions thus captures the idea of truth in the 'current' world (which appears on the left of '⊨'). The semantic rules for 'true', '¬', and '∨' are standard. The rule for '□' captures the idea of truth in all accessible worlds, and the rule for '◇' captures the idea of truth in at least one possible world.
Note that the two modal operators are duals of each other, in the sense that the universal and existential quantifiers of first-order logic are duals:

◇φ ≡ ¬□¬φ.

It would thus have been possible to take either one as primitive, and introduce the other as a derived operator.
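These rules translate almost directly into a naive evaluator for the logic. The sketch below (using my own encoding of formulae as nested tuples, which is not from the text) recurses over a formula, quantifying over the R-accessible worlds for '□' and '◇'.

```python
# Naive evaluation of the satisfaction relation of Figure 12.1.
# A model M is a triple (W, R, pi); formulae are strings (atoms) or nested
# tuples such as ("box", "p"). This encoding is an illustrative assumption.

def holds(M, w, formula):
    """Does (M, w) |= formula hold?"""
    W, R, pi = M
    if formula == "true":
        return True
    if isinstance(formula, str):                 # atomic proposition
        return formula in pi[w]
    op, *args = formula
    if op == "not":
        return not holds(M, w, args[0])
    if op == "or":
        return holds(M, w, args[0]) or holds(M, w, args[1])
    if op == "box":                              # true in all accessible worlds
        return all(holds(M, v, args[0]) for v in W if (w, v) in R)
    if op == "dia":                              # true in some accessible world
        return any(holds(M, v, args[0]) for v in W if (w, v) in R)
    raise ValueError("unknown operator: %s" % op)

# A two-world model in which p holds only at w2, and w1 can see w2.
M = ({"w1", "w2"}, {("w1", "w2")}, {"w1": set(), "w2": {"p"}})
print(holds(M, "w1", ("box", "p")), holds(M, "w1", ("dia", "p")))   # True True
```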
Correspondence theory
To understand the extraordinary properties of this simple logic, it is first necessary to introduce validity and satisfiability. A formula is
- satisfiable if it is satisfied for some model/world pair;
- unsatisfiable if it is not satisfied by any model/world pair;
- true in a model if it is satisfied for every world in the model;
- valid in a class of models if it is true in every model in the class;
- valid if it is true in the class of all models.
If φ is valid, we indicate this by writing ⊨ φ. Notice that validity is essentially the same as the notion of 'tautology' in classical propositional logic - all tautologies are valid.
The two basic properties of this logic are as follows. First, the following axiom schema is valid:

⊨ □(φ ⇒ ψ) ⇒ (□φ ⇒ □ψ).

This axiom is called K, in honour of Kripke. The second property is as follows.

If ⊨ φ, then ⊨ □φ.
Proofs of these properties are trivial, and are left as an exercise for the reader.
Now, since K is valid, it will be a theorem of any complete axiomatization of nor-
mal modal logic. Similarly, the second property will appear as a rule of inference
in any axiomatization of normal modal logic; it is generally called the necessita-
tion rule. These two properties turn out to be the most problematic features of
normal modal logics when they are used as logics of knowledge/belief (this point
will be examined later).
The most intriguing properties of normal modal logics follow from the properties of the accessibility relation, R, in models. To illustrate these properties, consider the following axiom schema:

□φ ⇒ φ.

It turns out that this axiom is characteristic of the class of models with a reflexive accessibility relation. (By characteristic, we mean that it is true in all and only those models in the class.) There are a host of axioms which correspond to certain properties of R: the study of the way that properties of R correspond to axioms is called correspondence theory. In Table 12.1, I list some axioms along with their characteristic property on R, and a first-order formula describing the property.
A system of logic can be thought of as a set of formulae valid in some class of models; a member of the set is called a theorem of the logic (if φ is a theorem, this is usually denoted by ⊢ φ). The notation KΣ₁...Σₙ is often used to denote the smallest normal modal logic containing axioms Σ₁, ..., Σₙ (recall that any normal modal logic will contain K; cf. Goldblatt (1987, p. 25)).
Table 12.1 Some axioms, their corresponding conditions on R, and first-order characterizations.

Name  Axiom        Condition on R   First-order characterization
T     □φ ⇒ φ       Reflexive        ∀w ∈ W . (w, w) ∈ R
D     □φ ⇒ ◇φ      Serial           ∀w ∈ W . ∃w' ∈ W . (w, w') ∈ R
4     □φ ⇒ □□φ     Transitive       ∀w, w', w'' ∈ W . ((w, w') ∈ R ∧ (w', w'') ∈ R) ⇒ (w, w'') ∈ R
5     ◇φ ⇒ □◇φ     Euclidean        ∀w, w', w'' ∈ W . ((w, w') ∈ R ∧ (w, w'') ∈ R) ⇒ (w', w'') ∈ R
For the axioms T, D, 4, and 5, it would seem that there ought to be 16 distinct systems of logic (since 2⁴ = 16). However, some of these systems turn out to be equivalent (in that they contain the same theorems), and as a result there are only 11 distinct systems. The relationships between these systems are described in Figure 12.2 (after Konolige (1986, p. 99) and Chellas (1980, p. 132)). In this diagram, an arc from A to B means that B is a strict superset of A: every theorem of A is a theorem of B, but not vice versa; A = B means that A and B contain precisely the same theorems.
Because some modal systems are so widely used, they have been given names:
KT is known as T,
KT4 is known as S4,
KD45 is known as weak-S5,
KT5 is known as S5.
Kᵢφ is read 'i knows that φ'. The semantic rule for '□' is replaced by the following rule:

(M, w) ⊨ Kᵢφ if and only if ∀w' ∈ W, if (w, w') ∈ Rᵢ then (M, w') ⊨ φ.

Each operator Kᵢ thus has exactly the same properties as '□'. Corresponding to each of the modal systems discussed above, a corresponding system is defined for the multiagent logic: thus Kₙ is the smallest multiagent epistemic logic and S5ₙ is the largest.
The next step is to consider how well normal modal logic serves as a logic of
knowledge/belief. Consider first the necessitation rule and axiom K, since any
normal modal system is committed to these.
The necessitation rule tells us that an agent knows all valid formulae. Amongst
other things, this means an agent knows all propositional tautologies. Since there
are an infinite number of these, an agent will have an infinite number of items
of knowledge: immediately, one is faced with a counterintuitive property of the
knowledge operator.
Now consider the axiom K, which says that an agent's knowledge is closed under implication. Suppose φ is a logical consequence of the set Φ = {φ₁, ..., φₙ}; then in every world where all of Φ are true, φ must also be true, and hence

(φ₁ ∧ ··· ∧ φₙ) ⇒ φ

must be valid. By necessitation, this formula will also be believed. Since an agent's beliefs are closed under implication, whenever it believes each of Φ, it must also believe φ. Hence an agent's knowledge is closed under logical consequence. This also seems counterintuitive. For example, suppose, like every good logician, our agent knows Peano's axioms. It may well be that Fermat's last theorem follows from Peano's axioms - although it took the labour of centuries to prove it. Yet if our agent's beliefs are closed under logical consequence, then our agent must know it. So consequential closure, implied by necessitation and the K axiom, seems an overstrong property for resource-bounded reasoners.
Logical omniscience
These two problems - that of knowing all valid formulae, and that of knowl-
edgebelief being closed under logical consequence - together constitute the
famous logical omniscience problem. This problem has some damaging corollar-
ies.
The first concerns consistency. Human believers are rarely consistent in the logical sense of the word; they will often have beliefs φ and ψ, where φ ⊢ ¬ψ, without being aware of the implicit inconsistency. However, the ideal reasoners implied by possible-worlds semantics cannot have such inconsistent beliefs without believing every formula of the logical language (because the consequential closure of an inconsistent set of formulae is the set of all formulae). Konolige has argued that logical consistency is much too strong a property for resource-bounded reasoners: he argues that a lesser property, that of being non-contradictory, is the most one can reasonably demand (Konolige, 1986). Non-contradiction means that an agent would not simultaneously believe φ and ¬φ, although the agent might have logically inconsistent beliefs.
The second corollary is more subtle. Consider the following propositions (this
example is from Konolige (1986, p. 88)).
(1) Hamlet's favourite colour is black.
(2) Hamlet's favourite colour is black and every planar map can be four
coloured.
The second conjunct of (2) is valid, and will thus be believed. This means that (1) and (2) are logically equivalent; (2) is true just when (1) is. Since agents are ideal reasoners, they will believe that the two propositions are logically equivalent. This is yet another counterintuitive property implied by possible-worlds semantics, as 'equivalent propositions are not equivalent as beliefs' (Konolige, 1986, p. 88).
Discussion
To sum up, the basic possible-worlds approach described above has the following
disadvantages as a multiagent epistemic logic:
agents believe all valid formulae;
agents' beliefs are closed under logical consequence;
equivalent propositions are identical beliefs; and
if agents are inconsistent, then they believe everything.
To which many people would add the following:
Despite these serious disadvantages, possible worlds are still the semantics of
choice for many researchers, and a number of variations on the basic possible-
worlds theme have been proposed to get around some of the difficulties - see
Wooldridge and Jennings (1995) for a survey.
of a set L of 'local' states. At any time, a system may therefore be in any of a set G of global states:

G = E × L × ··· × L   (n times).
Next, a run of a system is a function which assigns to each time point a global
state: time points are isomorphic to the natural numbers (and time is thus discrete,
bounded in the past, and infinite in the future). Note that this is essentially the
same notion of runs that was introduced in Chapter 2, but I have formulated it
slightly differently. A run r is thus a function

r : ℕ → G,

and a point is a pair consisting of a run together with a time:

Point = Run × ℕ.
A point implicitly identifies a global state. Points will serve as worlds in the logic
of knowledge to be developed. A system is a set of runs.
Now, suppose s = (e, l₁, ..., lₙ) and s' = (e', l′₁, ..., l′ₙ) are two global states. We now define a relation ∼ᵢ on global states, one for each process i:

s ∼ᵢ s' if and only if lᵢ = l′ᵢ.

Note that ∼ᵢ will be an equivalence relation. The terminology is that if s ∼ᵢ s', then s and s' are indistinguishable to i, since the local state of i is the same in each
global state. Intuitively, the local state of a process represents the information
that the process has, and if two global states are indistinguishable, then it has the
same information in each.
The crucial point here is that since a processes [sic] [choice of]
actions.. .are a function of its local state, if two points are indis-
tinguishable to processor i, then processor i will perform the same
actions in each state.
(Halpern, 1987, pp. 46, 47)
where R is a system (cf. the set of runs discussed in Chapter 2), and

π : Point → ℘(Prop)

returns the set of atomic propositions true at a point. The structure (R, π) is called an interpreted system. The only non-standard semantic rules are for propositions and modal formulae:

(M, r, u) ⊨ p, where p ∈ Prop, if and only if p ∈ π((r, u));
(M, r, u) ⊨ Kᵢφ if and only if (M, r', u') ⊨ φ for all r' ∈ R and u' ∈ ℕ such that r(u) ∼ᵢ r'(u').
Note that since ∼ᵢ is an equivalence relation (i.e. it is reflexive, symmetric, and transitive), this logic will have the properties of the system S5ₙ, discussed above. In what sense does the second rule capture the idea of a process's knowledge? The idea is that if r(u) ∼ᵢ r'(u'), then for all i knows, it could be in either run r at time u, or run r' at time u'; the process does not have enough information to be able to distinguish the two states. The information/knowledge it does have is whatever is true in all its indistinguishable states.
In this model, knowledge is an external notion. We do not imagine a processor scratching its head wondering whether or not it knows a fact φ. Rather, a programmer reasoning about a particular protocol would say, from the outside, that the processor knew φ because in all global states [indistinguishable] from its current state (intuitively, all the states the processor could be in, for all it knows), φ is true.
(Halpern, 1986, p. 6)
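The sketch below makes this external reading of knowledge concrete: a proposition is known by process i at a point just in case it is true at every point whose global state carries the same local state for i. The data layout (a dictionary of runs and a valuation on points) is an assumption made for illustration.

```python
# Knowledge in an interpreted system: i knows phi at (r, u) iff phi holds at
# every point that i cannot distinguish from (r, u). Representation is my own.

def knows(system, pi, i, point, phi):
    """system: run name -> list of global states, each a tuple (env, l_1, ..., l_n);
    pi: point -> set of atomic propositions true there; i indexes local states."""
    r, u = point
    local = system[r][u][i]                          # i's local state at this point
    indistinguishable = [(r2, u2)
                         for r2, states in system.items()
                         for u2, s in enumerate(states)
                         if s[i] == local]
    return all(phi in pi[p] for p in indistinguishable)

# Two runs that process 1 cannot tell apart at time 0 (its local state is 'a' in both).
system = {"r1": [("e0", "a")], "r2": [("e1", "a")]}
pi = {("r1", 0): {"p"}, ("r2", 0): {"p", "q"}}
print(knows(system, pi, 1, ("r1", 0), "p"))   # True:  p holds at both points
print(knows(system, pi, 1, ("r1", 0), "q"))   # False: q fails at ("r1", 0)
```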
Intuitively, the two generals are trying to bring about a state where it is common
knowledge between them that the message to attack was delivered. Each succes-
sive round of communication, even if successful, only adds one level to the depth
of nested belief. No amount of communication is sufficient to bring about the infi-
nite nesting that common knowledge requires. As it turns out, if communication
delivery is not guaranteed, then common knowledge can never arise in such a
scenario. Ultimately, this is because, no matter how many messages and acknowl-
edgments are sent, at least one of the generals will always be uncertain about
whether or not the last message was received.
One might ask about whether infinite nesting of common knowledge is required.
Could the two generals agree between themselves beforehand to attack after, say,
only two acknowledgments? Assuming that they could meet beforehand to come
to such an agreement, then this would be feasible. But the point is that whoever
sent the last acknowledgment would be uncertain as to whether this was received,
and would hence be attacking while unsure as to whether it was a coordinated
attack or a doomed solo effort.
A related issue to common knowledge is that of distributed, or implicit, knowl-
edge. Suppose there is an omniscient observer of some group of agents, with the
ability to 'read' each agent's beliefs/knowledge. Then this agent would be able
to pool the collective knowledge of the group of agents, and would generally be
able to deduce more than any one agent in the group. For example, suppose, in a
group of two agents, agent 1 only knew φ, and agent 2 only knew φ ⇒ ψ. Then there would be distributed knowledge of ψ, even though no agent explicitly knew ψ. Distributed knowledge cannot be reduced to any of the operators introduced so far: it must be given its own definition. The distributed knowledge operator D has the following semantic rule:

(M, w) ⊨ Dφ if and only if (M, w') ⊨ φ for all w' such that (w, w') ∈ R₁ ∩ ··· ∩ Rₙ.
This rule might seem strange at first, since it uses set intersection rather than
set union, which is at odds with a naive perception of how distributed knowledge
works. However, a restriction on possible worlds generally means an increase in
knowledge.
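A small sketch, under an assumed encoding of accessibility relations as sets of world pairs, makes the role of intersection explicit: pooling the agents' relations shrinks the set of worlds that remain accessible, and so enlarges what is known.

```python
# Distributed knowledge via intersection of the agents' accessibility relations.
# The encoding (sets of (w, w') pairs, a valuation pi) is an illustrative assumption.

def distributed_knowledge(worlds, relations, pi, w, phi):
    """relations: agent -> set of (w, w') pairs; pi: world -> set of true propositions."""
    common = set.intersection(*relations.values())   # pool the agents' information
    return all(phi in pi[v] for v in worlds if (w, v) in common)

worlds = {"w1", "w2", "w3"}
relations = {1: {("w1", "w1"), ("w1", "w2")},        # agent 1 cannot rule out w2
             2: {("w1", "w1"), ("w1", "w3")}}        # agent 2 cannot rule out w3
pi = {"w1": {"p"}, "w2": set(), "w3": set()}
# Neither agent knows p on its own, but together they rule out w2 and w3,
# so p is distributed knowledge at w1.
print(distributed_knowledge(worlds, relations, pi, "w1", "p"))   # True
```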
Distributed knowledge is potentially a useful concept in cooperative problem-
solving systems, where knowledge about a problem is distributed among a group
of problem-solving agents, which must try to deduce a solution through cooper-
ative interaction.
The various group knowledge operators form a hierarchy:

Cφ ⇒ Eᵏφ ⇒ ··· ⇒ Eφ ⇒ Kᵢφ ⇒ Dφ.

See Fagin et al. (1995) for further discussion of these operators and their properties.
of the logic in developing a theory of intention. The first step is to lay out the
criteria that a theory of intention must satisfy.
When building intelligent agents - particularly agents that must interact with
humans - it is important that a rational balance is achieved between the beliefs,
goals, and intentions of the agents.
For example, the following are desirable properties of intention: an
autonomous agent should act on its intentions, not in spite of them;
adopt intentions it believes are feasible and forego those believed to be
infeasible; keep (or commit to) intentions, but not forever; discharge
those intentions believed to have been satisfied; alter intentions when
relevant beliefs change; and adopt subsidiary intentions during plan
formation.
(Cohen and Levesque, 1990a, p. 214)
[Table: the operators of the formalism and their meanings.]
Worlds in the formalism are a discrete sequence of events, stretching infinitely into past and future. The two basic temporal operators, Happens and Done, are augmented by some operators for describing the structure of event sequences, in the style of dynamic logic (Harel, 1979). The two most important of these constructors are ';' (sequencing) and '?' (test).
The first major derived construct is a persistent goal:

(P-Goal i p) ≜ (Goal i (Later p)) ∧
               (Bel i ¬p) ∧
               (Before ((Bel i p) ∨ (Bel i □¬p))
                       ¬(Goal i (Later p))).
So, an agent has a persistent goal of p if
(1) it has a goal that p eventually becomes true, and believes that p is not cur-
rently true; and
(2) before it drops the goal, one of the following conditions must hold:
(a) the agent believes the goal has been satisfied;
(b) the agent believes the goal will never be satisfied.
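Read operationally, this commitment policy is easy to caricature in code: the goal stays active until the agent comes to believe it is achieved or believes it is impossible. The sketch below is my own illustration, not part of the formal theory.

```python
# Minimal sketch of the commitment policy behind a persistent goal (P-Goal):
# keep the goal until it is believed achieved or believed impossible.

class PersistentGoal:
    def __init__(self, goal):
        self.goal = goal
        self.active = True

    def reconsider(self, believes_achieved, believes_impossible):
        """Drop the goal only under the two conditions of the definition above."""
        if self.active and (believes_achieved or believes_impossible):
            self.active = False
        return self.active

g = PersistentGoal("valve32_open")
print(g.reconsider(believes_achieved=False, believes_impossible=False))  # True: still committed
print(g.reconsider(believes_achieved=True,  believes_impossible=False))  # False: goal discharged
```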
(2) Illocutionary acts are complex event types, and not primitives.
Given this latter point, one must find some way of describing the actions that are
performed. Cohen and Levesque's solution is to use their logic of rational action,
which provides a number of primitive event types, which can be put together into
more complex event types, using dynamic-logic-style constructions. Illocutionary
acts are then defined as complex event types.
Their approach is perhaps best illustrated by giving their definition of a request. Some preliminary definitions are required. First, alternating belief:

(A-Bel n x y p) ≜ (Bel x (Bel y (Bel x ··· (Bel x p) ···)))

with n alternating levels of belief.
{Attempt x e p q} ≜ [(Bel x ¬p) ∧
                     (Goal x (Happens x e; p?)) ∧
                     (Int x e; q?)]? ; e
In English:
An attempt is a complex action that agents perform when they do
something (e) desiring to bring about some effect ( p ) but with intent
to produce at least some result (q).
(Cohen and Levesque, 1990b, p. 240)
The idea is that p represents the ultimate goal that the agent is aiming for by
doing e; the proposition q represents what it takes to at least make an 'honest
effort' to achieve p. A definition of helpfulness is now presented:
(Helpful x y) ≜ ∀e .
    [(Bel x (Goal y ◇(Done x e))) ∧
     ¬(Goal x ¬◇(Done x e))]
    ⇒ (Goal x ◇(Done x e)).
In English:
[C]onsider an agent [x] to be helpful to another agent [y] if, for any action [e] he adopts the other agent's goal that he eventually do that action, whenever such a goal would not conflict with his own.
(Cohen and Levesque, 1990b, p. 230)
The definition of requests can now be given (note again the use of curly brackets:
requests are complex event types, not predicates or operators):
{Request spkr addr e α} ≜
    {Attempt spkr e φ (M-Bel addr spkr (Goal spkr φ))}

where φ is

    ◇(Done addr α) ∧
    (Int addr α
         [(Goal spkr ◇(Done addr α)) ∧
          (Helpful addr spkr)]).
In English:
A request is an attempt on the part of spkr, by doing e, to bring about a state where, ideally, (i) addr intends α (relative to spkr still having that goal, and addr still being helpfully inclined to spkr), and (ii) addr actually eventually does α, or at least brings about a state where addr believes it is mutually believed that it wants the ideal situation.
By this definition, there is no primitive request act:
[A] speaker is viewed as having performed a request if he executes any
sequence of actions that produces the needed effects.
(Cohen and Levesque, 1990b, p. 246)
Comparatively few serious attempts have been made to specify real agent sys-
tems using such logics - see, for example, Fisher and Wooldridge (1997) for one
such attempt.
A specification expressed in such a logic would be a formula φ. The idea is that
such a specification would express the desirable behaviour of a system. To see how
this might work, consider the following, intended to form part of a specification
of a process control system:
if
i believes valve 32 is open
then
i should intend that j should believe valve 32 is open.
Expressed in a Cohen-Levesque type logic, this statement becomes the formula:

(Bel i Open(valve32)) ⇒ (Int i (Bel j Open(valve32))).
Refinement
At the time of writing, most software developers use structured but informal
techniques to transform specifications into concrete implementations. Probably
the most common techniques in widespread use are based on the idea of top-
down refinement. In this approach, an abstract system specification is refined
into a number of smaller, less abstract subsystem specifications, which together
satisfy the original specification. If these subsystems are still too abstract to be
implemented directly, then they are also refined. The process recurses until the
derived subsystems are simple enough to be directly implemented. Throughout,
we are obliged to demonstrate that each step represents a true refinement of the
more abstract specification that preceded it. This demonstration may take the
form of a formal proof, if our specification is presented in, say, Z (Spivey, 1992)
or VDM (Jones, 1990). More usually, justification is by informal argument. Object-
oriented analysis and design techniques, which also tend to be structured but
informal, are also increasingly playing a role in the development of systems (see,
for example, Booch, 1994).
For functional systems, which simply compute a function of some input and
then terminate, the refinement process is well understood, and comparatively
straightforward. Such systems can be specified in terms of preconditions and
postconditions (e.g. using Hoare logic (Hoare, 1969)). Refinement calculi exist,
which enable the system developer to take a precondition and postcondition spec-
ification, and from it systematically derive an implementation through the use of
proof rules (Morgan, 1994). Part of the reason for this comparative simplicity is
that there is often an easily understandable relationship between the precondi-
tions and postconditions that characterize an operation and the program struc-
tures required to implement it.
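For instance, a precondition/postcondition specification of an integer square-root operation might be written in the Hoare style as follows (my own illustration, not an example from the text):

```latex
% Hoare triple: precondition, program fragment, postcondition.
\{\, x \ge 0 \,\}
\quad \mathit{root} := \mathrm{isqrt}(x) \quad
\{\, \mathit{root}^2 \le x < (\mathit{root} + 1)^2 \,\}
```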
For agent systems, which fall into the category of Pnuelian reactive systems (see the discussion in Chapter 2), refinement is not so straightforward. This is because such systems must be specified in terms of their ongoing behaviour - they cannot be specified simply in terms of preconditions and postconditions. In contrast to precondition and postcondition formalisms, it is not so easy to determine what program structures are required to realize such specifications. As a consequence, researchers have only just begun to investigate refinement and design techniques for agent-based systems.
Execution of logical languages and theorem-proving are thus closely related. This tells us that the execution of sufficiently rich (quantified) languages is not possible (since any language equal in expressive power to first-order logic is undecidable).
A useful way to think about execution is as if the agent is playing a game against the environment. The specification represents the goal of the game: the agent must keep the goal satisfied, while the environment tries to prevent the agent from doing so. The game is played by agent and environment taking turns to build a little more of the model. If the specification ever becomes false in the (partial) model, then the agent loses. In real reactive systems, the game is never over: the agent must continue to play forever. Of course, some specifications (logically inconsistent ones) cannot ever be satisfied. A winning strategy for building models from (satisfiable) agent specifications in the presence of arbitrary input from the environment is an execution algorithm for the logic.
closest to our view is the work of Pnueli and Rosner (1989) on the automatic synthesis of reactive systems from branching time temporal logic specifications. The goal of their work is to generate reactive systems, which share many of the properties of our agents (the main difference being that reactive systems are not generally required to be capable of rational decision making in the way we described above). To do this, they specify a reactive system in terms of a first-order branching time temporal logic formula ∀x ∃y Aφ(x, y): the predicate φ characterizes the relationship between inputs to the system (x) and outputs (y). Inputs may be thought of as sequences of environment states, and outputs as corresponding sequences of actions. The A is the universal path quantifier. The specification is intended to express the fact that in all possible futures, the desired relationship φ holds between the inputs to the system, x, and its outputs, y. The synthesis process itself is rather complex: it involves generating a Rabin tree automaton, and then checking this automaton for emptiness. Pnueli and Rosner show that the time complexity of the synthesis process is double exponential in the size of the specification, i.e. O(2^(2^(cn))), where c is a constant and n = |φ| is the size of the specification φ. The size of the synthesized program (the number of states it contains) is of the same complexity.
The Pnueli-Rosner technique is rather similar to (and in fact depends upon) techniques developed by Wolper, Vardi, and colleagues for synthesizing Büchi automata from linear temporal logic specifications (Vardi and Wolper, 1994). Büchi automata are those that can recognize ω-regular expressions: regular expressions that may contain infinite repetition. A standard result in temporal logic theory is that a formula φ of linear time temporal logic is satisfiable if and only if there exists a Büchi automaton that accepts just the sequences that satisfy φ. Intuitively, this is because the sequences over which linear time temporal logic is interpreted can be viewed as ω-regular expressions. This result yields a decision procedure for linear time temporal logic: to determine whether a formula φ is satisfiable, construct an automaton that accepts just the (infinite) sequences that correspond to models of φ; if the set of such sequences is empty, then φ is unsatisfiable.
Similar automatic synthesis techniques have also been deployed to develop concurrent system skeletons from temporal logic specifications. Manna and Wolper present an algorithm that takes as input a linear time temporal logic specification of the synchronization part of a concurrent system, and generates as output a program skeleton (based upon Hoare's CSP formalism (Hoare, 1978)) that realizes the specification (Manna and Wolper, 1984). The idea is that the functionality of a concurrent system can generally be divided into two parts: a functional part, which actually performs the required computation in the program, and a synchronization part, which ensures that the system components cooperate in the correct way. For example, the synchronization part will be responsible for any mutual exclusion that is required. The synthesis algorithm (like the synthesis algorithm for Büchi automata, above) is based on Wolper's tableau proof method for tem-
poral logic (Wolper, 1985). Very similar work is reported by Clarke and Emerson
(1981): they synthesize synchronization skeletons from branching time temporal
logic (CTL) specifications.
Perhaps the best-known example of this approach to agent development is the
situated automata paradigm of Rosenschein and Kaelbling (1996), discussed in
Chapter 5.
12.8.3 Verification
Once we have developed a concrete system, we need to show that this system is correct with respect to our original specification. This process is known as verification, and it is particularly important if we have introduced any informality into the development process. For example, any manual refinement, done without a formal proof of refinement correctness, creates the possibility of a faulty transformation from specification to implementation. Verification is the process of convincing ourselves that the transformation was sound. We can divide approaches to the verification of systems into two broad classes: (1) axiomatic, and (2) semantic (model checking). In the subsections that follow, we shall look at the way in which these two approaches have evidenced themselves in agent-based systems.
Deductive verification
Axiomatic approaches to program verification were the first to enter the main-
stream of computer science, with the work of Hoare in the late 1960s (Hoare,
1969). Axiomatic verification requires that we can take our concrete program,
and from this program systematically derive a logical theory that represents the
behaviour of the program. Call this the program theory. If the program theory
is expressed in the same logical language as the original specification, then veri-
fication reduces to a proof problem: show that the specification is a theorem of
(equivalently, is a logical consequence of) the program theory.
The development of a program theory is made feasible by axiomatizing the
programming language in which the system is implemented. For example, Hoare
logic gives us more or less an axiom for every statement type in a simple Pascal-
like language. Once given the axiomatization, the program theory can be derived
from the program text in a systematic way.
Perhaps the most relevant work from mainstream computer science is the
specification and verification of reactive systems using temporal logic, in the
way pioneered by Pnueli, Manna, and colleagues (see, for example, Manna and
Pnueli, 1995). The idea is that the computations of reactive systems are infinite
sequences, which correspond to models for linear temporal logic. Temporal logic
can be used both to develop a system specification, and to axiomatize a program-
ming language. This axiomatization can then be used to systematically derive
the theory of a program from the program text. Both the specification and the
program theory will then be encoded in temporal logic, and verification hence
becomes a proof problem in temporal logic.
Comparatively little work has been carried out within the agent-based systems
community on axiomatizing multiagent environments. I shall review just one
approach.
In Wooldridge (1992), an axiomatic approach to the verification of multiagent
systems was proposed. Essentially, the idea was to use a temporal belief logic to
axiomatize the properties of two multiagent programming languages. Given such
an axiomatization, a program theory representing the properties of the system
could be systematically derived in the way indicated above.
A temporal belief logic was used for two reasons. First, a temporal compo-
nent was required because, as we observed above, we need to capture the ongo-
ing behaviour of a multiagent system. A belief component was used because the
agents we wish to verify are each symbolic AI systems in their own right. That is,
each agent is a symbolic reasoning system, which includes a representation of its
environment and desired behaviour. A belief component in the logic allows us to
capture the symbolic representations present within each agent.
The two multiagent programming languages that were axiomatized in the tem-
poral belief logic were Shoham's AGENT0 (Shoham, 1993), and Fisher's Concurrent
MetateM (see above). The basic approach was as follows.
(1) First, a simple abstract model was developed of symbolic AI agents. This
model captures the fact that agents are symbolic reasoning systems, capable
of communication. The model gives an account of how agents might change
state, and what a computation of such a system might look like.
(2) The histories traced out in the execution of such a system were used as the
semantic basis for a temporal belief logic. This logic allows us to express
properties of agents modelled at stage (1).
(3) The temporal belief logic was used to axiomatize the properties of a multi-
agent programming language. This axiomatization was then used to develop
the program theory of a multiagent system.
(4) The proof theory of the temporal belief logic was used to verify properties
of the system (cf. Fagin et al., 1995).
Note that this approach relies on the operation of agents being sufficiently sim-
ple that their properties can be axiomatized in the logic. It works for Shoham's
AGENT0 and Fisher's Concurrent MetateM largely because these languages have
a simple semantics, closely related to rule-based systems, which in turn have a
simple logical semantics. For more complex agents, an axiomatization is not so
straightforward. Also, capturing the semantics of concurrent execution of agents
is not easy (it is, of course, an area of ongoing research in computer science gen-
erally).
Model checking
Ultimately, axiomatic verification reduces to a proof problem. Axiomatic ap-
proaches to verification are thus inherently limited by the difficulty of this proof
problem. Proofs are hard enough, even in classical logic; the addition of temporal
and modal connectives to a logic makes the problem considerably harder. For this
reason, more efficient approaches to verification have been sought. One particu-
larly successful approach is that of model checking (Clarke et al., 2000). As the
name suggests, whereas axiomatic approaches generally rely on syntactic proof,
model-checking approaches are based on the semantics of the specification lan-
guage.
The model-checking problem, in abstract, is quite simple: given a formula φ of language L, and a model M for L, determine whether or not φ is valid in M, i.e. whether or not M ⊨ φ. Verification by model checking has been studied in connection with temporal logic (Clarke et al., 2000). The technique once again relies upon the close relationship between models for temporal logic and finite-state machines. Suppose that φ is the specification for some system, and π is a program that claims to implement φ. Then, to determine whether or not π truly implements φ, we proceed as follows:

take π, and from it generate a model M_π that corresponds to π, in the sense that M_π encodes all the possible computations of π;

determine whether or not M_π ⊨ φ, i.e. whether the specification formula φ is valid in M_π; the program π satisfies the specification φ just in case the answer is 'yes'.
The main advantage of model checking over axiomatic verification is in complexity: model checking using the branching time temporal logic CTL (Clarke and Emerson, 1981) can be done in time O(|φ| × |M|), where |φ| is the size of the formula to be checked, and |M| is the size of the model against which φ is to be checked, that is, the number of states it contains.
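To make the two-stage process concrete, the following is a minimal Python sketch of explicit-state model checking for a tiny temporal fragment (reachability, i.e. the CTL formula EF p, computed as a least fixpoint). The model, labelling and formula are hypothetical toy examples of my own; real model checkers such as those described in Clarke et al. (2000) are of course vastly more sophisticated.

    # A toy explicit-state checker for the reachability property EF p.
    # The states, transition relation and labelling are hypothetical examples.

    def sat_atom(states, label, p):
        # States in which the atomic proposition p holds.
        return {s for s in states if p in label[s]}

    def sat_EX(states, trans, phi_states):
        # States with at least one successor satisfying phi.
        return {s for s in states if trans[s] & phi_states}

    def sat_EF(states, trans, phi_states):
        # Least fixpoint: EF phi = phi or EX EF phi.
        result = set(phi_states)
        while True:
            new = result | sat_EX(states, trans, result)
            if new == result:
                return result
            result = new

    if __name__ == "__main__":
        states = {"s0", "s1", "s2"}
        trans = {"s0": {"s1"}, "s1": {"s2"}, "s2": {"s2"}}
        label = {"s0": set(), "s1": set(), "s2": {"goal"}}
        # Does the initial state s0 satisfy EF goal, i.e. is 'goal' reachable?
        print("s0" in sat_EF(states, trans, sat_atom(states, label, "goal")))   # prints True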
In Rao and Georgeff (1993), the authors present an algorithm for model-
checking BDI systems. More precisely, they give an algorithm for taking a logical
model for their (propositional) BDI logic, and a formula of the language, and deter-
mining whether the formula is valid in the model. The technique is closely based
on model-checking algorithms for normal modal logics (Clarke et al., 2000). They
show that despite the inclusion of three extra modalities (for beliefs, desires, and
intentions) into the CTL branching time framework, the algorithm is still quite
efficient, running in polynomial time. So the second step of the two-stage model-
checking process described above can still be done efficiently. Similar algorithms
have been reported for BDI-like logics in Benerecetti et al. (1999).
The main problem with model-checking approaches for BDI is that it is not clear
how the first step might be realized for BDI logics. Where does the logical model
characterizing an agent actually come from? Can it be derived from an arbitrary
program π, as in mainstream computer science? To do this, we would need to
take a program implemented in, say, Pascal, and from it derive the belief-, desire-,
and intention-accessibility relations that are used to give a semantics to the BDI
component of the logic. Because, as we noted earlier, there is no clear relationship
between the BDI logic and the concrete computational models used to implement
agents, it is not clear how such a model could be derived.
a problem even in the propositional case. For 'vanilla' propositional logic, the
decision problem for satisfiability is NP-complete (Fagin et al., 1995, p. 72); richer
logics, of course, have more complex decision problems.
Despite these problems, the undoubted attractions of direct execution have led
to a number of attempts to devise executable logic-based agent languages. Rao
proposed an executable subset of BDI logic in his AgentSpeak(L) language (Rao, 1996a). Building on this work, Hindriks and colleagues developed the 3APL agent programming language (Hindriks et al., 1998; Hindriks et al., 1999). Lesperance, Reiter, Levesque, and colleagues developed the Golog language throughout the latter half of the 1990s as an executable subset of the situation calculus (Lesperance et al., 1996; Levesque et al., 1996). Fagin and colleagues have proposed knowledge-
based programs as a paradigm for executing logical formulae which contain epis-
temic modalities (Fagin et al., 1995, 1997). Although considerable work has been
carried out on the properties of knowledge-based programs, comparatively lit-
tle research to date has addressed the problem of how such programs might be
actually executed.
Turning to automatic synthesis, the techniques described above have been
developed primarily for propositional specification languages. If we attempt to
extend these techniques to more expressive, first-order specification languages,
then we again find ourselves coming up against the undecidability of quanti-
fied logic. Even in the propositional case, the theoretical complexity of theorem-
proving for modal and temporal logics is likely to limit the effectiveness of compi-
lation techniques: given an agent specification of size 1000, a synthesis algorithm
that runs in exponential time when used offline is no more useful than an execu-
tion algorithm that runs in exponential time on-line. Kupferman and Vardi (1997)
is a recent article on automatic synthesis from temporal logic specifications.
Another problem with respect to synthesis techniques is that they typically
result in finite-state, automata-like machines, which are less powerful than Turing machines. In particular, the systems generated by the processes outlined above cannot modify their behaviour at run-time. In short, they cannot learn. While for many applications this is acceptable - even desirable - for equally many others, it is not. In expert assistant agents, of the type described in Maes (1994a), learning is pretty much the raison d'être. Attempts to address this issue are described in Kaelbling (1993).
Turning to verification, axiomatic approaches suffer from two main problems.
First, the temporal verification of reactive systems relies upon a simple model of
concurrency, where the actions that programs perform are assumed to be atomic.
We cannot make this assumption when we move from programs to agents. The
actions we think of agents as performing will generally be much more coarse-
grained. As a result, we need a more realistic model of concurrency. One possibil-
ity, investigated in Wooldridge (1995), is to model agent execution cycles as inter-
vals over the real numbers, in the style of the temporal logic of reals (Barringer
et al., 1986). The second problem is the difficulty of the proof problem for agent
(hence BDI architectures (Rao and Georgeff, 1991b)). However, there are alterna-
tives: Shoham, for example, suggests that the notion of choice is more funda-
mental (Shoham, 1990). Comparatively little work has yet been done on formally
comparing the suitability of these various combinations. One might draw a par-
allel with the use of temporal logics in mainstream computer science, where the
expressiveness of specification languages is by now a well-understood research
area (Emerson and Halpern, 1986). Perhaps the obvious requirement for the short
term is experimentation with real agent specifications, in order to gain a better
understanding of the relative merits of different formalisms.
More generally, the kinds of logics used in agent theory tend to be rather elabo-
rate, typically containing many modalities which interact with each other in subtle
ways. Very little work has yet been carried out on the theory underlying such logics
(perhaps the only notable exception is Catach (1988)). Until the general principles
and limitations of such multi-modal logics become understood, we might expect
that progress with using such logics will be slow. One area in which work is likely
to be done in the near future is theorem-proving techniques for multi-modal log-
ics.
Finally, there is often some confusion about the role played by a theory of
agency. The view we take is that such theories represent specifications for agents.
The advantage of treating agent theories as specifications, and agent logics as
specification languages, is that the problems and issues we then face are familiar
from the discipline of software engineering. How useful or expressive is the spec-
ification language? How concise are agent specifications? How does one refine or
otherwise transform a specification into an implementation? However, the view
of agent theories as specifications is not shared by all researchers. Some intend
their agent theories to be used as knowledge representation formalisms, which
raises the difficult problem of algorithms to reason with such theories. Still oth-
ers intend their work to formalize a concept of interest in cognitive science or
philosophy (this is, of course, what Hintikka intended in his early work on logics
of knowledge and belief). What is clear is that it is important to be precise about
the role one expects an agent theory to play.
Class reading: Rao and Georgeff (1992). This paper is not too formal, but is
focused on the issue of when a particular agent implementation can be said to
implement a particular theory of agency.
Exercises
(1) [Level 1]
Consider the attitudes of believing, desiring, intending, hoping, and fearing. For each of the following:
(a) Discuss the appropriateness of the axioms K, T, D, 4, and 5 (recalled below) for these attitudes.
(b) Discuss the interrelationships between these attitudes. For example, if B_iφ means 'i believes φ' and I_iφ means 'i intends φ', then should I_iφ ⊃ B_iφ hold? What about I_iφ ⊃ B_i¬φ, or I_iφ ⊃ ¬B_i¬φ, and so on? Systematically draw up a table of these possible relationships, and informally argue for/against them - discuss the circumstances under which they might be acceptable.
(c) Add temporal modalities into the framework (as in Cohen and Levesque's formalism), and carry out the same exercise.
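For convenience when attempting part (a), the standard schemas are recalled here, written with a generic modality \(\Box\) (read as 'i believes that', 'i desires that', and so on), with \(\Diamond\) as its dual \(\neg\Box\neg\):

\[
\begin{array}{ll}
\mathbf{K}: & \Box(\varphi \rightarrow \psi) \rightarrow (\Box\varphi \rightarrow \Box\psi)\\
\mathbf{T}: & \Box\varphi \rightarrow \varphi\\
\mathbf{D}: & \Box\varphi \rightarrow \Diamond\varphi\\
\mathbf{4}: & \Box\varphi \rightarrow \Box\Box\varphi\\
\mathbf{5}: & \Diamond\varphi \rightarrow \Box\Diamond\varphi
\end{array}
\]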
decide whether or not the entity at the other end is either another person or
a machine. If such a test cannot distinguish a particular program from a person,
then, Turing argued, the program must be considered intelligent to all intents and
purposes. Clearly, we can think of the program at the other end of the teletype as
an agent - the program is required to respond, in real time, to statements made
by the person, and the rules of the test prohibit interference with the program. It
exhibits some autonomy, in other words.
Although the idea of an agent was clearly present in the early days of AI, there
was comparatively little development in the idea of agents as holistic entities
(i.e. integrated systems capable of independent autonomous action) until the mid-
1980s; below, we will see why this happened.
Between about 1969 and 1985, research into systems capable of independent
action was carried out primarily within the AI planning community, and was dom-
inated by what I will call the 'reasoning and planning' paradigm. AI planning (see
Chapter 3) is essentially automatic programming: a planning algorithm takes as
input a description of a goal to be achieved, a description of how the world cur-
rently is, and a description of a number of available actions and their effects. The
algorithm then outputs a plan - essentially, a program - which describes how the
available actions can be executed so as to bring about the desired goal. The best-
known, and most influential, planning algorithm was the STRIPS system (Fikes
and Nilsson, 1971). STRIPS was so influential for several reasons. First, it devel-
oped a particular notation for describing actions and their effects that remains
to this day the foundation for most action representation notations. Second, it
emphasized the use of formal, logic-based notations for representing both the
properties of the world and the actions available. Finally, STRIPS was actually
used in an autonomous robot called Shakey at Stanford Research Institute.
The period between the development of STRIPS and the mid-1980s might be
thought of as the 'classic' period in AI planning. There was a great deal of progress
in developing planning algorithms, and understanding the requirements for rep-
resentation formalisms for the world and actions. At the risk of overgeneralizing,
this work can be characterized by two features, both of which were pioneered in
the STRIPS system:
the use of explicit, symbolic representations of the world;
mon approaches, but certain themes did recur in this work. Recurring themes were
the rejection of architectures based on symbolic representations, an emphasis on
a closer coupling between the agent's environment and the action it performs,
and the idea that intelligent behaviour can be seen to emerge from the interaction
of a number of much simpler behaviours.
The term 'furore' might reasonably be used to describe the response from the
symbolic and logical reasoning communities to the emergence of behavioural AI.
Some researchers seemed to feel that behavioural AI was a direct challenge to the
beliefs and assumptions that had shaped their entire academic careers. Not sur-
prisingly, they were not predisposed simply to abandon their ideas and research
programs.
I do not believe there was (or is) a clear cut 'right' or 'wrong' in this debate.
With the benefit of hindsight, it seems clear that much symbolic AI research had wandered into the realms of abstract theory, and did not connect in any realistic way with the reality of building and deploying agents in realistic scenarios. It also seems clear that the decomposition of AI into components such as planning and learning, without any emphasis on synthesizing these components into an integrated architecture, was perhaps not the best strategy for AI as a discipline. But it also seems that some claims made by members of the behavioural community were extreme, and in many cases suffered from the same over-optimism that afflicted AI itself in its early days.
The practical implications of all this were threefold.
The first was that the behavioural AI community to a certain extent split off from the mainstream AI community. Taking inspiration from biological metaphors, many of the researchers in behavioural AI began working in a community that is today known as 'artificial life' (alife).
The second was that mainstream AI began to recognize the importance of integrating the components of intelligent behaviour into agents, and, from the mid-1980s to the present day, the area of agent architectures has grown steadily in importance.
The third was that within AI, the value of testing and deploying agents in
realistic scenarios (as opposed to simple, contrived, obviously unrealistic
scenarios) was recognized. This led to the emergence of such scenarios as
the RoboCup robotic soccer challenge, in which the aim is to build agents
that can actually play a game of soccer against a team of robotic opponents
(RoboCup, 2001).
So, by the mid-1980s, the area of agent architectures was becoming established
as a specific research area within AI itself.
Most researchers in the agent community accept that neither a purely logicist
or reasoning approach nor a purely behavioural approach is the best route to
building agents capable of intelligent autonomous action. Intelligent autonomous
action seems to imply the capability for both reasoning and reactive behaviour.
As Innes Ferguson succinctly put it (Ferguson, 1992a, p. 31):
It is both desirable and feasible to combine suitably designed delibera-
tive and non-deliberative control functions to obtain effective, robust,
and flexible behaviour from autonomous, task-achieving agents oper-
ating in complex environments.
This recognition led to the development of a range of hybrid architectures, which
attempt to combine elements of both behavioural and deliberative systems. At
the time of writing, hybrid approaches dominate in the literature.
Figure A.1  A blackboard architecture: a number of knowledge sources encapsulate knowledge about a problem, and communicate by reading and writing on a shared data structure known as a blackboard.
In a blackboard system, a problem is solved by a number of knowledge sources, each monitoring the blackboard and writing to it when they can contribute partial problem solutions. The blackboard metaphor was neatly described by Alan Newell long before the blackboard model became widely known:
Metaphorically we can think of a set of workers, all looking at the same
blackboard: each is able to read everything that is on it, and to judge
when he has something worthwhile to add to it. This conception is...a
set of demons, each independently looking at the total situation and
shrieking in proportion to what they see fits their natures
(Newell, 1962)
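The control regime sketched in Newell's description can be made concrete in a few lines of code. The following is a minimal, purely illustrative Python sketch (the knowledge sources and the toy problem of assembling a greeting are hypothetical, and bear no relation to Hearsay or any other system discussed here): each knowledge source inspects the shared blackboard and contributes a partial solution when it can, and control simply cycles until no source has anything further to add.

    class KnowledgeSource:
        # A knowledge source watches the blackboard and contributes when it can.
        def can_contribute(self, blackboard):
            raise NotImplementedError
        def contribute(self, blackboard):
            raise NotImplementedError

    class Greeter(KnowledgeSource):
        def can_contribute(self, blackboard):
            return "greeting" not in blackboard
        def contribute(self, blackboard):
            blackboard["greeting"] = "Hello"

    class Addressee(KnowledgeSource):
        def can_contribute(self, blackboard):
            return "greeting" in blackboard and "sentence" not in blackboard
        def contribute(self, blackboard):
            blackboard["sentence"] = blackboard["greeting"] + ", world"

    def run(blackboard, sources):
        # Let any source that can contribute write to the blackboard, until quiescence.
        progress = True
        while progress:
            progress = False
            for ks in sources:
                if ks.can_contribute(blackboard):
                    ks.contribute(blackboard)
                    progress = True
        return blackboard

    print(run({}, [Addressee(), Greeter()]))   # {'greeting': 'Hello', 'sentence': 'Hello, world'}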
The first, and probably best-known, blackboard system was the Hearsay system for speech understanding, developed in the early 1970s under the supervision of Raj Reddy (Reddy et al., 1973). One of Reddy's co-workers on the Hearsay project was Vic-
tor ('Vic') Lesser, who moved to the University of Massachusetts at Amherst in
1977. Lesser had worked on multiprocessing computer systems in the 1960s,
and was well aware of the potential value of parallelism. He recognized that the
blackboard model, with its multiple knowledge sources each contributing partial
solutions to the overall problem, provided a natural metaphor for problem solving
The notion of objects communicating by message passing was a key idea in early object-oriented programming systems, but has been somewhat obscured in languages such as C++ and Java. Message passing in Smalltalk was essentially method invocation.
Hewitt argued that computing itself was inexorably tied to distributed, open systems (Hewitt, 1985), and that
traditional models of computation were not well suited for modelling or under-
standing such distributed computation. The ACTOR paradigm was his attempt to
develop a model of computation that more accurately reflected the direction in
which computer science was going.
An ACTOR is a computational system with the following properties.
ACTORs are social - they are able to send messages to other ACTORs.
ACTORs are reactive - they carry out computation in response to a message
received from another ACTOR. (ACTORs are thus message driven.)
Intuitively, an actor can be considered as consisting of
a mail address which names the ACTOR;
a behaviour, which specifies what the ACTOR will do upon receipt of a mes-
sage.
The possible components of an ACTOR's behaviour are
sending messages to itself or other ACTORs;
creating more actors;
specifying a replacement behaviour.
Intuitively, the way an ACTOR works is quite simple:
upon receipt of a message, the message is matched against the ACTOR's
behaviour (script);
upon a match, the corresponding action is executed, which may involve
sending more messages, creating more ACTORs, or replacing the ACTOR
by another.
An example ACTOR, which computes the factorial of its argument, is shown in
Figure A.2 (from Agha, 1986). Receipt of a message containing a non-zero integer
n by Factorial will result in the following behaviour:
create an ACTOR whose behaviour will be to multiply n by an integer it
receives and send the reply to the mail address to which the factorial of n
was to be sent;
send itself the 'request' to evaluate the factorial of n - 1 and send the value
to the customer it created.
The creation of ACTORs in this example mirrors the recursive procedures for
computing factorials in more conventional programming languages.
    Rec-Factorial with acquaintances self
        let communication be an integer n and a customer u
        become Rec-Factorial
        if n = 0
            then send [1] to customer
            else let c = Rec-Customer with acquaintances n and u
                 {send [n - 1, mail address of c] to self}

    Rec-Customer with acquaintances integer n and customer u
        let communication be an integer k
        {send [n * k] to u}

Figure A.2  An example ACTOR for computing factorials (from Agha, 1986).

The ACTOR paradigm greatly influenced work on concurrent object languages (Agha et al., 1993). Particularly strong communities working on concurrent object languages emerged in France (led by Jacques Ferber and colleagues (Ferber and Carle, 1991)) and Japan (led by Akinori Yonezawa, Mario Tokoro and colleagues (Yonezawa and Tokoro, 1997; Yonezawa, 1990; Sueyoshi and Tokoro, 1991)).
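The behaviour of Figure A.2 can be mirrored in a conventional language. The following Python sketch is purely illustrative (the tiny 'actor system' here, a single queue of pending messages processed one at a time, is a hypothetical stand-in for a real ACTOR implementation, with no genuine concurrency): Rec-Factorial creates a customer for each recursive step, and the chain of customers multiplies the results back together.

    from collections import deque

    class Printer:
        # A customer that simply prints the value it receives.
        def receive(self, message, system):
            print(message)

    class RecCustomer:
        # On receipt of k, send n * k on to the original customer u.
        def __init__(self, n, u):
            self.n, self.u = n, u
        def receive(self, k, system):
            system.send(self.u, self.n * k)

    class RecFactorial:
        # On receipt of (n, customer): if n == 0 reply 1, otherwise create a
        # customer for n and ask ourselves for the factorial of n - 1.
        def receive(self, message, system):
            n, customer = message
            if n == 0:
                system.send(customer, 1)
            else:
                c = RecCustomer(n, customer)
                system.send(self, (n - 1, c))

    class ActorSystem:
        # Mail addresses are just object references; messages are queued and
        # delivered one at a time.
        def __init__(self):
            self.queue = deque()
        def send(self, actor, message):
            self.queue.append((actor, message))
        def run(self):
            while self.queue:
                actor, message = self.queue.popleft()
                actor.receive(message, self)

    system = ActorSystem()
    system.send(RecFactorial(), (5, Printer()))
    system.run()   # prints 120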
In the late 1970s at Stanford University in California, a doctoral student called
Reid Smith was completing his PhD on a system called the Contract Net, in which
a number of agents ('problem solving nodes' in Smith's parlance) solved problems
by delegating sub-problems to other agents (Smith, 1977, 1980a,b). As the name
suggests, the key metaphor is that of sub-contracting in human organizations.
The Contract Net remains to this day one of the most influential multiagent sys-
tems developed. It introduced several key concepts into the multiagent systems
literature, including the economics metaphor and the negotiation metaphor.
Smith's thesis was published in 1980, a year also notable for the emergence of
the first academic forum for research specifically devoted to the new paradigm of
multiagent systems. Randy Davis from MIT organized the first workshop on what
was then called 'Distributed Artificial Intelligence' (DAI) (Davis, 1980). Throughout
the 1980s, the DAI workshops, held more or less annually in the USA, became the
main focus of activity for the new community. The 1985 workshop, organized
by Michael Genesereth and Matt Ginsberg of Stanford University, was particularly
important as the proceedings were published as the first real book on the field: the
'green book', edited by Michael Huhns (Huhns, 1987). The proceedings of the 1988
workshop, held at Lake Arrowhead in California, were published two years later
as the 'second green book' (Gasser and Huhns, 1989). Another key publication at
this time was Bond and Gasser's 1988 collection Readings in Distributed Artificial
Intelligence (Bond and Gasser, 1988). This volume brought together many of the
key papers of the field. It was prefaced with a detailed and insightful survey article,
which attempted to summarize the key problems and issues facing the field; the
survey remains relevant even at the time of writing.
Until about the mid-1980s the emphasis was on 'parallelism in problem solv-
ing', or distributed problem solving as it became known. In other words, the main
type of issue being addressed was 'given a problem, how can we exploit multiple
was the spread of the Internet, which through the 1990s changed from being
a tool unknown outside academia to something in everyday use for commerce
and leisure across the globe. In many ways, 1994 seems to have been a mile-
stone year for agents. The first is that 1994 was the year that the Web emerged;
the Mosaic browser only began to reach a truly wide audience in 1994 (Berners-
Lee, 1999). The Web provided an easy-to-use front end for the Internet, enabling
people with very limited IT training to productively use the Internet for the first
time. The explosive growth of the Internet was perhaps the most vivid illustra-
tion possible that the future of computing lay in distributed, networked systems,
and that in order to exploit the potential of such distributed systems, new mod-
els of computation were required. By the summer of 1994 it was becoming clear
that the Internet would be a major proving ground for agent technology (perhaps
even the 'killer application'), although the full extent of this interest was not yet
apparent.
As well as the emergence of the Web, 1994 saw the publication in July of a spe-
cial issue of Communications of the ACM that was devoted to intelligent agents.
CACM is one of the best-known publications in the computing world, and ACM
is arguably its foremost professional body; the publication of a special issue of
CACM on agents was therefore some kind of recognition by the computing world
that agents were worth knowing about. Many of the articles in this special issue
described a new type of agent system that acted as a kind of 'expert assistant'
to a user working with a particular class of application. The vision of agents as
intelligent assistants was perhaps articulated most clearly by Pattie Maes from
MIT's Media Lab, who described a number of prototype systems to realize the
vision (Maes, 1994a). Such user interface agents rapidly caught the imagination of
a wider community, and, in particular, the commercial possibilities of such tech-
nologies was self-evident. A number of agent startup companies were founded
to commercialize this technology, including Pattie Maes's company Firefly (sub-
sequently sold to Microsoft), and Oren Etzioni's company NetBot (subsequently
sold to the Web portal company Excite).
With the growth of the Internet in the late 1990s came electronic commerce
(e-commerce), and the rapid international expansion of 'dot com' companies. It
was quickly realized that e-commerce represents a natural - and potentially very
lucrative - application domain for multiagent systems. The idea is that agents
can partially automate many of the stages of electronic commerce, from finding a
product to buy, through to actually negotiating the terms of agreement (Noriega
and Sierra, 1999). This area of agent-mediated electronic commerce became per-
haps the largest single application area for agent technology by the turn of the
century, and gave an enormous impetus (commercial, as well as scientific) to the
areas of negotiation and auctions in agent systems. Researchers such as Sarit
Kraus, Carles Sierra, Tuomas Sandholm, Moshe Tennenholtz, and Makoto Yokoo
investigated the theoretical foundations of agent-mediated electronic commerce
(building to a great extent on the pioneering work of Jeff Rosenschein and colleagues).
Afterword
I began this book by pointing to some trends that have so far marked the short
history of computer science: ubiquity, interconnection, intelligence, delegation,
and human-orientation. I claimed that these trends naturally led to the emergence
of the multiagent systems paradigm. I hope that after reading this book, you will
agree with this claim.
After opening this book by talking about the history of computing, you may
expect me to close it by talking about its future. But prediction, as Niels Bohr
famously remarked, is hard - particularly predicting the future. Rather than mak-
ing specific predictions about the future of computing, I will therefore restrict my
observations to some hopefully rather uncontentious (and safe) points. The most
important of these is simply that these trends will continue. Computer systems
will continue to be ever more ubiquitous and interconnected; we will continue to
delegate ever more tasks to computers, and these tasks will be increasingly com-
plex, requiring ever more intelligence to successfully carry them out; and, finally,
the way in which we interact with computers will increasingly resemble the way
in which we interact with each other.
Douglas Adams, author of the well-known Hitch Hiker's Guide to the Galaxy
books, was also, in the final years of his life, a commentator on the computer
industry. In a radio programme broadcast by the BBC shortly before his death, he
predicted that, eventually, computers and processor power will become as cheap
and common as grains of sand. Imagine such a world, in which every device cre-
ated by humans is equipped with processor power, and is capable of interacting
with any other device, or any person, anywhere in the world. Outlandish - pre-
posterous - as it may seem, this future follows directly from the trends that I
discussed above. Now imagine the potential in this vision. Those of us old enough
to have worked with computers before 1993 will recall the sense of awe as we
realized what might be possible with the Web. But this pales into insignificance
next to the possibilities of this, as yet distant, future Internet.
Note that the plumbing for this future - the processors and the network connec-
tions to link them - is the easy part. The difficult part - the real challenge - is the
software to realize its potential. I do not know exactly what software technologies
will be deployed to make this future happen. But it seems to me - and to many
other researchers - that multiagent systems are the best candidate we currently
have. It does not matter whether we call them agents or not; in 20 years, the term
may not be used. The key thing is that the problems being addressed by the agent
community are exactly the problems that I believe will need to be solved to realize
the potential.
References
Bradshaw, J., Dutfield, S., Benoit, P. and Woolley, J. D. (1997) KAoS: towards an industrial
strength open agent architecture. In Software Agents (ed. J. Bradshaw), pp. 375-418.
MIT Press, Cambridge, MA.
Bratman, M. E. (1987) Intention, Plans and Practical Reason. Harvard University Press, Cam-
bridge, MA.
Bratman, M. E. (1990) What is intention? In Intentions in Communication (eds P. R. Cohen,
J. L. Morgan and M. E. Pollack), pp. 15-32. MIT Press, Cambridge, MA.
Bratman, M. E., Israel, D. J. and Pollack, M. E. (1988) Plans and resource-bounded practical
reasoning. Computational Intelligence, 4, 349-355.
Brazier, F. et al. (1995) Formal specification of multi-agent systems: a real-world case. In Proceedings of the 1st International Conference on Multi-Agent Systems (ICMAS-95), San Francisco, CA, pp. 25-32.
Bretier, P. and Sadek, D. (1997) A rational agent as the kernel of a cooperative spoken
dialogue system: implementing a logical theory of interaction. In Intelligent Agents, III
(eds J. P. Müller, M. Wooldridge and N. R. Jennings), LNAI Volume 1193, pp. 189-204.
Springer, Berlin.
Breugst, M. et al. (1999) Grasshopper: an agent platform for mobile agent-based services
in fixed and mobile telecommunications environments. In Software Agents for Future
Communication Systems (eds A. L. G. Hayzelden and J. Bigham), pp. 326-357. Springer,
Berlin.
Brewington, B. et al. (1999) Mobile agents for distributed information retrieval. In Intelligent Information Agents (ed. M. Klusch), pp. 355-395. Springer, Berlin.
Brooks, R. A. (1986) A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, 2(1), 14-23.
Brooks, R. A. (1990) Elephants don't play chess. In Designing Autonomous Agents (ed. P. Maes), pp. 3-15. MIT Press, Cambridge, MA.
Brooks, R. A. (1991a) Intelligence without reason. In Proceedings of the 12th International Joint Conference on Artificial Intelligence (IJCAI-91), Sydney, Australia, pp. 569-595.
Brooks, R. A. (1991b) Intelligence without representation. Artificial Intelligence, 47, 139-
159.
Brooks, R. A. (1999) Cambrian Intelligence. MIT Press, Cambridge, MA.
Burmeister, B. (1996) Models and methodologies for agent-oriented analysis and design. In Working Notes of the KI'96 Workshop on Agent-Oriented Programming and Distributed Systems (ed. K. Fischer), DFKI Document D-96-06. DFKI.
Busetta, P. et al. (2000) Structuring BDI agents in functional clusters. In Intelligent Agents, VI - Proceedings of the 6th International Workshop on Agent Theories, Architectures and Languages (ATAL-99) (eds N. Jennings and Y. Lesperance), LNAI Volume 1757, pp. 277-289. Springer, Berlin.
Bylander, T. (1994) The computational complexity of propositional STRIPS planning. Artificial Intelligence, 69(1-2), 165-204.
Cammarata, S., McArthur, D. and Steeb, R. (1983) Strategies of cooperation in distributed problem solving. In Proceedings of the 8th International Joint Conference on Artificial Intelligence (IJCAI-83), Karlsruhe, Federal Republic of Germany, pp. 767-770.
Carriero, N. and Gelernter, D. (1989) Linda in context. Communications of the ACM, 32(4), 444-458.
Castelfranchi, C. (1990) Social power. In Decentralized AI - Proceedings of the 1st European Workshop on Modelling Autonomous Agents in a Multi-Agent World (MAAMAW-89) (eds Y. Demazeau and J.-P. Müller), pp. 49-62. Elsevier, Amsterdam.
Conte, R. and Gilbert, N. (1995) Computer simulation for social theory. In Artificial Societies: The Computer Simulation of Social Life (eds N. Gilbert and R. Conte), pp. 1-15. UCL Press, London.
Corkill, D. D., Gallagher, K. Q. and Johnson, P. M. (1987) Achieving flexibility, efficiency and generality in blackboard architectures. In Proceedings of the 6th National Conference on Artificial Intelligence (AAAI-87), Seattle, WA, pp. 18-23.
DAML (2001) The DARPA agent markup language. See http://www.daml.org/.
Davidsson, P. (2001) Multi agent based simulation: beyond social simulation. In Multi-
Agent-Based Simulation, LNAI Volume 1979, pp. 97-107. Springer, Berlin.
Davis, R. (1980) Report on the workshop on Distributed AI. ACM SIGART Newsletter, 73, 42-52.
Decker, K. S. (1996) TÆMS: a framework for environment centred analysis and design of coordination algorithms. In Foundations of Distributed Artificial Intelligence (eds G. M. P. O'Hare and N. R. Jennings), pp. 429-447. John Wiley and Sons, Chichester.
Decker, K. and Lesser, V. (1995) Designing a family of coordination algorithms. In Proceedings of the 1st International Conference on Multi-Agent Systems (ICMAS-95), San Francisco, CA, pp. 73-80.
Decker, K. S., Durfee, E. H. and Lesser, V. R. (1989) Evaluating research in cooperative
distributed problem solving. In Distributed Artificial Intelligence (eds L. Gasser and
M. Huhns), Volume II, pp. 487-519. Pitman, London and Morgan Kaufmann, San Mateo,
CA.
Decker, K., Sycara, K. and Williamson, M. (1997) Middle-agents for the Internet. In Pro-
ceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI-97),
Nagoya, Japan.
Decker, S. (2000) The semantic Web: the roles of XML and RDF. IEEE Internet Computing, 4(5), 63-74.
Demazeau, Y. and Müller, J.-P. (eds) (1990) Decentralized AI - Proceedings of the 1st European Workshop on Modelling Autonomous Agents in a Multi-Agent World (MAAMAW-89). Elsevier, Amsterdam.
Dennett, D. C. (1978) Brainstorms. MIT Press, Cambridge, MA.
Dennett, D. C. (1987) The Intentional Stance. MIT Press, Cambridge, MA.
Dennett, D. C. (1996) Kinds of Minds. London: Phoenix.
Depke, R., Heckel, R. and Kuester, J. M. (2001) Requirement specification and design of agent-based systems with graph transformation, roles and UML. In Agent-Oriented Software Engineering - Proceedings of the 1st International Workshop AOSE-2000 (eds P. Ciancarini and M. Wooldridge), LNCS Volume 1957, pp. 105-120. Springer, Berlin.
Devlin, K. (1991) Logic and Information. Cambridge University Press, Cambridge.
Dignum, F. (1999) Autonomous agents with norms. Artificial Intelligence and Law, 7, 69-79.
Dignum, F. and Greaves, M. (eds) (2000) Issues in Agent Communication, LNAI Vol-
ume 1916. Springer, Berlin.
Dimopoulos, Y., Nebel, B. and Toni, F. (1999) Preferred arguments are harder to compute than stable extensions. In Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI-99), Stockholm, Sweden, pp. 36-41.
d'Inverno, M. and Luck, M. (1996) Formalising the contract net as a goal-directed system. In Agents Breaking Away - Proceedings of the 7th European Workshop on Modelling Autonomous Agents in a Multi-Agent World, MAAMAW-96 (eds J. van de Velde and W. Perram), LNAI Volume 1038, pp. 72-85. Springer, Berlin.
d'Inverno, M. and Luck, M. (2001) Understanding Agent Systems. Springer, Berlin.
EBAY (2001) The eBay online marketplace. See http://www.ebay.com/.
Eliasmith, C. (1999) Dictionary of the philosophy of mind. Online at http://www.artsci.wustl.edu/~philos/MindDict/.
Emerson, E. A. (1990) Temporal and modal logic. In Handbook of Theoretical Computer Science. Volume B: Formal Models and Semantics (ed. J. van Leeuwen), pp. 996-1072. Elsevier, Amsterdam.
Emerson, E. A. and Halpern, J. Y. (1986) 'Sometimes' and 'not never' revisited: on branching time versus linear time temporal logic. Journal of the ACM, 33(1), 151-178.
Enderton, H. B. (1972) A Mathematical Introduction to Logic. Academic Press, London.
Engelmore, R. and Morgan, T. (eds) (1988) Blackboard Systems. Addison-Wesley, Reading, MA.
Ephrati, E. and Rosenschein, J. S. (1993) Multi-agent planning as a dynamic search for social consensus. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI-93), Chambery, France, pp. 423-429.
Erman, L. D. et al. (1980) The Hearsay-II speech-understanding system: integrating knowledge to resolve uncertainty. Computing Surveys, 12(2), 213-253.
Etzioni, O. (1993) Intelligence without robots. AI Magazine, 14(4).
Etzioni, O. (1996) Moving up the information food chain: deploying softbots on the World Wide Web. In Proceedings of the 13th National Conference on Artificial Intelligence (AAAI-96), Portland, OR, pp. 4-8.
Etzioni, O. and Weld, D. S. (1995) Intelligent agents on the Internet: fact, fiction and forecast. IEEE Expert, 10(4), 44-49.
Fagin, R. et al. (1995) Reasoning About Knowledge. MIT Press, Cambridge, MA.
Fagin, R. et al. (1997) Knowledge-based programs. Distributed Computing, 10(4), 199-225.
Fagin, R., Halpern, J. Y. and Vardi, M. Y. (1992) What can machines know? On the properties of knowledge in distributed systems. Journal of the ACM, 39(2), 328-376.
Farquhar, A., Fikes, R. and Rice, J. (1997) The Ontolingua server: a tool for collaborative ontology construction. International Journal of Human-Computer Studies, 46, 707-727.
Fennell, R. D. and Lesser, V. R. (1977) Parallelism in Artificial Intelligence problem solving: a case study of Hearsay II. IEEE Transactions on Computers, C-26(2), 98-111. (Also published in Readings in Distributed Artificial Intelligence (eds A. H. Bond and L. Gasser), pp. 106-119. Morgan Kaufmann, 1988.)
Fensel, D. and Musen, M. A. (2001) The semantic Web. IEEE Intelligent Systems, 16(2), 24-25.
Fensel, D. et al. (2001) The semantic Web. IEEE Intelligent Systems, 16(2), 24-25.
Ferber, J. (1996) Reactive distributed artificial intelligence. In Foundations of Distributed Artificial Intelligence (eds G. M. P. O'Hare and N. R. Jennings), pp. 287-317. John Wiley and Sons, Chichester.
Ferber, J. (1999) Multi-Agent Systems. Addison-Wesley, Reading, MA.
Ferber, J. and Carle, P. (1991) Actors and agents as reflective concurrent objects: a MERING IV perspective. IEEE Transactions on Systems, Man and Cybernetics.
Ferguson, I. A. (1992a) TouringMachines: an Architecture for Dynamic, Rational, Mobile Agents. PhD thesis, Clare Hall, University of Cambridge, UK. (Also available as technical report no. 273, University of Cambridge Computer Laboratory.)
Ferguson, I. A. (1992b) Towards an architecture for adaptive, rational, mobile agents. In Decentralized AI 3 - Proceedings of the 3rd European Workshop on Modelling Autonomous Agents in a Multi-Agent World (MAAMAW-91) (eds E. Werner and Y. Demazeau), pp. 249-262. Elsevier, Amsterdam.
Ferguson, I. A. (1995) Integrated control and coordinated behaviour: a case for agent models. In Intelligent Agents: Theories, Architectures and Languages (eds M. Wooldridge and N. R. Jennings), LNAI Volume 890, pp. 203-218. Springer, Berlin.
Fikes, R. E. and Nilsson, N. (1971) STRIPS: a new approach to the application of theorem
proving to problem solving. Artificial Intelligence, 2, 189-208.
Findler, N. V. and Lo, R. (1986) An examination of Distributed Planning in the world of air
traffic control. Journal of Parallel and Distributed Computing, 3.
Findler, N. and Malyankar, R. (1993) Alliances and social norms in societies of non-
homogenous, interacting agents. In Simulating Societies-93: Pre-proceedings of the 1993
International Symposium on Approaches to Simulating Social Phenomena and Social Pro-
cesses, Certosa di Pontignano, Siena, Italy.
Finin, T. et al. (1993) Specification of the KQML agent communication language. DARPA
knowledge sharing initiative external interfaces working group.
FIPA (1999) Specification part 2 - agent communication language. The text refers to the
specification dated 16 April 1999.
FIPA (2001) The foundation for intelligent physical agents. See http://www.fipa.org/.
Firby, J. A. (1987) An investigation into reactive planning in complex domains. In Proceedings of the 10th International Joint Conference on Artificial Intelligence (IJCAI-87), Milan, Italy, pp. 202-206.
Fischer, K., Müller, J. P. and Pischel, M. (1996) A pragmatic BDI architecture. In Intelligent Agents, II (eds M. Wooldridge, J. P. Müller and M. Tambe), LNAI Volume 1037, pp. 203-218. Springer, Berlin.
Fisher, M. (1994) A survey of Concurrent MetateM - the language and its applications. In Temporal Logic - Proceedings of the 1st International Conference (eds D. M. Gabbay and H. J. Ohlbach), LNAI Volume 827, pp. 480-505. Springer, Berlin.
Fisher, M. (1995) Representing and executing agent-based systems. In Intelligent Agents:
Theories, Architectures and Languages (eds M. Wooldridge and N. R. Jennings), LNAI
Volume 890, pp. 307-323. Springer, Berlin.
Fisher, M. (1996) An introduction to executable temporal logic. The Knowledge Engineering Review, 11(1), 43-56.
Fisher, M. and Wooldridge, M. (1997) On the formal specification and verification of multi-
agent systems. International Journal of Cooperative Information Systems, 6(1), 37-65.
Fox, J., Krause, P. and Ambler, S. (1992) Arguments, contradictions and practical reasoning. In Proceedings of the 10th European Conference on Artificial Intelligence (ECAI-92), Vienna, Austria, pp. 623-627.
Francez, N. (1986) Fairness. Springer, Berlin.
Franklin, S. and Graesser, A. (1997) Is it an agent, or just a program? In Intelligent Agents, III (eds J. P. Müller, M. Wooldridge and N. R. Jennings), LNAI Volume 1193, pp. 21-36. Springer, Berlin.
Freeman, E., Hupfer, S. and Arnold, K. (1999) JavaSpaces Principles, Patterns and Practice. Addison-Wesley, Reading, MA.
Gabbay, D. (1989) Declarative past and imperative future. In Proceedings of the Colloquium on Temporal Logic in Specification (eds B. Banieqbal, H. Barringer and A. Pnueli), LNCS Volume 398, pp. 402-450. Springer, Berlin.
Galliers, J. R. (1988a) A strategic framework for multi-agent cooperative dialogue. In Proceedings of the 8th European Conference on Artificial Intelligence (ECAI-88), Munich, Germany, pp. 415-420.
Galliers, J. R. (1988b) A Theoretical Framework for Computer Models of Cooperative Dialogue, Acknowledging Multi-Agent Conflict. PhD thesis, The Open University, UK.
Galliers, J. R. (1990) The positive role of conflict in cooperative multi-agent systems. In Decentralized AI - Proceedings of the 1st European Workshop on Modelling Autonomous Agents in a Multi-Agent World (MAAMAW-89) (eds Y. Demazeau and J.-P. Müller), pp. 33-48. Elsevier, Amsterdam.
Gilbert, N. and Conte, R. (eds) (1995) Artificial Societies: the Computer Simulation of Social Life. UCL Press, London.
Gilbert, N. and Doran, J. (eds) (1994) Simulating Societies. UCL Press, London.
Gilkinson, H., Paulson, S. F. and Sikkink, D. E. (1954) Effects of order and authority in argumentative speech. Quarterly Journal of Speech, 40, 183-192.
Ginsberg, M. L. (1989) Universal planning: an (almost) universally bad idea. AI Magazine, 10(4), 40-44.
Ginsberg, M. L. (1991) Knowledge interchange format: the KIF of death. AI Magazine, 12(3), 57-63.
Gmytrasiewicz, P. and Durfee, E. H. (1993) Elements of a utilitarian theory of knowledge and action. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI-93), Chambery, France, pp. 396-402.
Goldberg, A. (1984) SMALLTALK-80: the Interactive Programming Language. Addison-
Wesley, Reading, MA.
Goldblatt, R. (1987) Logics of Time and Computation (CSLI Lecture Notes number 7). Cen-
ter for the Study of Language and Information, Ventura Hall, Stanford, CA 94305. (Dis-
tributed by Chicago University Press.)
Goldman, C. V. and Rosenschein, J. S. (1993) Emergent coordination through the use of
cooperative state-changing rules. In Proceedings of the 12th International Workshop on Distributed Artificial Intelligence (IWDAI-93), Hidden Valley, PA, pp. 171-186.
Goldman, R. P. and Lang, R. R. (1991) Intentions in time. Technical report TUTR 93-101,
Tulane University.
Gray, R. S. (1996) Agent Tcl: a flexible and secure mobile agent system. In Proceedings of the 4th Annual Tcl/Tk Workshop, Monterey, CA, pp. 9-23.
Greif, I. (1994) Desktop agents in group-enabled products. Communications of the ACM,
37(7), 100-105.
Grosz, B. and Kraus, S. (1993) Collaborative plans for group activities. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI-93), Chambery, France, pp. 367-373.
Grosz, B. J. and Kraus, S. (1999) The evolution of SharedPlans. In Foundations of Rational Agency (eds M. Wooldridge and A. Rao), pp. 227-262. Kluwer Academic, Boston, MA.
Grosz, B. J. and Sidner, C. L. (1990) Plans for discourse. In Intentions in Communication (eds P. R. Cohen, J. Morgan and M. E. Pollack), pp. 417-444. MIT Press, Cambridge, MA.
Gruber, T. R. (1991) The role of common ontology in achieving sharable, reusable knowledge bases. In Proceedings of Knowledge Representation and Reasoning (KR&R-91) (eds R. Fikes and E. Sandewall). Morgan Kaufmann, San Mateo, CA.
Guilfoyle, C., Jeffcoate, J. and Stark, H. (1997) Agents on the Web: Catalyst for E-Commerce. Ovum Ltd, London.
Guttman, R. H., Moukas, A. G. and Maes, P. (1998) Agent-mediated electronic commerce: a survey. The Knowledge Engineering Review, 13(2), 147-159.
Haddadi, A. (1996) Communication and Cooperation in Agent Systems, LNAI Volume 1056. Springer, Berlin.
Halpern, J. Y. (1986) Reasoning about knowledge: an overview. In Proceedings of the 1986 Conference on Theoretical Aspects of Reasoning About Knowledge (ed. J. Y. Halpern), pp. 1-18. Morgan Kaufmann, San Mateo, CA.
Halpern, J. Y. (1987) Using reasoning about knowledge to analyze distributed systems. Annual Review of Computer Science, 2, 37-68.
Halpern, J. Y. and Moses, Y. (1992) A guide to completeness and complexity for modal logics of knowledge and belief. Artificial Intelligence, 54, 319-379.
Halpern, J. Y. and Vardi, M. Y. (1989) The complexity of reasoning about knowledge and time. I. Lower bounds. Journal of Computer and System Sciences, 38, 195-237.
Harel, D. (1979) First-Order Dynamic Logic, LNCS Volume 68. Springer, Berlin.
Harel, D. (1984) Dynamic logic. In Handbook of Philosophical Logic, II: Extensions of Classical Logic (eds D. Gabbay and F. Guenther), pp. 497-604. D. Reidel, Dordrecht. (Synthese Library, Volume 164.)
Harel, D., Kozen, D. and Tiuryn, J. (2000) Dynamic Logic. MIT Press, Cambridge, MA.
Haugeneder, H., Steiner, D. and McCabe, F. G. (1994) IMAGINE: a framework for building multi-agent systems. In Proceedings of the 1994 International Working Conference on Cooperating Knowledge Based Systems (CKBS-94), DAKE Centre, University of Keele, UK (ed. S. M. Deen), pp. 31-64.
Hayes-Roth, B. (1985) A blackboard architecture for control. Artificial Intelligence, 26, 251-321.
Hayes-Roth, F., Waterman, D. A. and Lenat, D. B. (eds) (1983) Building Expert Systems. Addison-Wesley, Reading, MA.
Hayzelden, A. L. G. and Bigham, J. (eds) (1999) Software Agents for Future Communication
Systems. Springer, Berlin.
Hendler, J. (2001) Agents and the semantic Web. IEEE Intelligent Systems, 16(2),30-37.
Hewitt, C. (1971) Description and Theoretical Analysis (Using Schemata) of PLANNER: a Language for Proving Theorems and Manipulating Models in a Robot. PhD thesis, Artificial Intelligence Laboratory, Massachusetts Institute of Technology.
Hewitt, C. (1973) A universal modular ACTOR formalism for AI. In Proceedings of the 3rd International Joint Conference on Artificial Intelligence (IJCAI-73), Stanford, CA, pp. 235-245.
Hewitt, C. (1977) Viewing control structures as patterns of passing messages. Artificial Intelligence, 8(3), 323-364.
Hewitt, C. (1985) The challenge of open systems. Byte, 10(4), 223-242.
Hewitt, C. E. (1986) Offices are open systems. ACM Transactions on Office Information Systems, 4(3), 271-287.
Hindriks, K. V., de Boer, F. S., van der Hoek, W. and Meyer, J.-J. C. (1998) Formal semantics for an abstract agent programming language. In Intelligent Agents, IV (eds M. P. Singh, A. Rao and M. J. Wooldridge), LNAI Volume 1365, pp. 215-230. Springer, Berlin.
Hindriks, K. V. et al. (1999) Control structures of rule-based agent languages. In Intelligent Agents, V (eds J. P. Müller, M. P. Singh and A. S. Rao), LNAI Volume 1555. Springer, Berlin.
Hintikka, J. (1962) Knowledge and Belief. Cornell University Press, Ithaca, NY.
Hoare, C. A. R. (1969) An axiomatic basis for computer programming. Communications of the ACM, 12(10), 576-583.
Hoare, C. A. R. (1978) Communicating sequential processes. Communications of the ACM, 21, 666-677.
Holzmann, G. (1991) Design and Validation of Computer Protocols. Prentice-Hall International, Hemel Hempstead, UK.
Huber, M. (1999) JAM: a BDI-theoretic mobile agent architecture. In Proceedings of the 3rd International Conference on Autonomous Agents (Agents 99), Seattle, WA, pp. 236-243.
Hughes, G. E. and Cresswell, M. J. (1968) Introduction to Modal Logic. Methuen and Co., Ltd.
Huhns, M. (ed.) (1987) Distributed Artificial Intelligence. Pitman, London and Morgan Kaufmann, San Mateo, CA.
Huhns, M. N. (2001) Interaction-oriented programming. In Agent-Oriented Software Engi-
neering - Proceedings of the 1st International Workshop AOSE-2000 (eds P. Ciancarini
and M. Wooldridge), LNCS Volume 1957, pp. 29-44. Springer, Berlin.
Huhns, M. and Singh, M. P. (eds) (1998) Readings in Agents. Morgan Kaufmann, San Mateo,
CA.
Knabe, F. C. (1995) Language Support for Mobile Agents. PhD thesis, School of Computer
Science, Carnegie-Mellon University, Pittsburgh, PA. (Also published at technical report
CMU-CS-95-223.)
Konolige, K. (1986) A Deduction Model of Belief. Pitman, London and Morgan Kaufmann, San Mateo, CA.
Konolige, K. (1988) Hierarchic autoepistemic theories for nonmonotonic reasoning: preliminary report. In Nonmonotonic Reasoning - Proceedings of the Second International Workshop (eds M. Reinfrank et al.), LNAI Volume 346, pp. 42-59. Springer, Berlin.
Konolige, K. and Pollack, M. E. (1993) A representationalist theory of intention. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI-93), Chambery, France, pp. 390-395.
Kotz, D. et al. (1997) Agent Tcl: targeting the needs of mobile computers. IEEE Internet Computing, 1(4), 58-67.
Kraus, S. (1997) Negotiation and cooperation in multi-agent environments. Artificial Intel-
ligence, 94(1-2), 79-98.
Kraus, S. (2001) Strategic Negotiation in Multiagent Environments. MIT Press, Cambridge,
MA.
Kraus, S. and Lehmann, D. (1988) Knowledge, belief and time. Theoretical Computer Sci-
ence, 58, 155-174.
Kraus, S., Sycara, K. and Evenchik, A. (1998) Reaching agreements through argumentation: a logical model and implementation. Artificial Intelligence, 104, 1-69.
Krause, P. et al. (1995) A logic of argumentation for reasoning under uncertainty. Computational Intelligence, 11, 113-131.
Kripke, S. (1963) Semantical analysis of modal logic. Zeitschrift für Mathematische Logik und Grundlagen der Mathematik, 9, 67-96.
Kuokka, D. R. and Harada, L. P. (1996) Issues and extensions for information matchmaking protocols. International Journal of Cooperative Information Systems, 5(2-3), 251-274.
Kupferman, O. and Vardi, M. Y. (1997) Synthesis with incomplete information. In Proceedings of the 2nd International Conference on Temporal Logic, Manchester, UK, pp. 91-106.
Labrou, Y., Finin, T. and Peng, Y. (1999) Agent communication languages: the current landscape. IEEE Intelligent Systems, 14(2), 45-52.
Lander, S., Lesser, V. R. and Connell, M. E. (1991) Conflict resolution strategies for cooperating expert agents. In CKBS-90 - Proceedings of the International Working Conference on Cooperating Knowledge Based Systems (ed. S. M. Deen), pp. 183-200. Springer, Berlin.
Lange, D. B. and Oshima, M. (1999) Programming and Deploying Java Mobile Agents with
Aglets. Addison-Wesley, Reading, MA.
Langton, C. (ed.) (1989) Artificial Life. Santa Fe Institute Studies in the Sciences of Com-
plexity. Addison-Wesley, Reading, MA.
Lenat, D. B. (1975) BEINGS: knowledge as interacting experts. In Proceedings of the 4th International Joint Conference on Artificial Intelligence (IJCAI-75), Stanford, CA, pp. 126-133.
Lesperance, Y. et al. (1996) Foundations of a logical approach to agent programming. In Intelligent Agents, II (eds M. Wooldridge, J. P. Müller and M. Tambe), LNAI Volume 1037, pp. 331-346. Springer, Berlin.
Lesser, V. R. and Corkill, D. D. (1981) Functionally accurate, cooperative distributed systems. IEEE Transactions on Systems, Man and Cybernetics, 11(1), 81-96.
Lesser, V. R. and Corkill, D. D. (1988) The distributed vehicle monitoring testbed: a tool for investigating distributed problem solving networks. In Blackboard Systems (eds R. Engelmore and T. Morgan), pp. 353-386. Addison-Wesley, Reading, MA.
Lesser, V. R. and Erman, L. D. (1980) Distributed interpretation: a model and experiment. IEEE Transactions on Computers, C-29(12), 1144-1163.
Mayfield, J., Labrou, Y. and Finin, T. (1996) Evaluating KQML as an agent communication language. In Intelligent Agents, II (eds M. Wooldridge, J. P. Müller and M. Tambe), LNAI Volume 1037, pp. 347-360. Springer, Berlin.
Merz, M., Lieberman, B. and Lamersdorf, W. (1997) Using mobile agents to support inter-organizational workflow management. Applied Artificial Intelligence, 11(6), 551-572.
Meyer, J.-J. C. and van der Hoek, W. (1995) Epistemic Logic for AI and Computer Science.
Cambridge University Press, Cambridge.
Meyer, J.-J. C. and Wieringa, R. J. (eds) (1993) Deontic Logic in Computer Science - Normative System Specification. John Wiley and Sons, Chichester.
Milner, R. (1989) Communication and Concurrency. Prentice-Hall, Englewood Cliffs, NJ.
Mitchell, M. (1996) An Introduction to Genetic Algorithms. MIT Press, Cambridge, MA.
Moore, R. C. (1990) A formal theory of knowledge and action. In Readings in Planning (eds J. F. Allen, J. Hendler and A. Tate), pp. 480-519. Morgan Kaufmann, San Mateo, CA.
Mor, Y. and Rosenschein, J. S. (1995) Time and the prisoner's dilemma. In Proceedings of the 1st International Conference on Multi-Agent Systems (ICMAS-95), San Francisco, CA, pp. 276-282.
Mora, M. et al. (1999) BDI models and systems: reducing the gap. In Intelligent Agents, V (eds J. P. Müller, M. P. Singh and A. S. Rao), LNAI Volume 1555. Springer, Berlin.
Morgan, C. (1994) Programming from Specifications (2nd edition). Prentice-Hall International, Hemel Hempstead, UK.
Mori, K., Torikoshi, H., Nakai, K. and Masuda, T. (1988) Computer control system for iron and steel plants. Hitachi Review, 37(4), 251-258.
Moss, S. and Davidsson, P. (eds) (2001) Multi-Agent-Based Simulation, LNAI Volume 1979.
Springer, Berlin.
Mullen, T. and Wellman, M. P. (1995) A simple computational market for network information services. In Proceedings of the 1st International Conference on Multi-Agent Systems (ICMAS-95), San Francisco, CA, pp. 283-289.
Mullen, T. and Wellman, M. P. (1996) Some issues in the design of market-oriented agents. In Intelligent Agents, II (eds M. Wooldridge, J. P. Müller and M. Tambe), LNAI Volume 1037, pp. 283-298. Springer, Berlin.
Müller, J. (1997) A cooperation model for autonomous agents. In Intelligent Agents, III (eds J. P. Müller, M. Wooldridge and N. R. Jennings), LNAI Volume 1193, pp. 245-260. Springer, Berlin.
Müller, J. P. (1999) The right agent (architecture) to do the right thing. In Intelligent Agents, V (eds J. P. Müller, M. P. Singh and A. S. Rao), LNAI Volume 1555. Springer, Berlin.
Müller, J. P., Pischel, M. and Thiel, M. (1995) Modelling reactive behaviour in vertically layered agent architectures. In Intelligent Agents: Theories, Architectures and Languages (eds M. Wooldridge and N. R. Jennings), LNAI Volume 890, pp. 261-276. Springer, Berlin.
Müller, J. P., Wooldridge, M. and Jennings, N. R. (eds) (1997) Intelligent Agents, III, LNAI Volume 1193. Springer, Berlin.
Muscettola, N. et al. (1998) Remote agents: to boldly go where no AI system has gone before. Artificial Intelligence, 103, 5-47.
NEC (2001) Citeseer: The NECI scientific literature digital library. See http://citeseer.nj.nec.com/.
Negroponte, N. (1995) Being Digital. Hodder and Stoughton, London.
Neumann, J. V. and Morgenstern, O. (1944) Theory of Games and Economic Behaviour. Princeton University Press, Princeton, NJ.
Newell, A. (1962) Some problems of the basic organisation in problem solving programs. In Proceedings of the 2nd Conference on Self-organizing Systems (eds M. C. Yovits, G. T. Jacobi and G. D. Goldstein), pp. 393-423. Spartan Books, Washington, DC.
Newell, A. (1982) The knowledge level. Artificial Intelligence, 18(1), 87-127.
Newell, A. (1990) Unified Theories of Cognition. Harvard University Press, Cambridge, MA.
Newell, A., Rosenbloom, P. J. and Laird, J. E. (1989) Symbolic architectures for cognition.
In Foundations of Cognitive Science (ed. M. I. Posner). MIT Press, Cambridge, MA.
NeXT Computer Inc. (1993) Object-Oriented Programming and the Objective C Language. Addison-Wesley, Reading, MA.
Nilsson, N. J. (1992) Towards agent programs with circuit semantics. Technical report STAN-CS-92-1412, Computer Science Department, Stanford University, Stanford, CA 94305.
Nodine, M. and Unruh, A. (1998) Facilitating open communication in agent systems: the
Infosleuth infrastructure. In Intelligent Agents, IV (eds M. P. Singh, A. Rao and M. J.
Wooldridge), LNAI Volume 1365, pp. 281-296. Springer, Berlin.
Noriega, P. and Sierra, C. (eds) (1999) Agent Mediated Electronic Commerce, LNAI Volume 1571. Springer, Berlin.
Norman, T. J. and Long, D. (1995) Goal creation in motivated agents. In Intelligent Agents:
Theories, Architectures and Languages (eds M. Wooldridge and N. R. Jennings), LNAI
Volume 890, pp. 277-290. Springer, Berlin.
Oaks, S. and Wong, H. (2000) Jini in a Nutshell. O'Reilly and Associates, Inc.
Odell, J., Parunak, H. V. D. and Bauer, B. (2001) Representing agent interaction protocols in UML. In Agent-Oriented Software Engineering - Proceedings of the First International Workshop AOSE-2000 (eds P. Ciancarini and M. Wooldridge), LNCS Volume 1957, pp. 121-140. Springer, Berlin.
OMG (2001) The Object Management Group. See http://www.omg.org/.
Omicini, A. (2001) SODA: societies and infrastructures in the analysis and design of
agent-based systems. In Agent-Oriented Software Engineering - Proceedings of the First
International Workshop AOSE-2000 (eds P. Ciancarini and M. Wooldridge), LNCS Vol-
ume 1957, pp. 185-194. Springer, Berlin.
Oshuga, A. et al. (1997) Plangent: an approach to making mobile agents intelligent. IEEE Internet Computing, 1(4), 50-57.
Ousterhout, J. K. (1994) Tcl and the Tk Toolkit. Addison-Wesley, Reading, MA.
Ovum (1994) Intelligent agents: the new revolution in software.
Papadimitriou, C. H. (1994) Computational Complexity. Addison-Wesley, Reading, MA.
Papazoglou, M. P., Laufman, S. C. and Sellis, T. K. (1992) An organizational framework for cooperating intelligent information systems. Journal of Intelligent and Cooperative Information Systems, 1(1), 169-202.
Parsons, S. and Jennings, N. R. (1996) Negotiation through argumentation - a preliminary report. In Proceedings of the 2nd International Conference on Multi-Agent Systems (ICMAS-96), Kyoto, Japan, pp. 267-274.
Parsons, S., Sierra, C. A. and Jennings, N. R. (1998) Agents that reason and negotiate by
arguing. Journal of Logic and Computation, 8(3), 261-292.
Parunak, H. V. D. (1999) Industrial and practical applications of DAI. In Multi-Agent Systems (ed. G. Weiß), pp. 377-421. MIT Press, Cambridge, MA.
Patil, R. S. et al. (1992) The DARPA knowledge sharing effort: progress report. In Proceedings of Knowledge Representation and Reasoning (KR&R-92) (eds C. Rich, W. Swartout and B. Nebel), pp. 777-788.
Perloff, M. (1991) STIT and the language of agency. Synthese, 86, 379-408.
Perrault, C. R. (1990) An application of default logic to speech acts theory. In Intentions in
Communication (eds P. R. Cohen, J. Morgan and M. E. Pollack), pp. 161-186. MIT Press,
Cambridge, MA.
Perriolat, F., Skarek, P., Varga, L. Z. and Jennings, N. R. (1996) Using Archon: particle accelerator control. IEEE Expert, 11(6), 80-86.
Pham, V. A. and Karmouch, A. (1998) Mobile software agents: an overview. IEEE Commu-
nications Magazine, pp. 26-37.
Pitt, J. and Mamdani, E. H. (1999) A protocol-based semantics for an agent communication language. In Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI-99), Stockholm, Sweden.
Pnueli, A. (1986) Specification and development of reactive systems. In Information Pro-
cessing 86, pp. 845-858. Elsevier, Amsterdam.
Pnueli, A. and Rosner, R. (1989) On the synthesis of a reactive module. In Proceedings of the 16th ACM Symposium on the Principles of Programming Languages (POPL), pp. 179-190.
Poggi, A. and Rimassa, G. (2001) Adding extensible synchronization capabilities to the
agent model of a FIPA-compliant agent platform. In Agent-Oriented Software Engineer-
ing - Proceedings of the First International Workshop AOSE-2000 (eds P. Ciancarini and
M. Wooldridge), LNCS Volume 1957, pp. 307-322. Springer, Berlin.
Pollack, M. E. (1990) Plans as complex mental attitudes. In Intentions in Communication
(eds P. R. Cohen, J. Morgan and M. E. Pollack), pp. 77-104. MIT Press, Cambridge, MA.
Pollack, M. E. (1992) The uses of plans. Artificial Intelligence, 57(1), 43-68.
Pollack, M. E. and Ringuette, M. (1990) Introducing the Tileworld: experimentally evaluating agent architectures. In Proceedings of the 8th National Conference on Artificial Intelligence (AAAI-90), Boston, MA, pp. 183-189.
Pollock, J. L. (1992) How to reason defeasibly. Artificial Intelligence, 57, 1-42.
Pollock, J. L. (1994) Justification and defeat. Artificial Intelligence, 67, 377-407.
Poundstone, W. (1992) Prisoner's Dilemma. Oxford University Press, Oxford.
Power, R. (1984) Mutual intention. Journal for the Theory of Social Behaviour, 14, 85-102.
Prakken, H. and Vreeswijk, G. (2001) Logics for defeasible argumentation. In Handbook
of Philosophical Logic (eds D. Gabbay and F. Guenther), 2nd edition. Kluwer Academic,
Boston, MA.
Rao, A. S. (1996a) AgentSpeak(L): BDI agents speak out in a logical computable language. In Agents Breaking Away: Proceedings of the 7th European Workshop on Modelling Autonomous Agents in a Multi-Agent World (eds W. Van de Velde and J. W. Perram), LNAI Volume 1038, pp. 42-55. Springer, Berlin.
Rao, A. S. (1996b) Decision procedures for propositional linear-time Belief-Desire-Intention logics. In Intelligent Agents, II (eds M. Wooldridge, J. P. Müller and M. Tambe), LNAI Volume 1037, pp. 33-48. Springer, Berlin.
Rao, A. S. and Georgeff, M. P. (1991a) Asymmetry thesis and side-effect problems in linear
time and branching time intention logics. In Proceedings of the 12th International Joint
Conference on Artificial Intelligence (IJCAI-91), Sydney, Australia, pp. 498-504.
Rao, A. S. and Georgeff, M. P. (1991b) Modeling rational agents within a BDI-architecture. In Proceedings of Knowledge Representation and Reasoning (KR&R-91) (eds R. Fikes and E. Sandewall), pp. 473-484. Morgan Kaufmann, San Mateo, CA.
Rao, A. S. and Georgeff, M. P. (1992) An abstract architecture for rational agents. In Proceedings of Knowledge Representation and Reasoning (KR&R-92) (eds C. Rich, W. Swartout and B. Nebel), pp. 439-449.
Rao, A. S. and Georgeff, M. P. (1993) A model-theoretic approach to the verification of
situated reasoning systems. In Proceedings of the 13th International Joint Conference
on Artificial Intelligence (IJCAI-93), Chambery, France, pp. 318-324.
Rao, A. S. and Georgeff, M. P. (1995) Formal models and decision procedures for multi-agent systems. Technical note 61, Australian AI Institute, Level 6, 171 La Trobe Street, Melbourne, Australia.