Learning For DAI Systems
1. Introduction
Over the last four decades, machine learning's primary interest has been single
agent learning. In general, single agent learning involves improving the performance or
increasing the knowledge of a single agent [5]. An improvement in performance or an
increase in knowledge allows the agent to solve past problems with better quality or
efficiency. An increase in knowledge may also allow the agent to solve new problems.
An increase in performance is not necessarily due to an increase in knowledge. It may be
brought about simply by rearranging the existing knowledge or utilizing it in a different
manner. In addition, new knowledge may not be employed immediately but may be
accumulated for future use.
Single agent learning systems may be classified according to their underlying
learning strategies. These strategies are ordered according to the amount of inferencing or
the degree of knowledge transformation required by the learning system. This order also
reflects the increasing amount of effort required by the learning system and the
decreasing effort required by the teacher. These strategies are separated into the
following six categories [10]:
1. Rote Learning - This strategy does not require the learning system to transform or
infer knowledge. It includes learning by imitation, simple memorization and
learning by being programmed. In this context, a system may simply memorize
previous solutions and recall them when confronted with the same problem (a
minimal code sketch of this strategy follows the list).
2. Learning from Instruction - This strategy, also called learning by being told,
requires the learning system to select and transform knowledge into a usable form
and then integrate it into the existing knowledge of the system. It includes learning
from teachers and learning by using books, publications and other types of
instruction.
3. Learning by Deduction - Using this strategy, the learning system derives new
facts from existing information or knowledge by employing deductive inference.
These truth-preserving inferences include transforming knowledge into more
effective forms and determining important new facts or consequences.
Explanation-based Learning is an example of deductive learning.
4. Learning by Analogy - This strategy requires the learning system to transform and
supplement its existing knowledge from one domain or problem area into new
domains or problem areas. It requires more inferencing by the learning system
than the previous strategies: relevant knowledge must first be found in the system's
existing knowledge by using induction strategies, and this knowledge must then be
transformed or mapped to the new problem using deductive inference strategies.
5. Learning from Examples - Using this strategy, the learning system induces a
general concept or class description from a set of examples and counterexamples
supplied by a teacher or the environment. It requires more inferencing than the
previous strategies because no description is supplied that can simply be
transformed into the target concept.
6. Learning from Observation and Discovery - Using this strategy, the learning
system must either induce class descriptions from observing the environment or
manipulate the environment to acquire class descriptions or concepts. This
unsupervised form of learning requires the greatest amount of inferencing among
all of the different forms of learning.
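To make the first strategy concrete, rote learning can be realized as simple memoization: the system stores each problem-solution pair verbatim and recalls the stored answer when the identical problem recurs. The following Python sketch is purely illustrative; the class and the stand-in solver are hypothetical and do not come from any of the cited systems.

```python
# A minimal sketch of rote learning as memoization: solutions are stored
# verbatim and recalled when the identical problem is seen again.
# The solver below is a stand-in for any expensive problem-solving routine.

class RoteLearner:
    def __init__(self, solver):
        self.solver = solver      # underlying problem solver
        self.memory = {}          # problem -> previously computed solution

    def solve(self, problem):
        if problem in self.memory:        # recall: no inference required
            return self.memory[problem]
        solution = self.solver(problem)   # solve from scratch
        self.memory[problem] = solution   # memorize for future use
        return solution

# Example: memorizing the results of an arbitrarily chosen costly function.
learner = RoteLearner(solver=lambda n: sum(i * i for i in range(n)))
print(learner.solve(10_000))  # computed
print(learner.solve(10_000))  # recalled from memory
```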
In a DAI system, the agents may only need to learn to work together and not necessarily
improve their individual performance. In addition, not all of the agents must be able to
learn or adapt to allow the group to improve.
This paper will examine the learning potential for different dimensions of
distributed artificial intelligence systems. This exploratory study will be concerned with
adapting and learning at the knowledge and organizational levels. Several existing
systems will be examined and classified according to the dimensions for learning. This
paper will not examine general dimensions for DAI, but only those dimensions that can
be used for examining learning in a DAI system.
1. Control Learning - Learning and adapting to work with other agents involves
adjusting the control of each agent's problem solving plan or agenda. Different tasks
may have to be solved in a specific sequence. If the tasks are assigned to separate
agents, the agents must work together to solve the tasks. Learning which agents are
typically assigned different types of tasks allows each agent to select other
agents to work with on different tasks. Teams can be formed based on the type of
task to be solved. Some of the issues involved are the type, immediacy and
importance of the task, as well as each agent's task solving ability, capability,
reliability and past task assignments. Each team member's plan would be adjusted
according to the other agents' plans (a minimal sketch of such team selection
follows).
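As a minimal illustration of control learning, the following hypothetical Python sketch records which agents have handled each task type and ranks candidate teammates by their observed success rate. The names and the scoring rule are assumptions, not taken from any particular DAI system.

```python
# An illustrative sketch of control learning: each agent records which
# other agents have successfully handled each task type, then uses those
# records to pick teammates. Names and scoring are assumptions.
from collections import defaultdict

class ControlLearner:
    def __init__(self):
        # task_type -> agent -> [successes, attempts]
        self.history = defaultdict(lambda: defaultdict(lambda: [0, 0]))

    def record(self, task_type, agent, succeeded):
        stats = self.history[task_type][agent]
        stats[0] += 1 if succeeded else 0
        stats[1] += 1

    def pick_team(self, task_type, size):
        # Rank known agents by observed success rate on this task type.
        def rate(agent):
            s, n = self.history[task_type][agent]
            return s / n if n else 0.0
        known = sorted(self.history[task_type], key=rate, reverse=True)
        return known[:size]

learner = ControlLearner()
learner.record("diagnosis", "agent-a", succeeded=True)
learner.record("diagnosis", "agent-b", succeeded=False)
print(learner.pick_team("diagnosis", size=1))  # ['agent-a']
```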
3. Dimensions for Multiple Agent Learning
Learning for DAI systems can be divided along two issues: learning about
individual agents and learning about groups of agents. The information learned about the
individual agents can be used in learning about groups. To learn about individual agents,
some method of measuring a single agent's performance must be possible. Likewise, to
learn about groups of agents, some method of measuring a group's performance must be
possible.
3.1 Learning about Individual Agents
Learning and modeling the knowledge each agent possesses, which includes the
current knowledge base and the current data the agent senses from its surroundings,
allows agents to directly query the correct agent instead of broadcasting a query to all
agents.
Agents may be able to solve none, one, many or all types of tasks. By learning
which tasks each agent can solve, task allocation becomes simpler.
Agents also have different problem solving capabilities. By learning how many
tasks each agent can accomplish, over- or underloading of agents can be avoided.
Depending on the tasks assigned and the agent's ability, different non-local
information will be needed by different agents. Learning what non-local information each
agent will need for each type of task allows an estimate of the communication and
performance cost for assigning a task to an agent. Information can include partial
solutions. By recognizing which partial solutions are necessary to solve a task, the
control of different agents can be adjusted to assist in efficient planning of tasks.
Agents will also differ in the quality of their solutions and their reliability.
Important tasks can be assigned to better agents.
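The dimensions above can be gathered into a single per-agent model. The record below is one hypothetical way to structure such a model; its field names mirror the dimensions discussed in this section, but the structure itself is an assumption.

```python
# A hypothetical record collecting the per-agent dimensions discussed
# above. Field names mirror the text; the structure is an assumption.
from dataclasses import dataclass, field

@dataclass
class AgentModel:
    knowledge: set = field(default_factory=set)       # topics in its knowledge base
    sensed_data: set = field(default_factory=set)     # data it senses from its surroundings
    solvable_tasks: set = field(default_factory=set)  # task types it can solve (ability)
    capacity: int = 0                                 # tasks it can carry at once (capability)
    info_needs: dict = field(default_factory=dict)    # task type -> non-local information required
    solution_quality: float = 0.0                     # observed quality of its solutions
    reliability: float = 0.0                          # fraction of accepted tasks completed
```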
[Figure: Dimensions for learning about individual agents: Agent Knowledge (Knowledge
Base, Data); Agent Task Solving (Ability, Capability, Control); Agent Non-Local
Information; Agent Quality (Solution Quality, Reliability).]
3.2 Learning about Groups of Agents
4. Examples
4.2 The Learning Contract Net
The next example is an extension of Smith and Davis' Contract Net [3,4]. The
Contract Net's motivation was to allow opportunistic, adaptive task allocation among a
group of agents. Each agent can assume two different roles: manager and contractor. A
manager monitors task execution and processes task results. A contractor executes tasks.
An agent decomposes a problem into tasks. The agent then assumes the role of
manager and announces each task by broadcasting a task announcement to the entire
group of agents. The other agents then evaluate these announcements in relation to
their own capabilities and submit bids on the tasks each is able to solve. The manager
then selects one or several agents and informs the successful bidders through an award
message. The selected agents then assume the role of contractor and execute the task. A
task awarded to a contractor may itself be decomposed into several subtasks, in which
case the contractor becomes the manager of these new tasks.
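The announce-bid-award cycle can be sketched in a few lines of Python. The sketch below is a hypothetical rendering of the protocol as described above, not Smith and Davis' actual implementation; the constant bid value is a placeholder for a real capability estimate.

```python
# A minimal, hypothetical sketch of the Contract Net announce-bid-award
# cycle described above; this is not Smith and Davis' implementation.

class Agent:
    def __init__(self, name, solvable_tasks):
        self.name = name
        self.solvable_tasks = solvable_tasks

    def bid(self, task_type):
        # An agent bids only on tasks it is able to solve; the constant
        # bid value is a placeholder for a real capability estimate.
        return 1.0 if task_type in self.solvable_tasks else None

def announce_and_award(manager, task_type, agents):
    """The manager broadcasts a task announcement, collects bids, and
    awards the task to the best bidder, who becomes the contractor."""
    bids = {a: a.bid(task_type) for a in agents if a is not manager}
    bidders = {a: b for a, b in bids.items() if b is not None}
    if not bidders:
        return None
    return max(bidders, key=bidders.get)  # select the highest bidder

agents = [Agent("a1", {"sense"}), Agent("a2", {"plan"}), Agent("a3", {"plan"})]
winner = announce_and_award(agents[0], "plan", agents)
print(winner.name)  # one of the agents able to solve "plan"
```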
Initially, no information is known about any of the agents in the Contract Net.
However, as problems are solved, information about agents can be learned. This proposal
is called the Learning Contract Net.
After a problem is decomposed into tasks and the tasks are announced, agents
submit bids on the tasks each is able to solve. These bids allow the system to learn each
agent's task solving ability. Each time a task is announced, each agent's bid is
recorded for the particular task type. In subsequent task announcements, the
announcement is not broadcast to all agents but only to the small group that has
responded to similar announcements in earlier problems. When new agents become
active, they receive every type of announcement until their abilities are learned.
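One plausible realization of this directed announcement scheme is to keep, for each task type, the set of agents that have bid on it before and to announce only to those agents plus any newcomers. The following sketch is an assumption about mechanism and is not part of the original Contract Net.

```python
# A hypothetical sketch of learning task-solving ability from bids: the
# manager records who bids on each task type and narrows later
# announcements to past bidders, while new agents still see everything.
from collections import defaultdict

class AnnouncementRouter:
    def __init__(self):
        self.bidders = defaultdict(set)   # task type -> agents that have bid
        self.seen = set()                 # agents whose abilities are being learned

    def recipients(self, task_type, all_agents):
        # Known bidders for this task type, plus every agent not yet seen.
        # A fuller version would also mark agents as seen after several
        # announcements even if they never bid.
        newcomers = {a for a in all_agents if a not in self.seen}
        return set(self.bidders[task_type]) | newcomers

    def record_bid(self, task_type, agent):
        self.bidders[task_type].add(agent)
        self.seen.add(agent)

router = AnnouncementRouter()
router.record_bid("plan", "agent-a")
print(router.recipients("plan", ["agent-a", "agent-b"]))  # agent-b is still new
```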
After tasks have been awarded, each agent's capability can be determined by
comparing its performance and task bidding behavior to its task load. Some agents may
not bid for new tasks after being awarded one; others may continue to bid for new tasks.
In addition, the performance of an agent may dramatically decrease when it becomes
overloaded. By recording an agent's performance compared to the type of task assigned,
the ability of the agent can be measured. In addition, by recording an agent's performance
compared to the number of tasks assigned, the agent's capability can also be measured.
This is similar to Shaw and Whinston's task awarding payoff system [11]. From learning
each agent's task solving ability and capability, the system might eventually switch from
task announcement and bidding to task assignment based on agent ability and load.
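The eventual switch from bidding to direct assignment might look like the following hypothetical sketch: once enough performance observations have accumulated for a task type, the task is assigned to the best-performing, least-loaded agent; otherwise the system falls back to announcement and bidding. The threshold and scoring rule are assumptions.

```python
# A hypothetical sketch of the switch from bidding to direct assignment:
# once enough performance observations exist for a task type, assign the
# task to the best-performing, least-loaded agent instead of announcing it.
from collections import defaultdict

class Assigner:
    def __init__(self, min_observations=20):
        self.perf = defaultdict(list)    # (agent, task_type) -> performance scores
        self.load = defaultdict(int)     # agent -> currently assigned tasks
        self.min_observations = min_observations

    def record(self, agent, task_type, score):
        self.perf[(agent, task_type)].append(score)

    def assign(self, task_type, agents):
        observed = [(a, self.perf[(a, task_type)]) for a in agents]
        observed = [(a, s) for a, s in observed if len(s) >= self.min_observations]
        if not observed:
            return None   # not enough evidence yet: fall back to announcement/bidding
        # Prefer high average performance; break ties toward the lighter load.
        best, _ = max(observed,
                      key=lambda p: (sum(p[1]) / len(p[1]), -self.load[p[0]]))
        self.load[best] += 1
        return best
```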
Initially, as tasks are being solved, each agent will have to broadcast queries for
information to all agents. By noting which agents respond to certain queries, agents can
slowly learn which agents to query for different types of information.
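The same record-keeping idea applies to information queries, as in the following minimal sketch; the routing table and its interface are assumptions.

```python
# A minimal, assumed sketch of learning where to direct queries: record
# which agents answer each kind of query, and route later queries to them.
from collections import defaultdict

class QueryRouter:
    def __init__(self):
        self.responders = defaultdict(set)   # query type -> agents that responded

    def targets(self, query_type, all_agents):
        known = self.responders[query_type]
        return known if known else set(all_agents)   # broadcast until learned

    def record_response(self, query_type, agent):
        self.responders[query_type].add(agent)
```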
By learning each agent's knowledge and data along with task ability and
capability, a group organization could be formed to solve specific problems. Initially,
the group of agents would be organized as a decentralized market as described by Malone
[8]. Once the functional ability of each agent has been learned, different types of
organizations could be attempted and the performance of each measured. In this sense,
the system would start with a completely decentralized organization, with connections
between every agent, and evolve into either a centralized market, a product hierarchy or
a functional hierarchy. The agents would start as autonomous agents and eventually
move to master/slave relationships.
4.3 Shaw and Whinston
Shaw and Whinston also propose an extension to the Contract Net [11]. In their
system, when an agent is awarded a task, it receives a payoff proportional to its bid. The
payoff increases the successful agent's strength. Likewise, the managing agent that
awarded the task decreases its strength by the amount of the payoff. Strength records a
measure of an agent's past performance. In addition, an agent's strength affects its ability
to bid for tasks in the future. An agent's strength, as well as its task specialization and
readiness, are used in determining its bid for a specific task. As a result, stronger agents,
those having successfully completed more tasks, are increasingly favored in the bidding
process.
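A hedged sketch of this payoff mechanism follows, with the bid computed as a simple product of strength, specialization and readiness; the exact combination used in [11] may differ.

```python
# A hypothetical sketch of Shaw and Whinston's payoff scheme: the winning
# contractor's strength rises by the payoff and the awarding manager's
# falls by the same amount; bids scale with strength, specialization and
# readiness. The exact bid formula in [11] may differ from this product.

class BiddingAgent:
    def __init__(self, name, strength, specialization, readiness):
        self.name = name
        self.strength = strength
        self.specialization = specialization   # match between agent and task, 0..1
        self.readiness = readiness             # availability to take the task, 0..1

    def bid(self):
        return self.strength * self.specialization * self.readiness

def award(manager, contractors):
    winner = max(contractors, key=BiddingAgent.bid)
    payoff = winner.bid()              # payoff proportional to the winning bid
    winner.strength += payoff          # successful contractor grows stronger
    manager.strength -= payoff         # awarding manager pays the cost
    return winner

m = BiddingAgent("manager", 10.0, 0.0, 0.0)
c1 = BiddingAgent("c1", 5.0, 0.8, 1.0)
c2 = BiddingAgent("c2", 4.0, 0.9, 0.5)
print(award(m, [c1, c2]).name)  # "c1": the higher strength-weighted bid
```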
Their system also uses genetic operators to mutate existing agents or produce new
agents from two existing agents. Weaker agents are mutated or transformed to
incorporate useful characteristics and capabilities from other, stronger agents, whereas
new agents are produced by combining characteristics and capabilities from two existing
agents.
Each agent's characteristics and capabilities are represented in the form of
chromosomes. The chromosomes are implemented using a string of 0's, meaning the
agent is not capable of performing the operation, and 1's, meaning the agent is capable of
performing the operation. The specific operation is denoted by its position in the
chromosome. While there appears to be no increase in knowledge, an improvement in
performance is brought about by rearranging the existing knowledge and utilizing it in a
different manner.
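The bit-string representation and the two genetic operators can be sketched directly. The mutation and crossover below are generic genetic operators applied to the capability encoding described above, not necessarily the operators used in [11].

```python
# An illustrative sketch of the chromosome encoding described above: each
# bit position denotes one operation; 1 means the agent can perform it.
import random

def mutate(chromosome, rate=0.1):
    """Flip each capability bit with the given probability."""
    return [bit ^ 1 if random.random() < rate else bit for bit in chromosome]

def crossover(parent_a, parent_b):
    """Produce a new agent's chromosome by splicing two parents."""
    point = random.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

strong = [1, 1, 0, 1, 0, 1]   # capable of operations 0, 1, 3 and 5
weak   = [0, 0, 1, 0, 0, 0]   # capable of operation 2 only
print(crossover(strong, weak))   # a child mixing both capability sets
print(mutate(weak))              # the weak agent transformed
```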
4.4 Knos
Knos can also exist across several different contexts associated with separate
workstations. These are termed complex Knos and can be used for coordinating tasks
among different workstations.
Finally, a Kno can learn from other Knos by receiving operations from them and
incorporating the new operations into its existing ones. A Kno can also forget or delete
an existing operation. In conjunction with this learning ability, a Kno can manipulate
and monitor another Kno's behavior. This allows a Kno to add a new operation to
another Kno, cause that Kno to execute the new operation and then monitor the result.
In essence, the controlling Kno can teach and observe the other Kno. It can also create
an instance of itself (self-replication) and observe the effects of adding new rules on this
child Kno before deciding whether to incorporate the new rule into itself.
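A Kno's learning interface, as described above, might be sketched as follows. The class is a hypothetical reconstruction from the description in [13], not the actual KNOs implementation; the operation names are illustrative.

```python
# A hypothetical sketch of a Kno's learning interface, built from the
# description above; this is not the actual KNOs implementation of [13].
import copy

class Kno:
    def __init__(self, operations=None):
        self.operations = dict(operations or {})   # name -> callable

    def learn(self, name, operation):
        self.operations[name] = operation          # incorporate a received operation

    def forget(self, name):
        self.operations.pop(name, None)            # delete an existing operation

    def teach(self, other, name):
        other.learn(name, self.operations[name])   # pass an operation to another Kno
        return other.execute(name)                 # then observe its behavior

    def execute(self, name):
        return self.operations[name]()

    def replicate(self):
        return copy.deepcopy(self)                 # child Kno for safe experimentation

parent = Kno({"greet": lambda: "hello"})
child = parent.replicate()                 # try a new rule on the child first
child.learn("count", lambda: 123)
if child.execute("count") == 123:          # observed effect looks safe
    parent.learn("count", lambda: 123)     # incorporate it into the parent itself
```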
5. Conclusion
6. Bibliography
1. Carbonell, Jaime G. and Langley, Pat. Machine Learning Tutorial from the Seventh
National Conference on Artificial Intelligence, 1988.
2. Carbonell, Jaime G., Michalski, Ryszard S. and Mitchell, Tom M., eds. Machine
Learning: Volume 1. Palo Alto, CA: Tioga Pub. Co., 1983.
6. Dietterich, Thomas G. "Learning at the Knowledge Level" in Machine Learning
1:287-315. Boston: Kluwer Academic Pub., 1986.
8. Huhns, Michael N., ed. Distributed Artificial Intelligence. Los Altos, CA: Morgan
Kaufmann Pub., Inc., 1987.
11. Mukhopadhyay, Uttam, Stephens, Larry M., Huhns, Michael N. and Bonnell,
Ronald D. "An Intelligent System for Document Retrieval in Distributed Office
Environments" in Readings in Distributed Artificial Intelligence . Bond, Alan H.
and Gasser, Les, eds. San Matie, CA: Morgan Kaufmann Pub., Inc., 1988.
13. Tsichritzis, D., Fiume, E., Gibbs, S. and Nierstrasz O. "KNOs: KNowledge
Acquisition, Dissemination, and Manipulation Objects" in ACM Transactions on
Office Information Systems, Vol. 5 No. 1, January 1987.