02-intelligent-agents
Roberto Sebastiani
DISI, Università di Trento, Italy – [email protected]
http://disi.unitn.it/rseba/DIDATTICA/fai_2022/
Copyright notice: Most examples and images displayed in the slides of this course are taken from [Russell & Norvig, “Artificial Intelligence, a Modern Approach”, 3rd ed., Pearson],
including explicit figures from the above-mentioned book, so that their copyright is held by the authors. Some other material (text, figures, examples) is authored by (in alphabetical
order): Pieter Abbeel, Bonnie J. Dorr, Anca Dragan, Dan Klein, Nikita Kitaev, Tom Lenaerts, Michela Milano, Dana Nau, Maria Simi, who hold its copyright.
These slides cannot be displayed in public without the permission of the author.
Outline
2 Rational Agents
3 Task Environments
4 Task-Environment Types
5 Agent Types
6 Environment States
Agents and Environments
Agents
An agent is any entity that can be viewed as:
perceiving its environment through sensors, and
acting upon that environment through actuators.
Agents and Environments [cont.]
Agents
Agents include humans, robots, softbots, thermostats, etc.
human:
perceives: with eyes, ears, nose, hands, ...
acts: with voice, hands, arms, legs, ...
robot:
perceives: with video-cameras, infra-red sensors, radar, ...
acts: with wheels, motors, ...
softbot:
perceives: receiving keystrokes, files, network packets, ...
acts: displaying on the screen, writing files, sending network packets
thermostat:
perceives: with heat sensor, ...
acts: electric impulses to valves, devices, ...
Key concepts
Remark
An agent can perceive its own actions, but not always their effects.
Key concepts [cont.]
Agent function
An agent’s behavior is described by the agent function f : P* → A, which maps any given percept sequence into an action.
ideally, can be seen as a table [percept sequence, action]
Agent program
Internally, the agent function for an artificial agent is implemented by an agent program.
Example
A very simple vacuum cleaner
Environment: squares A and B
Percepts: location ({A, B}) and content ({Dirty, Clean})
e.g. [A, Dirty]
Actions: {left, right, suck, no_op}
Example [cont.]
Note: this agent function depends only on the last percept, not on the whole percept sequence.
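A minimal sketch of this agent function in Python (the function name reflex_vacuum_agent is illustrative, not from the slides):

# A sketch of the simple vacuum-cleaner agent function.
# It looks only at the current percept [location, status].
def reflex_vacuum_agent(percept):
    location, status = percept
    if status == "Dirty":
        return "suck"
    elif location == "A":
        return "right"
    else:  # location == "B"
        return "left"

# Example: the percept [A, Dirty] yields the action "suck".
print(reflex_vacuum_agent(("A", "Dirty")))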
Rational Agents
Main question
What is a rational agent?
For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent has (Russell & Norvig).
Rational Agents: Example
The simple vacuum-cleaner agent
Beware: if a penalty for each move is given, the agent behaves poorly
=⇒ better agent: do nothing once it is sure all the squares are clean
Rationality vs. Omniscience vs. Perfection
Remark
Rationality ≠ Omniscience!
An omniscient agent knows for sure the outcome of its actions
=⇒ omniscience is impossible in reality
A rational agent may only know “up to a reasonable confidence”
(e.g., when crossing a road, what if something falling from a plane flattens you?
if so, would you be considered irrational?)
Rational behaviour is not perfect behaviour!
perfection maximizes actual performance
(given uncertainty) rationality maximizes expected performance
Information Gathering, Learning, Autonomy
Rationality requires information gathering: performing actions in order to modify future percepts (exploration)
A rational agent should learn from what it perceives, improving on its initial knowledge
A rational agent should be autonomous: it should rely on its own percepts rather than only on the prior knowledge of its designer
Task Environments
To design a rational agent, we must specify its task environment: the Performance measure, the Environment, and the agent’s Actuators and Sensors (PEAS).
Task Environments [cont.]
Remark
Some goals to be measured may conflict!
e.g. profits vs. safety, profits vs. comfort, ...
=⇒ tradeoffs are required
Task Environments: Examples
Properties of Task Environments
Properties of Task Environments [cont.]
Deterministic vs. stochastic
A task environment is deterministic iff its next state is completely determined by its current state and by the action of the agent. (Ex: a crossword puzzle.)
If not so:
A t.e. is stochastic if uncertainty about outcomes is quantified in terms of probabilities
(Ex: dice, poker game, component failure, ...)
A t.e. is nondeterministic iff actions are characterized by their possible outcomes, but no probabilities are attached to them
In a multi-agent environment we ignore uncertainty that arises from the actions of other agents
(Ex: chess is deterministic even though each agent is unable to predict the actions of the others).
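A hedged sketch of how the three notions differ when written as transition models (the function names and the tiny examples are illustrative, not from the slides):

# Deterministic: one action in one state yields exactly one successor state.
def deterministic_step(state, action):
    return {"A": {"right": "B"}, "B": {"left": "A"}}[state][action]

# Stochastic: possible outcomes are weighted by probabilities summing to 1.
def stochastic_step(state, action):
    # e.g. a "suck" action that fails 10% of the time
    return {"Clean": 0.9, "Dirty": 0.1}

# Nondeterministic: only the set of possible outcomes is known, no probabilities.
def nondeterministic_step(state, action):
    return {"Clean", "Dirty"}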
Properties of Task Environments [cont.]
Note
The simplest environment is fully observable, single-agent, deterministic, episodic, static and discrete.
Ex: simple vacuum cleaner
Most real-world situations are partially observable, multi-agent, stochastic, sequential, dynamic, and continuous.
Ex: taxi driving
Properties of Task Environments [cont.]
Example properties of Task Environments
Properties of the Agent’s State of Knowledge
Agents
Remark
the agent function takes the entire percept history as input
the agent program takes only the current percept as input
=⇒ if the actions need to depend on the entire percept sequence,
then the agent will have to remember the percepts
A trivial Agent Program
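The slide's figure is not reproduced here; a minimal sketch of such a trivial, table-driven agent program, assuming the lookup-table idea described above (names are illustrative):

percepts = []  # the agent must remember the entire percept sequence

def table_driven_agent(percept, table):
    # Append the new percept and look the whole sequence up in the table.
    percepts.append(percept)
    return table.get(tuple(percepts), "no_op")

# Example table for the vacuum world: one entry per percept sequence.
table = {
    (("A", "Dirty"),): "suck",
    (("A", "Clean"),): "right",
    (("A", "Clean"), ("B", "Dirty")): "suck",
}
print(table_driven_agent(("A", "Dirty"), table))  # -> "suck"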
Agent Types
Agent Types: Simple-reflex agent
Select action on the basis of the current percept only
Ex: the simple vacuum-agent
Implemented through condition-action rules
Ex: “if car-in-front-is-braking then initiate-braking”
can be implemented, e.g., in a Boolean circuit
Large reduction in possible percept/action situations due to ignoring the percept history
Agent Types: Simple-reflex agent [cont.]
very simple
may work only if the environment is fully observable
errors, deadlocks or infinite loops may occur otherwise
=⇒ limited applicability
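A minimal sketch of the condition-action-rule structure, assuming a trivial rule format (the rule set and helper names are illustrative):

def simple_reflex_agent(percept, rules):
    # Each rule is (condition, action); fire the first rule whose
    # condition matches the current percept. No internal state is kept.
    for condition, action in rules:
        if condition(percept):
            return action
    return "no_op"

# Example rule: "if car-in-front-is-braking then initiate-braking".
rules = [(lambda p: p.get("car_in_front_is_braking"), "initiate_braking")]
print(simple_reflex_agent({"car_in_front_is_braking": True}, rules))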
Agent Types: Model-based Reflex Agent
Idea: To tackle partially-observable environments, keep track of the part of the world the agent can’t see now
maintain internal state depending on the percept history
reflects at least some of the unobserved aspects of the current state
To update the internal state the agent needs a model of the world:
how the world evolves independently of the agent
Ex: an overtaking car will soon be closer behind than it was before
how the agent’s own actions affect the world
Ex: turn the steering wheel clockwise =⇒ the car turns to the right
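A sketch of this structure, assuming an update_state function that encodes the world model (all names are illustrative, not a fixed API):

internal_state = {}
last_action = "no_op"

def model_based_reflex_agent(percept, rules, update_state):
    global internal_state, last_action
    # The model (update_state) folds the last action and the new percept
    # into the internal state, estimating unobserved aspects of the world.
    internal_state = update_state(internal_state, last_action, percept)
    for condition, action in rules:
        if condition(internal_state):
            last_action = action
            return action
    last_action = "no_op"
    return "no_op"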
Agent Types: Model-based Goal-based agent
The agent needs goal information describing desirable situations
Ex: destination for a taxi driver
Idea: combine the goal with the model to choose actions
Difficult if long action sequences are required to reach the goal
=⇒ Typically investigated in search and planning research.
Major difference wrt. reflex agents: the future is taken into account
(reflex rules are simple condition-action pairs, which do not target a goal)
Goal-based Agents
more flexible:
the knowledge that supports its decisions is represented explicitly
such knowledge can be modified
=⇒ allows all of the relevant behaviors to be altered to suit the new conditions
Ex: If it rains, the agent can update its knowledge of how effectively its brakes operate
the goal can be modified/updated =⇒ modify its behaviour
no need to rewrite all rules from scratch
more complicated to implement
may require expensive computation (search, planning)
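A hedged sketch of goal-based action selection as forward search over the model (plain breadth-first search here; the slides do not commit to a specific algorithm, and all names are illustrative):

from collections import deque

def goal_based_agent(state, goal, model, actions):
    # Search the model for an action sequence whose predicted final
    # state satisfies the goal. Assumes states are hashable.
    frontier = deque([(state, [])])
    visited = {state}
    while frontier:
        s, plan = frontier.popleft()
        if goal(s):
            return plan
        for a in actions:
            s2 = model(s, a)  # predicted successor state
            if s2 not in visited:
                visited.add(s2)
                frontier.append((s2, plan + [a]))
    return []  # no plan found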
Agent Types: Utility-based agent
Goals alone are often not enough to generate high-quality behaviors
Certain goals can be reached in different ways, of different quality
Ex: some routes are quicker, safer, or cheaper than others
Idea: Add utility function(s) to drive the choice of actions
maps a (sequence of) state(s) onto a real number
=⇒ actions are chosen which maximize the utility function
under uncertainty, maximize the expected utility
=⇒ utility function = internalization of the performance measure
Utility-based Agents
advantages wrt. goal-based:
with conflicting goals, utility specifies an appropriate tradeoff
with several goals none of which can be achieved with certainty,
utility selects a proper tradeoff between the importance of the goals and the likelihood of success
still complicated to implement
require sophisticated perception, reasoning, and learning
may require expensive computation
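A sketch of expected-utility maximization under uncertainty, assuming a stochastic transition model like the one sketched earlier (names are illustrative):

def expected_utility(state, action, transition, utility):
    # transition(state, action) -> {successor_state: probability}
    return sum(p * utility(s2) for s2, p in transition(state, action).items())

def utility_based_agent(state, actions, transition, utility):
    # Choose the action with maximal expected utility.
    return max(actions, key=lambda a: expected_utility(state, a, transition, utility))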
Agent Types: Learning
Problem
Previous agent programs describe methods for selecting actions
How are these agent programs programmed?
Programming by hand is inefficient and ineffective!
Solution: build learning machines and then teach them (rather than instruct them)
Advantage: robustness of the agent program toward initially-unknown environments
Learning Agent Types: Example
Taxi Driving
After the taxi makes a quick left turn across three lanes, the critic observes the shocking language used by other drivers.
From this experience, the learning element formulates a rule saying this was a bad action.
The performance element is modified by adding the new rule.
The problem generator might identify certain areas of behavior in need of improvement, and suggest trying out the brakes on different road surfaces under different conditions.
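A structural sketch of the four components this example refers to (performance element, critic, learning element, problem generator); the class layout and rule format are illustrative assumptions:

class LearningAgent:
    def __init__(self, rules):
        self.rules = rules  # performance element: selects actions

    def act(self, percept):
        for condition, action in self.rules:
            if condition(percept):
                return action
        return "no_op"

    def learn(self, percept, action, feedback):
        # Critic judges the action against the performance standard;
        # the learning element then modifies the performance element.
        if feedback == "bad":
            # Suppress the action if the same situation recurs.
            self.rules.insert(0, (lambda p, bad=percept: p == bad, "no_op"))

    def explore(self):
        # Problem generator: suggest informative exploratory actions.
        return "try_brakes_on_wet_road"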
Representations
Representations of states and transitions
Three ways to represent states and transitions between them:
atomic: a state is a black box with no internal structure
factored: a state consists of a vector of attribute values
structured: a state includes objects, each of which may have attributes of its own as well as relationships to other objects
increasing expressive power and computational complexity
reality represented at different levels of abstraction
Atomic Representations
each state of the world is indivisible
no internal structure
state: one among a collection of discrete state values
Ex: find driving routes: {Trento, Rovereto, Verona, ...}
=⇒ only property: being identical to or different from another state
very high level of abstraction
=⇒ lots of details ignored
The algorithms underlying
search and game-playing
hidden Markov models
Markov decision processes
all work with atomic representations (or treat states as such)
Representations [cont.]
Factored Representation
Each state is represented in terms of a vector of attribute values
Ex: ⟨zone, {dirty, clean}⟩, ⟨town, speed⟩
State: combination of attribute values
Ex: ⟨A, dirty⟩, ⟨Trento, 40 km/h⟩
Distinct states may share the values of some attributes
Ex: ⟨Trento, 40 km/h⟩ and ⟨Trento, 47 km/h⟩
two states are identical iff all attributes have the same values
=⇒ distinct states must differ in at least one attribute value
Can represent uncertainty (e.g., ignorance about the amount of gas in the tank represented by leaving that attribute blank)
Lower level of abstraction =⇒ fewer details ignored
Many areas of AI are based on factored representations
constraint satisfaction and propositional logic
planning
Bayesian networks
(most of) machine learning
Representations [cont.]
Structured Representation
States are represented in terms of objects and relations over them
Ex: ∀x.(Man(x) → Mortal(x)),
Woman(Maria), Mother ≡ Woman ⊓ ∃hasChild.Person
Lowest level of abstraction =⇒ can represent reality in detail
Many areas of AI are based on structured representations
relational databases
first-order logic
first-order probability models
knowledge-based learning
natural language understanding
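A small illustration of the three levels for the driving example above (the class layout, field names, and the child "Luca" are illustrative, not from the slides):

# Atomic: a state is just an opaque label with no internal structure.
atomic_state = "Trento"

# Factored: a state is a vector of attribute values; a blank (None)
# attribute can represent uncertainty, e.g. an unknown gas level.
factored_state = {"town": "Trento", "speed_kmh": 40, "gas": None}

# Structured: objects with attributes and relations to other objects.
class Person:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)  # relation: hasChild

maria = Person("Maria", children=[Person("Luca")])
is_mother = len(maria.children) > 0  # roughly: Mother ≡ Woman ⊓ ∃hasChild.Person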