02_agents
Artificial Intelligence
Intelligent Agents
AIMA Chapter 2
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Image: "Robot at the British Library Science Fiction Exhibition" by BadgerGravling
Outline
• What is an intelligent agent?
• Rationality
• PEAS (Performance measure, Environment, Actuators, Sensors)
• Environment types
• Agent types
What is an Agent?
• An agent is anything that can be viewed as perceiving its environment
through sensors and acting upon that environment through actuators.
Percept: $p$
Action: $a = f(p)$
Agent function: $f : P^* \to A$
• Sensors
• Memory
• Computational power
Example:
Vacuum-cleaner World
• Percepts:
Location and status,
e.g., [A, Dirty]
• Actions:
Left, Right, Suck, NoOp
Most recent percept: $p$
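The agent function maps the whole percept sequence seen so far to an action. A minimal Python sketch of this idea for the two-square vacuum world follows. The lookup table, its entries, and the names `TABLE` and `make_table_driven_agent` are illustrative assumptions, not part of the slides.

```python
# Table-driven agent: the action is looked up from the full percept sequence.
# The table is a hand-written assumption and covers only a few short sequences.
from typing import Callable, Dict, List, Tuple

Percept = Tuple[str, str]  # (location, status), e.g. ("A", "Dirty")

TABLE: Dict[Tuple[Percept, ...], str] = {
    (("A", "Clean"),): "Right",
    (("A", "Dirty"),): "Suck",
    (("B", "Clean"),): "Left",
    (("B", "Dirty"),): "Suck",
    (("A", "Dirty"), ("A", "Clean")): "Right",
    (("A", "Clean"), ("B", "Dirty")): "Suck",
}


def make_table_driven_agent(table: Dict[Tuple[Percept, ...], str]) -> Callable[[Percept], str]:
    percepts: List[Percept] = []  # percept sequence observed so far

    def agent(percept: Percept) -> str:
        percepts.append(percept)
        # Sequences missing from the table fall back to NoOp.
        return table.get(tuple(percepts), "NoOp")

    return agent


agent = make_table_driven_agent(TABLE)
actions = [agent(("A", "Dirty")), agent(("A", "Clean")), agent(("B", "Dirty"))]
print(actions)
```

The table needs one entry per possible percept sequence, so it grows exponentially with the length of the sequence. This is why practical agent programs compute the action instead of storing it.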
Rational Agents: What is Good Behavior?
Foundation
• Consequentialism: Evaluate behavior by its consequences.
• Utilitarianism: Maximize happiness and well-being.
This means:
• Rationality is only an ideal
• Rationality ≠ Omniscience (rational agents can make mistakes if percepts and
knowledge do not suffice to make a good decision)
• Rationality ≠ Perfection (rational agents maximize expected outcomes, not actual
outcomes)
• It is rational to explore and learn (i.e., use percepts to supplement prior knowledge
and become autonomous)
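The difference between expected and actual outcomes can be sketched in a few lines of Python. The actions, probabilities, and performance values below are made up for illustration; only the principle (pick the action with the highest expected performance) comes from the slide.

```python
# Each action maps to a list of (probability, performance) outcomes.
# All numbers are hypothetical.
OUTCOMES = {
    "safe_route": [(1.0, 8.0)],
    "shortcut": [(0.9, 10.0), (0.1, -5.0)],
}


def expected_performance(outcomes):
    """Probability-weighted average of the performance values."""
    return sum(p * value for p, value in outcomes)


def rational_choice(options):
    """Return the action with the highest expected performance."""
    return max(options, key=lambda a: expected_performance(options[a]))


best = rational_choice(OUTCOMES)
print(best, expected_performance(OUTCOMES[best]))
```

Under these assumed numbers the shortcut has the higher expected performance, so choosing it is rational, even though in one out of ten cases the actual outcome is worse than the safe route.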
Example:
Vacuum-cleaner World
• Percepts:
Location and status,
e.g., [A, Dirty]
• Actions:
Left, Right, Suck, NoOp
Agent function: Implemented agent program:
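A minimal sketch of an agent program for this world, written as condition-action rules on the current percept. It assumes the two squares are named A and B, as in the percept example above; the function name is hypothetical.

```python
def reflex_vacuum_agent(percept):
    """Choose an action from the current percept (location, status) only."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    if location == "A":
        return "Right"
    if location == "B":
        return "Left"
    return "NoOp"  # unknown location: do nothing


print(reflex_vacuum_agent(("A", "Dirty")))
print(reflex_vacuum_agent(("A", "Clean")))
```

The program is a compact implementation of the agent function: it returns the same action a full lookup table would, without storing the table.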
Problem Specification: PEAS
Performance measure | Environment | Actuators | Sensors
• Performance measure: safe, fast, legal, comfortable trip, maximize profits
• Environment: roads, other traffic, pedestrians, customers
• Actuators: steering wheel, accelerator, brake, signal, horn
• Sensors: cameras, sonar, speedometer, GPS, odometer, engine sensors, keyboard
Example: Spam Filter
• Performance measure: accuracy (minimizing false positives and false negatives)
• Environment: a user's email account, email server
• Actuators: mark as spam, delete, etc.
• Sensors: incoming messages, other information about the user's account
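The spam filter's performance measure can be made concrete with a small Python sketch that counts false positives and false negatives. The label lists are invented for illustration (True means spam), and the function names are hypothetical.

```python
def error_counts(predicted, actual):
    """Return (false positives, false negatives) for boolean spam labels."""
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    false_neg = sum(1 for p, a in zip(predicted, actual) if not p and a)
    return false_pos, false_neg


def accuracy(predicted, actual):
    """Fraction of messages that were classified correctly."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Hypothetical data: what the filter said vs. what the messages really were.
predicted = [True, False, True, False, False]
actual = [True, False, False, True, False]

fp, fn = error_counts(predicted, actual)
print(fp, fn, accuracy(predicted, actual))
```

In practice the two error types are often weighted differently, since a legitimate message marked as spam (false positive) is usually more costly than a spam message that gets through.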
Environment Types
• Fully observable: The agent's sensors give it access to the complete state of the environment. The agent can "see" the whole environment.
vs. Partially observable: The agent cannot see all aspects of the state. E.g., it can't see through walls.
• Discrete: The environment provides a fixed number of distinct percepts, actions, and environment states. Time can also evolve in a discrete or continuous fashion.
vs. Continuous: Percepts, actions, state variables, or time are continuous, leading to an infinite state, percept, or action space.
• Single agent: An agent operating by itself in an environment.
vs. Multi-agent: Agents cooperate or compete in the same environment.
Examples of Different Environments
[Table: example environments compared by property. Surviving values: deterministic / strategic / stochastic / stochastic + strategic; episodic / sequential.]
* Can be modeled as a single-agent problem with the other agent(s) in the environment.
Designing a Rational Agent
Remember the definition of a rational agent:
"For each possible percept sequence, a rational agent should select an action that maximizes its expected performance measure, given the evidence provided by the percept sequence and the agent's built-in knowledge."
Agent function $f$:
• Assess performance measure
• Remember percept sequence
• Built-in knowledge
Input: percept to the agent function. Output: action from the agent function.
Note: Everything outside the agent function can be seen as the environment.
Hierarchy of Agent Types
Utility-based agents
Goal-based agents
Simple Reflex Agent
$a = f(p)$
The interaction is a sequence: $p_0, a_0, p_1, a_1, p_2, a_2, \ldots, p_t, a_t, \ldots$
Example: A simple vacuum cleaner that uses rules based on its current sensor input.
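The percept-action sequence can be reproduced with a small simulation loop. This is a sketch under stated assumptions: a two-square world with squares A and B, a deterministic environment, and hypothetical names `reflex_agent` and `run`.

```python
def reflex_agent(percept):
    """Simple reflex rules: the action depends on the current percept only."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    return "Right" if location == "A" else "Left"


def run(agent, dirt, location, steps):
    """Return the interaction trace [(p_0, a_0), (p_1, a_1), ...]."""
    dirt = dict(dirt)  # copy so the caller's dict is not modified
    trace = []
    for _ in range(steps):
        percept = (location, "Dirty" if dirt[location] else "Clean")
        action = agent(percept)
        trace.append((percept, action))
        # Environment update (assumed deterministic).
        if action == "Suck":
            dirt[location] = False
        elif action == "Right":
            location = "B"
        elif action == "Left":
            location = "A"
    return trace


trace = run(reflex_agent, {"A": True, "B": True}, "A", 4)
for percept, action in trace:
    print(percept, "->", action)
```

Because the agent has no memory, it keeps moving back and forth after both squares are clean.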
Model-based Reflex Agent
• Maintains a state variable to keep track of aspects of the environment that
cannot be currently observed, i.e., it has memory and knows how the
environment reacts to actions.
• The state is updated using the percept.
• There is now more information for the rules to make better decisions.
$a = f(p, s)$
The interaction is a sequence: $s_0, a_0, p_1, s_1, a_1, p_2, s_2, a_2, p_3, \ldots, p_t, s_t, a_t, \ldots$
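A minimal Python sketch of a model-based reflex agent for the two-square vacuum world. It assumes that cleaned squares stay clean and that sucking always works; the class name and the state layout are hypothetical.

```python
class ModelBasedVacuumAgent:
    """Reflex agent with memory: a = f(p, s)."""

    def __init__(self):
        # Internal state s: last known status of each square (None = unknown).
        self.state = {"A": None, "B": None}

    def __call__(self, percept):
        location, status = percept
        # Update the state using the percept.
        self.state[location] = status
        if status == "Dirty":
            # Assumed transition model: Suck leaves the current square clean.
            self.state[location] = "Clean"
            return "Suck"
        if all(v == "Clean" for v in self.state.values()):
            return "NoOp"  # everything is known to be clean, so stop moving
        return "Right" if location == "A" else "Left"


agent = ModelBasedVacuumAgent()
percepts = [("A", "Dirty"), ("A", "Clean"), ("B", "Dirty"), ("B", "Clean")]
actions = [agent(p) for p in percepts]
print(actions)
```

Unlike a simple reflex agent, this agent can stop once its state says that both squares are clean, which it cannot observe from a single percept.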
We often construct atomic labels from factored information. E.g.: If the agent’s state is
the coordinate x = 7 and y = 3, then the atomic state label could be the string “(7, 3)”.
With the atomic representation, we can only compare if two labels are the same. With
the factored state representation, we can reason more and calculate the distance
between states!
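The difference can be shown in a few lines of Python. The second state (4, 7) and the choice of Manhattan distance are assumptions made for illustration.

```python
# Factored representation: the state is a tuple of variables (x, y).
factored_a = (7, 3)
factored_b = (4, 7)  # hypothetical second state

# Atomic representation: the state is an opaque label.
atomic_a = str(factored_a)  # the string "(7, 3)"
atomic_b = str(factored_b)

# With atomic labels, the only available operation is an equality check.
same = atomic_a == atomic_b


def manhattan(s, t):
    """Distance between two factored grid states."""
    return abs(s[0] - t[0]) + abs(s[1] - t[1])


# With factored states, we can also measure how far apart two states are.
distance = manhattan(factored_a, factored_b)
print(atomic_a, same, distance)
```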
State Space: The set of all possible states $S$. This set is typically very large!
Old-school vs. Smart Thermostat
Goal-based Agent
The interaction is a sequence: $s_0, a_0, p_1, s_1, a_1, p_2, s_2, a_2, \ldots, s_{goal}$, with an associated cost.
Example: Solving a puzzle. What action gets me closer to the solution?
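A goal-based agent can plan a sequence of actions that ends in the goal state. The sketch below uses breadth-first search on a small grid as a stand-in for a puzzle. The grid, the move set, the unit cost per move, and the function name `plan` are assumptions for illustration.

```python
from collections import deque

# Hypothetical actions and their effect on an (x, y) state.
MOVES = {"Up": (0, 1), "Down": (0, -1), "Left": (-1, 0), "Right": (1, 0)}


def plan(start, goal, width, height):
    """Breadth-first search: returns a shortest list of actions, or None."""
    frontier = deque([start])
    came_from = {start: None}  # state -> (previous state, action taken)
    while frontier:
        state = frontier.popleft()
        if state == goal:
            actions = []
            while came_from[state] is not None:
                state, action = came_from[state]
                actions.append(action)
            return actions[::-1]
        for action, (dx, dy) in MOVES.items():
            nxt = (state[0] + dx, state[1] + dy)
            inside = 0 <= nxt[0] < width and 0 <= nxt[1] < height
            if inside and nxt not in came_from:
                came_from[nxt] = (state, action)
                frontier.append(nxt)
    return None


path = plan((0, 0), (2, 1), width=3, height=2)
print(path)
```

With every move costing the same, breadth-first search returns a plan with the fewest actions, so the cost of reaching the goal is the length of the plan.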
Utility-based Agent
• The agent uses a utility function to evaluate the desirability of each possible
state. This is typically expressed as the reward of being in a state, $R(s)$.
• Choose actions to stay in desirable states.
• Performance measure: The discounted sum of expected utility over time.
$a = \arg\max_{a_0 \in A} \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^t r_t\right]$
(expected future discounted reward)
The interaction is a sequence: $s_0, a_0, p_1, s_1, a_1, p_2, s_2, a_2, \ldots$, with a reward collected along the way.
Example: An autonomous Mars rover prefers states where its battery is not critically low.
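The discounted sum of rewards can be computed directly. The sketch below compares two candidate first actions for the rover by the discounted return of the reward sequence that follows each one. The reward sequences, the discount factor, and the action names are invented, and the sequences are assumed to be deterministic, so the expectation reduces to a single sum.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over a finite reward sequence."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


# Hypothetical reward sequences that follow each first action.
CANDIDATES = {
    "recharge": [0.0, 1.0, 1.0, 1.0],
    "keep_driving": [1.0, 1.0, -10.0, -10.0],  # battery becomes critically low
}


def best_action(candidates, gamma):
    """Pick the first action with the highest discounted return."""
    return max(candidates, key=lambda a: discounted_return(candidates[a], gamma))


choice = best_action(CANDIDATES, gamma=0.9)
print(choice, discounted_return(CANDIDATES[choice], 0.9))
```

With stochastic outcomes, each candidate would need an average over sampled or enumerated reward sequences instead of a single sequence.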
Agents that Learn
Exploration
Example: Smart Thermostat
Change temperature when you are too cold/warm.
Smart thermostat
Percepts:
• Temp: deg. F
• Outside temp.
• Weather report
• Energy curtailment
• Someone walking by
• Someone changes temp.
• Day & time
• …
States (factored states):
• Estimated time to cool the house
• Someone home?
• How long till someone is coming home?
• A/C: on, off
Example: Modern Vacuum Robot
Features are:
• Control via App
• Cleaning Modes
• Navigation
• Mapping
• Boundary blockers
Source: https://www.techhive.com/article/3269782/best-robot-vacuum-cleaners.html
PEAS Description of a Modern Robot Vacuum
Performance measure | Environment | Actuators | Sensors
What Type of Intelligent Agent is a Modern Robot Vacuum?
Performance measure | Environment | Actuators | Sensors
How does ChatGPT work?
What Type of Intelligent Agent is ChatGPT?
Utility-based agents: Does it collect utility over time? How would the utility for each state be defined?
Is it learning?