04 Search With Uncertainty
Artificial Intelligence
Search with Uncertainty
AIMA Chapters 4.3-4.5
Types of uncertainty we consider for now:
• No observations: sensorless problems.
• Partially observable environments: the agent does not know what state the environment is in.
• Exploration: unknown environments and online search.
The solution of the planning phase is a sequence of actions, also called a plan, that can be followed blindly: [Suck, Right, Suck]
Consequence of Uncertainty
The solution is typically not a fixed, precomputed plan (sequence of actions) but a conditional plan that branches on what the agent learns during execution.
Example transition:
Results(s1, a) = {s2, s4, s5}
Results(1, Suck) = {5, 7}
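As a concrete illustration (a sketch, not part of the slides), a nondeterministic transition model can be stored as a table mapping state-action pairs to sets of possible successor states:

```python
# Sketch: a nondeterministic transition model as a lookup table.
# Only the example entry from the slide is filled in; the remaining
# (state, action) pairs would be listed the same way.
RESULTS = {
    (1, "Suck"): {5, 7},   # Results(1, Suck) = {5, 7}
}

def results(state, action):
    """Set of states that taking `action` in `state` may lead to."""
    return RESULTS[(state, action)]
```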
[Figure: AND-OR search tree for the erratic vacuum world. OR nodes are states where the agent picks an action (Suck, Right); AND nodes cover all outcome states of an action. Goal states are marked, and one branch ends in a LOOP node: there is no need to continue the search there, since its solution is the same as the one above it. The solution is shown with bold arrows.]
Solution: [Suck, if State = 5 then [Right, Suck] else []]
A solution is a subtree that
1. has only GOAL leaf nodes,
2. specifies one action at each OR node (state), and
3. includes every outcome at each AND node.
AND-OR Tree Search: Idea
[Figure: AND-OR tree for the erratic vacuum world with OR nodes (states, where actions such as Suck and Right are chosen) and AND nodes (the sets of states each action can result in).]
• Descend the tree by trying an action in each OR node and considering all resulting states of the AND nodes.
• Remove branches (actions) if we cannot find a subtree below them that leads only to goal nodes (see failure in the code on the next slide). Loop nodes can be ignored.
• Stop when we find a subtree that has only goal states in all leaf nodes.
• Construct the conditional plan that represents the subtree, starting at the root node:
[Suck, if State = 5 then [Right, Suck] else []]
AND-OR Recursive DFS Algorithm
The resulting conditional plan is a set of nested if-then-else statements.
Notes:
• The DFS search tree is created implicitly by the call stack (recursive algorithm).
• DFS is not optimal! BFS and A* search can be used to find better solutions (e.g., the smallest subtree).
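A minimal Python sketch of this recursive depth-first AND-OR search, following the AND-OR-SEARCH pseudocode in AIMA. The problem interface (initial, is_goal(s), actions(s), and results(s, a) returning the set of possible outcome states) is an assumption for illustration:

```python
def and_or_search(problem):
    """Return a conditional plan, or None (= failure)."""
    return or_search(problem.initial, problem, path=[])

def or_search(state, problem, path):
    if problem.is_goal(state):
        return []                      # empty plan: already at a goal
    if state in path:
        return None                    # loop: fail on this branch
    for action in problem.actions(state):
        plan = and_search(problem.results(state, action),
                          problem, [state] + path)
        if plan is not None:
            return [action, plan]      # action followed by a branch table
    return None                        # no action leads to an all-goal subtree

def and_search(states, problem, path):
    branches = {}
    for s in states:                   # every outcome must be covered
        plan = or_search(s, problem, path)
        if plan is None:
            return None                # one failing outcome fails the AND node
        branches[s] = plan             # read as "if State = s then branches[s]"
    return branches
```

For the erratic vacuum world, this returns a nested structure equivalent to the conditional plan [Suck, if State = 5 then [Right, Suck] else []].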
Use of Conditional Plans
• Planning is done by a goal-based agent.
• The conditional plan can then be executed by a model-based reflex agent.
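A sketch of such an executor, assuming the [action, branch-table] plan representation produced by the search above and a hypothetical environment offering act(a) and observe():

```python
def execute(plan, env):
    """Follow a conditional plan: act, observe, branch, repeat."""
    while plan:                          # the empty plan [] means: goal reached
        action, branches = plan
        env.act(action)
        plan = branches[env.observe()]   # branch on the observed state
```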
Actions to Coerce the World into States
• Actions can reduce the number of possible states.
• Example: deterministic vacuum world. The agent knows neither its position nor the dirt distribution.
Initial belief state: {1, 2, 3, 4, 5, 6, 7, 8}
[Figure: the action Right applied to the initial belief state; the goal states are marked.]
[Figure: the action Suck applied next, shrinking the belief state further.]
Actions to Coerce the World into States
• The action sequence [Right, Suck, Left, Suck] coerces the world into the goal state 7. It works from any initial state!
• There are no observations, so there is no need for a conditional plan.
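A small sketch that checks coercion by applying a fixed action sequence to every state in a belief state; result(s, a) is an assumed deterministic transition function:

```python
def coerce(belief, plan, result):
    """Apply an action sequence to a whole belief state."""
    for action in plan:
        belief = {result(s, action) for s in belief}
    return belief

# For the deterministic sensorless vacuum world we would expect
# coerce({1,2,3,4,5,6,7,8}, ["Right","Suck","Left","Suck"], result) == {7}
```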
Example: The Reachable Belief-State Space for the Deterministic, Sensorless Vacuum World
The size of the belief-state space depends on the number of states N:
|𝒫(S)| = 2^N = 2^8 = 256
Only a small fraction (12 belief states) is reachable from the initial belief state.
There are no observations, so we get a solution sequence from the initial belief state:
[Right, Suck, Left, Suck]
Finding a Solution Sequence
Note: the size of the belief-state space makes this impractical for larger problems!
Other approach:
• Incremental belief-state search: generate a solution that works for one state and check whether it also works for all other states. If it does not, modify the solution slightly. This is similar to local search.
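A sketch of the first approach, breadth-first search directly in belief-state space; the helpers actions, result(s, a) (deterministic), and is_goal(s) are assumptions for illustration:

```python
from collections import deque

def sensorless_search(initial_belief, actions, result, is_goal):
    """BFS over belief states; returns an action sequence or None."""
    start = frozenset(initial_belief)
    frontier = deque([(start, [])])
    reached = {start}
    while frontier:
        belief, plan = frontier.popleft()
        if all(is_goal(s) for s in belief):
            return plan                      # works for every possible state
        for a in actions:
            child = frozenset(result(s, a) for s in belief)
            if child not in reached:
                reached.add(child)
                frontier.append((child, plan + [a]))
    return None
```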
Case Study
[Figure: a walled room; the agent and a goal location x are marked, with distances of 1 m, 2 m, 3 m, and 8 m given.]
The agent can move up, down, right, and left. The agent has no sensors and does not know its current location.
1. Can you navigate to the goal location? How?
2. What would you need to know about the environment?
Percept function:
• Fully observable: Percept(s) = s
• Sensorless: Percept(s) = null
• Partially observable: Percept(s) = o, where o is called an observation and tells us something about s.
Example: Percept(s) = Tile7.
Problem: many states (different orders of the hidden tiles) can produce the same observation!
Use Observations to Learn About the State
Prediction for action a: the predicted belief state is the union of the predictions for all states in the current belief state b:
b̂ = Predict(b, a) = ⋃_{s ∈ b} Predict(s, a)
Update with observation: you receive an observation o and keep only the states that are consistent with the new observation. The belief after observing o is:
b_o = Update(b̂, o) = {s : s ∈ b̂ ∧ Percept(s) = o}
Combining both steps:
b ← Update(Predict(b, a), o)
Example (vacuum world): Update(Predict({1, 3}, Right), [R, Dirty]) = {2}
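The two belief-state operations translate directly into set comprehensions. This sketch assumes environment models result(s, a) (deterministic, for simplicity; with nondeterministic actions the prediction would union over Results(s, a)) and percept(s):

```python
def predict(belief, action, result):
    """b^ = the set of states reachable from b via `action`."""
    return {result(s, action) for s in belief}

def update(belief, observation, percept):
    """Keep only the states consistent with the observation."""
    return {s for s in belief if percept(s) == observation}

# One step of belief maintenance: b <- Update(Predict(b, a), o)
def belief_step(belief, action, observation, result, percept):
    return update(predict(belief, action, result), observation, percept)
```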
Solving Partially Observable Problems
Use an AND-OR tree of belief states to create a conditional plan.
[Figure: AND-OR tree starting at the initial belief state. At each OR node the agent chooses an action; a predict step computes the predicted belief state, and the AND branches cover the possible percepts ([L,Clean], [R,Dirty], [R,Clean]), each followed by an update step that yields the next belief state.]
Solution: [Suck, Right, if b = {6} then Suck else []]
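A sketch of AND-OR search lifted to belief states: OR nodes still choose actions, but the AND branches now range over the possible percepts after the predict step. The interface (actions(belief), result(s, a), percept(s), is_goal(s)) is assumed as before:

```python
def bs_or_search(belief, problem, path=()):
    """Return a conditional plan over belief states, or None (= failure)."""
    if all(problem.is_goal(s) for s in belief):
        return []
    if belief in path:
        return None                              # loop over belief states
    for action in problem.actions(belief):       # e.g., actions legal everywhere in belief
        b_hat = {problem.result(s, action) for s in belief}        # predict
        branches = {}
        for o in {problem.percept(s) for s in b_hat}:              # AND: percepts
            b_o = frozenset(s for s in b_hat if problem.percept(s) == o)  # update
            subplan = bs_or_search(b_o, problem, path + (belief,))
            if subplan is None:
                break                            # this action cannot cover percept o
            branches[o] = subplan
        else:
            return [action, branches]            # every possible percept is covered
    return None
```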
b ← Update(Predict(b, a), o)
• The agent needs to be able to update its belief state following observations in real time! For many practical applications, there is only time to compute an approximate belief state. Such approximate methods are used in control theory and reinforcement learning.
Case Study: Partially Observable 8-Puzzle
1. Give a problem description for each step.
• States:
• Initial state:
• Actions:
• Transition model:
• Goal test:
• Percept function:
• The agent uses the transition function to predict the consequences of its actions. What if the transition function is unknown?
• Online search explores the real world one action at a time. Prediction is replaced by “act” and update by “observe.”
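A sketch of this online loop: Predict becomes acting in the real environment and Update becomes observing it. The environment interface env.act(a) and env.observe() is hypothetical:

```python
def online_agent(env, choose_action, is_goal):
    """Act one step at a time; no transition model is required."""
    observation = env.observe()
    while not is_goal(observation):
        action = choose_action(observation)   # e.g., pick an untried action
        env.act(action)                       # replaces Predict
        observation = env.observe()           # replaces Update
```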