
21CSC206T

ARTIFICIAL INTELLIGENCE
UNIT – 3
Department of CSE
Adversarial Search Methods - Game Playing - Important Concepts
• Adversarial search: Adversarial search is a game-playing technique in which the agents operate in a competitive environment. The agents (multi-agent setting) are given conflicting goals: they compete with one another and try to defeat one another in order to win the game.
• Search based on game theory; agents act in a competitive environment.
• According to game theory, a game is played between two players. To complete the game, one player has to win, and the other automatically loses.
• Such conflicting goals define adversarial search.
• Game playing here covers games decided by human intelligence and logic alone, excluding other factors such as luck.
– Tic-Tac-Toe, Checkers, Chess: only the mind works, no luck is involved.
Adversarial Search Methods - Game Playing - Important Concepts
• Techniques required to get the best optimal solution (choose algorithms that find the best solution within limited time):
– Pruning: a technique that allows ignoring the unwanted portions of a search tree that make no difference to its final result.
– Heuristic Evaluation Function: allows us to approximate the value of states at each level of the search tree, before reaching a goal node.

https://www.tutorialandexample.com/adversarial-search-in-artificial-intelligence/
Game Playing and Knowledge Structure - Elements of Game Playing Search
• To play a game, we use a game tree to enumerate all the possible choices and to pick the best one. A game-playing problem has the following elements:
• S0: the initial state, from which the game begins.
• PLAYER(s): defines which player has the current turn to make a move in state s.
• ACTIONS(s): defines the set of legal moves available in state s.
• RESULT(s, a): the transition model, which defines the result of applying move a in state s.
• TERMINAL-TEST(s): returns true when the game has ended.
• UTILITY(s, p): defines the final value with which the game has ended for player p. This function is also known as the objective function or payoff function; it is the prize the winner gets. For example, in chess or tic-tac-toe there are three possible outcomes (win, lose, or draw) with values +1, -1, or 0:
• (+1): if the PLAYER wins.
• (-1): if the PLAYER loses.
• (0): if there is a draw between the PLAYERS.
• Game tree for Tic-Tac-Toe: nodes are game states; edges are the moves taken by the players.
https://www.tutorialandexample.com/adversarial-search-in-artificial-intelligence/
Game Playing and Knowledge Structure - Elements of Game Playing Search
• INITIAL STATE (S0): the top node in the game tree represents the initial state and shows all the possible choices from which to pick one.
• PLAYER(s): there are two players, MAX and MIN. MAX begins the game by picking one best move and placing an X in an empty square.
• ACTIONS(s): both players can make moves in the empty boxes, turn by turn.
• RESULT(s, a): the moves made by MIN and MAX decide the outcome of the game.
• TERMINAL-TEST(s): when all the empty boxes are filled, the game reaches its terminating state.
• UTILITY: at the end, MAX or MIN wins, and the payoff (+1, -1, or 0) is awarded accordingly.
Game as a Search Problem
• Types of algorithms in adversarial search:
– In a normal search, we follow a sequence of actions to reach the goal or to finish the game optimally.
– In an adversarial search, however, the result depends on the players, who together decide the outcome of the game.
– The solution for the goal state will be an optimal one, because each player tries to win the game along the shortest path and within limited time.
• Minimax Algorithm
• Alpha-Beta Pruning
Minimax Approach
• Minimax/Minmax/MM/Saddle Point:
– A decision strategy from game theory
• Minimize losing chances; maximize winning chances
• A two-player game strategy
• The two players are:
• MAX: tries to increase his chances of winning the game.
• MIN: tries to decrease the chances of MAX winning the game.
• Result of the game / utility value:
– A heuristic evaluation is computed at the leaf nodes and propagated up to the root node
– Backtracking technique: the best choice is made at each level
Minimax Algorithm
• Follows DFS
– It follows one path down to its depth before considering another; a move once made cannot be altered mid-search. In this sense the traversal is DFS, not BFS.
• Algorithm (see the Python sketch below):
– Keep generating the complete game tree / search tree down to a depth limit d.
– Use the utility of nodes at level n to derive the utility of nodes at level n-1 (propagate the values from the leaf nodes up to the current position following the minimax strategy).
– Continue backing up values towards the root (one layer at a time).
– Make the best move from the choices.
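A minimal sketch of the backed-up-value computation described above, assuming the game tree is given explicitly as nested lists whose leaves are utility values; the function name and tree encoding are illustrative, not from the slides.

```python
def minimax(node, maximizing):
    """Return the minimax value of `node`.

    A leaf is a number (its utility); an internal node is a list of
    child nodes. `maximizing` is True when it is MAX's turn.
    """
    if not isinstance(node, list):      # terminal test: leaf reached
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Example: MAX to move at the root of a depth-2 tree.
# MIN picks 3 from [3, 5] and 2 from [2, 9]; MAX then picks 3.
tree = [[3, 5], [2, 9]]
print(minimax(tree, maximizing=True))   # -> 3
```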
Minimax Algorithm
Example 1: Minimax Algorithm
Example 2: Minimax Algorithm
Example 3: Use the Minimax algorithm to compute the minimax value at each node for the game tree below.
Example 4
Minimax Algorithm Example 5
• For example, in the figure, the two players MAX and MIN are present. MAX starts the game by choosing one path and propagating down all the nodes of that path. MAX then backtracks to the initial node and chooses the path where his utility value will be maximum. After this, it is MIN's turn. MIN will also propagate through a path and backtrack, but MIN will choose the path that minimizes MAX's winning chances, i.e., the utility value.
• So, if the level is minimizing, the node accepts the minimum value from its successor nodes. If the level is maximizing, the node accepts the maximum value from its successors.
• Note: The time complexity of the MINIMAX algorithm is O(b^d), where b is the branching factor and d is the depth of the search tree.
Alpha-Beta Pruning
• Cuts off the search by exploring fewer nodes
• It makes the same moves as the Minimax algorithm does, but prunes unwanted branches using the pruning technique.
• Alpha-beta pruning works on two threshold values, α and β:
– α: the best (highest) value the MAX player can guarantee so far. It is a lower bound, initialized to negative infinity.
– β: the best (lowest) value the MIN player can guarantee so far. It is an upper bound, initialized to positive infinity.
• So, each MAX node has an α-value, which never decreases, and each MIN node has a β-value, which never increases.
• Note: The alpha-beta pruning technique can be applied to trees of any depth, and it is often possible to prune entire subtrees rather than just leaves.
Alpha-Beta Pruning
• An advanced version of the MINIMAX algorithm
• As with any optimization algorithm, the performance measure is the first consideration
• Drawback of Minimax:
– Explores each node in the tree deeply to provide the best path among all the paths
– This increases time complexity
• Alpha-beta pruning overcomes this drawback by exploring fewer nodes of the search tree.
• 1. Alpha-beta pruning is a modified version of the minimax algorithm. It is an optimization technique for the minimax algorithm.
• 2. The number of nodes (game states) that must be examined in the minimax search grows exponentially with the depth of the tree. We cannot eliminate the exponent completely, but we can effectively cut it in half.
• 3. Pruning allows us to compute the correct minimax decision without having to check every node of the game tree. Because this involves two threshold parameters, alpha and beta, for future expansion, it is referred to as alpha-beta pruning (a minimal sketch follows this list).
• 4. Alpha-beta pruning can be used at any depth in a tree, and it sometimes prunes not only the tree leaves but also entire subtrees.
• 5. The two parameters can be defined as:
• a. Alpha: the best (highest-value) choice we have found so far at any point along the path of the Maximizer. The initial value of alpha is -∞.
• b. Beta: the best (lowest-value) choice we have found so far at any point along the path of the Minimizer. The initial value of beta is +∞.
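A minimal sketch of alpha-beta pruning over the same nested-list tree encoding used in the minimax sketch above; the names are illustrative.

```python
import math

def alphabeta(node, alpha, beta, maximizing):
    """Minimax value of `node` with alpha-beta pruning.

    Leaves are numbers; internal nodes are lists of children.
    alpha = best value MAX can guarantee so far (never decreases);
    beta  = best value MIN can guarantee so far (never increases).
    """
    if not isinstance(node, list):          # terminal test
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:               # MIN above will never allow this
                break                       # prune remaining children
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:               # MAX above will never allow this
                break
        return value

tree = [[3, 5], [2, 9]]
print(alphabeta(tree, -math.inf, math.inf, True))   # -> 3 (the 9 leaf is pruned)
```

Note how the second MIN node stops after seeing the leaf 2: MAX already has α = 3 from the first branch, so nothing MIN finds there can matter.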
Algorithm: Alpha-Beta Pruning
Alpha-Beta Pruning
• Either player may start the game. Following the DFS order, the player chooses one path and descends to its depth, i.e., to where the TERMINAL values are found.
• If the game is started by player P (MAX), he chooses the maximum value in order to increase his winning chances, i.e., the maximum utility value.
• If the game is started by player Q (MIN), he chooses the minimum value in order to decrease P's winning chances, i.e., the best possible minimum utility value.
• Both play the game alternately.
• Evaluation starts from the last level (the leaves) of the game tree, and values are chosen accordingly.
• As in the figure below, the game is started by player Q. He picks the leftmost TERMINAL value and fixes it as beta (β). The next TERMINAL value is then compared with the β-value: if it is smaller than the current β-value, it becomes the new β-value; otherwise β is left unchanged.
• After completing one part, the achieved β-value is moved up to its parent node and fixed there as the other threshold value, α.
• Now it is P's turn; he picks the best maximum value. P moves on to explore the next part only after comparing values with the current α-value: if a value is greater than the current α-value, it replaces α; otherwise the remaining values are pruned.
• The steps are repeated until the result is obtained.
• So, the number of pruned nodes in the above example is four, and MAX wins the game with the maximum UTILITY value, i.e., 3.
• The rule followed is: "Explore nodes only if necessary; otherwise prune the unnecessary nodes."
• Note: The result has the same UTILITY value that we would get from the MINIMAX strategy.
Use the Alpha-Beta pruning algorithm to prune the game tree in Problem 1(a), assuming child nodes are visited from left to right. Show all final alpha and beta values computed at the root, at each internal node explored, and at the top of pruned branches. Note: Follow the algorithm in Figure 5.7 in the textbook [edition 3]. Also show the pruned branches.
Game Theory Problems
• Game theory is a branch of mathematics used to model strategic interaction between different players (agents), all of whom are assumed to be equally rational, in a context with predefined rules (of playing or maneuvering) and outcomes.
• A GAME can be defined as a set of players, actions, strategies, and a final payoff for which all the players are competing.
• Game theory has now become a descriptive framework both for machine learning algorithms and for many everyday situations.
https://www.geeksforgeeks.org/game-theory-in-ai/
Game Theory Problems
• Types of Games
– Zero-Sum and Non-Zero-Sum Games: In non-zero-sum games, there are multiple players, and all of them have the option to gain a benefit from any move by another player. In zero-sum games, however, if one player gains something, the other players are bound to lose a corresponding payoff.
– Simultaneous and Sequential Games: Sequential games are the more popular games, in which every player is aware of the movements of the other players. Simultaneous games are more difficult, as the players act concurrently. BOARD GAMES are the perfect example of sequential games and are also referred to as turn-based or extensive-form games.
– Imperfect Information and Perfect Information Games: In a perfect information game, every player is aware of the other player's moves and also of the various strategies the other player might apply to win the ultimate payoff. In imperfect information games, no player is aware of what the others are up to. CARD games are a good example of imperfect information games, while CHESS is the classic example of a perfect information game.
– Asymmetric and Symmetric Games: Asymmetric games are those in which each player has a different, usually conflicting, final goal. Symmetric games are those in which all players have the same ultimate goal, even though the strategy used by each may be completely different.
– Co-operative and Non-Co-operative Games: In non-co-operative games, every player plays for himself, while in co-operative games, players form alliances in order to achieve the final goal.
https://www.geeksforgeeks.org/game-theory-in-ai/
Game Theory Problems
Examples
• Prisoner's Dilemma (see the payoff-matrix sketch below)
• Closed-bag Exchange Game
• The Friend or Foe Game
• The Iterated Snowdrift Game
https://www.geeksforgeeks.org/game-theory-in-ai/
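To make the payoff idea concrete, here is a minimal sketch of the classic Prisoner's Dilemma payoff matrix, a non-zero-sum game; the payoff numbers are the commonly used textbook values, shown here for illustration only.

```python
# Payoffs as (row player, column player); higher is better.
# Strategies: "C" = cooperate (stay silent), "D" = defect (betray).
PAYOFFS = {
    ("C", "C"): (-1, -1),   # both cooperate: light sentence each
    ("C", "D"): (-3,  0),   # row cooperates, column defects
    ("D", "C"): ( 0, -3),   # row defects, column cooperates
    ("D", "D"): (-2, -2),   # both defect: mutual punishment
}

# Whatever the column player does, "D" gives the row player a higher
# payoff, so defection dominates, even though (C, C) beats (D, D).
for col in ("C", "D"):
    print(col, PAYOFFS[("D", col)][0] > PAYOFFS[("C", col)][0])  # True, True
```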
Constraint Satisfaction Problems (CSP)
• Constraint satisfaction problems (CSPs) are mathematical questions defined as a set of objects whose state must satisfy a number of constraints or limitations.
• CSP solving is a search procedure that operates in a space of constraints.
• Many problems in the world can be represented mathematically as CSPs.
• The solution is typically a state that satisfies all the constraints.
• A constraint satisfaction problem (CSP) consists of:
• CSP = {V, D, C}
– a set of variables, V = {V1, V2, V3, ...}
– a domain for each variable, D = {D1, D2, D3, ...}
– and a set of constraints, C = {C1, C2, C3, ...}
Constraint
• A constraint is a mathematical/logical relationship among the attributes of one or more objects.
• It is important to know the type of constraint:
– Unary constraint: involves a single variable.
– Binary constraint: involves two variables.
– Higher-order constraint: involves 3 or more variables.
• Constraints restrict the values that variables can take.
Example Problems
• Crypt-Arithmetic Puzzles
• Map Coloring
• Crossword Puzzles


Examples of CSPs
Some examples of CSPs in real-world problems:
• Assignment problems
– e.g., who teaches what class
• Timetabling problems
– e.g., which class is offered when and where?
• Transportation scheduling
• Factory scheduling

Crypt-Arithmetic Puzzles

• Constraints:
1. Variables can take values from 0-9.
2. No two variables (letters) may take the same value.
3. The values must be selected in such a way that they comply with the arithmetic properties of the puzzle's equation (see the brute-force sketch below).
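The slides illustrate the puzzle only in figures, so as a concrete sketch here is a brute-force solver for the classic SEND + MORE = MONEY instance (an assumed example, not taken from the slides); it enumerates digit assignments that satisfy the three constraints above.

```python
from itertools import permutations

# Classic instance (assumed for illustration): SEND + MORE = MONEY.
letters = "SENDMORY"                       # 8 distinct letters

for digits in permutations(range(10), len(letters)):
    a = dict(zip(letters, digits))         # constraint 2: all-different
    if a["S"] == 0 or a["M"] == 0:         # leading digits must be nonzero
        continue
    send  = 1000*a["S"] + 100*a["E"] + 10*a["N"] + a["D"]
    more  = 1000*a["M"] + 100*a["O"] + 10*a["R"] + a["E"]
    money = 10000*a["M"] + 1000*a["O"] + 100*a["N"] + 10*a["E"] + a["Y"]
    if send + more == money:               # constraint 3: arithmetic holds
        print(send, "+", more, "=", money) # 9567 + 1085 = 10652
        break
```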
Constraint Domain
• The constraint domain describes the different constraints, operators, arguments, variables, and their domains.
• It consists of:
1. Set of variables (var)
2. Set of all types of functions (f)
3. Legal set of operators (o)
4. Domain variables (dv)
5. Range of variables (rg)
• A constraint domain is a five-tuple, represented as D = {var, f, o, dv, rg}
CSP as a Search Problem
• Initial state:
– {}: all variables are unassigned
• Successor function:
– a value is assigned to one of the unassigned variables such that it causes no conflict
• Goal test:
– a complete assignment
• Path cost:
– a constant cost for each step
• The solution appears at depth n if there are n variables.
Example: The Map Coloring Problem
• The task is to color each region red, green, or blue in such a way that no neighboring regions have the same color.
• To formulate this as a CSP, we define the variables to be the regions: WA, NT, Q, NSW, V, SA, and T.
• The domain of each variable is the set {red, green, blue}.
• The constraints require neighboring regions to have distinct colors: for example, the allowable combinations for WA and NT are the pairs {(red, green), (red, blue), (green, red), (green, blue), (blue, red), (blue, green)}.
• (The constraint can also be represented as the inequality WA ≠ NT.) There are many possible solutions, such as {WA = red, NT = green, Q = red, NSW = green, V = red, SA = blue, T = red} (see the Python encoding below).
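A minimal sketch of this formulation as plain Python data, reused by the backtracking and forward-checking sketches later in this section; the structure names are illustrative.

```python
# Map-coloring CSP for the Australia example: CSP = {V, D, C}.
VARIABLES = ["WA", "NT", "Q", "NSW", "V", "SA", "T"]
DOMAINS = {v: {"red", "green", "blue"} for v in VARIABLES}

# Binary constraints as neighbor pairs: neighbors must get distinct colors.
NEIGHBORS = {
    "WA": ["NT", "SA"],
    "NT": ["WA", "SA", "Q"],
    "Q":  ["NT", "SA", "NSW"],
    "NSW": ["Q", "SA", "V"],
    "V":  ["SA", "NSW"],
    "SA": ["WA", "NT", "Q", "NSW", "V"],
    "T":  [],
}

def consistent(var, color, assignment):
    """True if giving `var` this color conflicts with no assigned neighbor."""
    return all(assignment.get(n) != color for n in NEIGHBORS[var])
```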
Constraint Graph
• A CSP is usually represented as an undirected graph, called a constraint graph, where the nodes are the variables and the edges are the binary constraints.
CSP – Backtracking Algorithm
• The backtracking algorithm is a depth-first search algorithm that methodically investigates the space of potential solutions until one is discovered that satisfies all the constraints.
• The backtracking algorithm is a popular method for solving CSPs. It explores the search space by picking a variable, assigning it a value, and then recursively assigning the remaining variables.
• In the event of a conflict, it goes back and tries a different value for the preceding variable.
CSP – Backtracking Algorithm
• The backtracking algorithm's essential elements are (see the sketch below):
• Variable Ordering: the order in which variables are chosen for assignment.
• Value Ordering: the sequence in which values are tried for each variable.
• Constraint Propagation: reducing the domains of variables based on the constraints as assignments are made.
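A minimal backtracking sketch over the map-coloring structures defined earlier (VARIABLES, DOMAINS, NEIGHBORS, consistent); it uses a trivial left-to-right variable ordering, and the helper names are illustrative.

```python
def backtrack(assignment=None):
    """Depth-first search over assignments; returns a solution dict or None."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(VARIABLES):      # goal test: complete assignment
        return assignment
    var = next(v for v in VARIABLES if v not in assignment)  # variable ordering
    for color in DOMAINS[var]:                 # value ordering
        if consistent(var, color, assignment):
            assignment[var] = color
            result = backtrack(assignment)
            if result is not None:
                return result
            del assignment[var]                # conflict downstream: undo and retry
    return None                                # no value worked: backtrack

print(backtrack())   # e.g. {'WA': 'red', 'NT': 'green', ..., 'T': 'blue'}
```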
Improving Backtracking
CSP – Forward Checking
• Forward-checking algorithm: forward checking is an improvement to the backtracking technique.
• After each assignment, it tracks the remaining legal values of the unassigned variables and reduces the domains of variables whose values are inconsistent with the assigned ones. As a result, the search space is smaller, and constraint propagation is accomplished more effectively.
• IDEA:
• Keep track of the remaining legal values for unassigned variables.
• Terminate the search when any variable has no legal values.
CSP – Forward Checking
• For example, after WA is assigned red, SA can no longer be green.
• When forward checking identifies a failure (an empty domain), it backtracks and searches for another solution (see the sketch below).
https://www.youtube.com/watch?v=R6S7UqkFg8E
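A minimal forward-checking sketch, again over the map-coloring structures above; after each tentative assignment it prunes neighbor domains and fails early if any domain empties. The names are illustrative.

```python
import copy

def forward_check(var, color, domains):
    """Prune `color` from neighbor domains; return new domains or None."""
    new = copy.deepcopy(domains)
    new[var] = {color}
    for n in NEIGHBORS[var]:
        new[n].discard(color)
        if not new[n]:                 # a neighbor has no legal value left
            return None                # failure detected without descending
    return new

def solve(assignment, domains):
    if len(assignment) == len(VARIABLES):
        return assignment
    var = next(v for v in VARIABLES if v not in assignment)
    for color in domains[var]:
        pruned = forward_check(var, color, domains)
        if pruned is not None:         # only descend into viable branches
            result = solve({**assignment, var: color}, pruned)
            if result is not None:
                return result
    return None

print(solve({}, DOMAINS))
```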
Forward Checking
• To understand forward checking, consider the 4-Queens problem.
• If placing queen x on the board hampers the possible positions of queen x+1, then forward checking ensures that queen x is not placed at the selected position, and a new position is looked at instead.
• When Q1 and Q2 are placed in rows 1 and 2 in the left subtree, the search is halted, since no positions are left for Q3 and Q4.
• Forward checking keeps track of the moves that remain available for the unassigned variables.
• The search is terminated when there is no legal move available for some unassigned variable.
Solutions for the 4-Queens Problem


Intelligent Backtracking
• A conflict set is maintained for each variable (e.g., using forward checking).
• Considering the 4-Queens problem, a conflict needs to be detected with the help of the conflict set so that a backtrack can occur.
• Backtracking with respect to the conflict set is called conflict-directed backjumping.
• The backjumping approach cannot, however, undo mistakes already committed in other branches.


Homework
• Solve the given map coloring problem using forward checking.
Homework
• Try to assign {Purple, Pink, Yellow} to the given graph below using intelligent backtracking in CSP.
Homework
• Color the graph given below using heuristics in CSP.
Intelligent Agents
• An agent is anything that can be viewed as perceiving its environment through sensors and acting upon that environment through actuators.
• Human agent: eyes, ears, and other organs for sensors; hands, legs, mouth, and other body parts for actuators.
• Robotic agent: cameras and infrared range finders for sensors; various motors for actuators.
Agents and Environments
• Percept: the agent's perceptual inputs at an instant.
• The agent function maps from percept sequences to actions:
f : P* → A
• The agent program runs on the physical architecture to produce f.
• agent = architecture + program
Vacuum-Cleaner World
• Percepts: location and state of the environment, e.g., [A, Dirty], [B, Clean]
• Actions: Left, Right, Suck, NoOp
Intelligent Agents
• The agent can operate without direct human intervention or other software methods. It controls its own activities and internal state. The agent independently decides which steps to take in its current condition to achieve the best improvements. An agent achieves autonomy when its performance is shaped by its own experiences, in the sense of learning and adapting.
Rationality
• A rational agent is one that does the right thing, i.e., the table for the agent function is filled out "correctly."
• But what does it mean to do the right thing? We use a performance measure to evaluate any given sequence of environment states.
• Definition of a rational agent: For each possible percept sequence, a rational agent should select an action that is expected to maximize its performance measure, given the evidence provided by the percept sequence and whatever built-in knowledge the agent possesses.
• Importantly, we emphasize that performance is assessed in terms of environment states and not agent states; self-assessment is often susceptible to self-delusion.
• A relevant rule of thumb: it is advisable to design performance measures according to what one actually wants in the environment, as opposed to how one believes the agent should behave.
Rationality
What is rational at any given time depends on (at least) four things:
(1) The performance measure (e.g., amount of dirt cleaned)
(2) The agent's prior knowledge (e.g., after cleaning Room A, move to Room B)
(3) The actions the agent can perform (e.g., the agent cannot clean corners but can clean the center)
(4) The agent's percept sequence to date (e.g., what the agent has learnt over time)
Rationality
• Rationality is distinct from omniscience.
• An omniscient agent knows the actual outcome of its actions and can act accordingly; but omniscience is impossible in reality.
• Rationality maximizes expected performance, not actual performance; perfection would mean maximizing actual performance.
• A rational agent should not only gather information but also learn as much as possible from what it perceives (information gathering, exploration).
• A rational agent should be autonomous: it should learn what it can to compensate for partial or incorrect prior knowledge. After sufficient experience of its environment, the behavior of a rational agent can become effectively independent of its prior knowledge.
PEAS
To design a rational agent, we must specify the task environment.
Consider, e.g., the task of designing an automated taxi:
• Performance measure: safety, destination, profits, legality, comfort, ...
• Environment: streets/freeways, traffic, pedestrians, weather, ...
• Actuators: steering, accelerator, brake, horn, speaker/display, ...
• Sensors: video, accelerometers, gauges, engine sensors, keyboard, GPS, ...
PEAS
• To design a rational agent, we must specify the task environment.
• PEAS stands for Performance measure, Environment, Actuators, and Sensors. PEAS defines AI models and helps determine the task environment for an intelligent agent.
• Performance measure: defines the success of an agent. It gives the criteria that determine whether the system performs well.
• Environment: the external context in which an AI system operates. It encapsulates the physical and virtual surroundings, including other agents, objects, and conditions.
• Actuators: responsible for executing actions based on the decisions made. They interact with the environment to bring about desired changes.
• Sensors: the means by which an agent observes and perceives its environment.
PEAS Example
Environment Types
https://artificialintelligence.readthedocs.io/en/latest/part1/chap2.html
https://www.youtube.com/watch?v=V6lWWaAInvg
task environment | observable | determ./stochastic | episodic/sequential | static/dynamic | discrete/continuous | agents
crossword puzzle | fully | deterministic | sequential | static | discrete | single
chess with a clock | fully | strategic | sequential | semi | discrete | multi
poker | partial | stochastic | sequential | static | discrete | multi
backgammon | fully | stochastic | sequential | static | discrete | multi
taxi driving | partial | stochastic | sequential | dynamic | continuous | multi
medical diagnosis | partial | stochastic | sequential | dynamic | continuous | single
image analysis | fully | deterministic | episodic | semi | continuous | single
part-picking robot | partial | stochastic | episodic | dynamic | continuous | single
refinery controller | partial | stochastic | sequential | dynamic | continuous | single
interactive English tutor | partial | stochastic | sequential | dynamic | discrete | multi
Agent Types
• Five basic types, in order of increasing generality:
• Table-driven agents
• Simple reflex agents
• Model-based reflex agents
• Goal-based agents
• Utility-based agents
Table-Driven and Simple Reflex Agents
• A table-driven agent looks up its action in a table indexed by the current state of the decision process. (An example for the vacuum world is below.)
function REFLEX-VACUUM-AGENT([location, status]) returns an action
  if status = Dirty then return Suck
  else if location = A then return Right
  else if location = B then return Left
• The simple reflex agent: these agents select actions on the basis of the current percept, ignoring the rest of the percept history.
• NO MEMORY: a simple reflex agent fails if the environment is only partially observable (example: the vacuum-cleaner world). A runnable version of the pseudocode follows.
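A minimal runnable Python version of the reflex vacuum agent pseudocode above; the function name and percept encoding are illustrative.

```python
def reflex_vacuum_agent(percept):
    """Simple reflex agent: the action depends only on the current percept."""
    location, status = percept
    if status == "Dirty":
        return "Suck"
    elif location == "A":
        return "Right"
    elif location == "B":
        return "Left"

print(reflex_vacuum_agent(("A", "Dirty")))   # Suck
print(reflex_vacuum_agent(("A", "Clean")))   # Right
print(reflex_vacuum_agent(("B", "Clean")))   # Left
```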
The main difference between simple reflex agents and model-
based reflex agents is that the latter keep track of the state of the
world.
Model-Based Reflex Agents
• A model-based reflex agent maintains a description of the current world state by modeling how the world changes and how its actions change the world.
• This can work even with partial information.
• It is unclear what to do without a clear goal.
function MODEL-BASED-REFLEX-AGENT(percept) returns an action
  persistent: state, the agent's current conception of the world state
              model, a description of how the next state depends on the current state and action
              rules, a set of condition-action rules
              action, the most recent action, initially none
  state ← UPDATE-STATE(state, action, percept, model)
  rule ← RULE-MATCH(state, rules)
  action ← rule.ACTION
  return action
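A minimal Python sketch of the pseudocode above, assuming a toy vacuum-world model; all names and the model itself are illustrative, not from the slides.

```python
class ModelBasedReflexAgent:
    """Keeps an internal state updated from the last action and percept."""

    def __init__(self, model, rules):
        self.state = {}          # agent's conception of the world state
        self.model = model       # how the next state depends on state/action
        self.rules = rules       # condition-action rules
        self.action = None       # most recent action, initially none

    def __call__(self, percept):
        self.state = self.model(self.state, self.action, percept)
        # the first rule whose condition matches the state decides the action
        self.action = next(act for cond, act in self.rules if cond(self.state))
        return self.action

# Toy vacuum-world instance (illustrative):
model = lambda state, action, percept: {**state, "loc": percept[0], "status": percept[1]}
rules = [
    (lambda s: s["status"] == "Dirty", "Suck"),
    (lambda s: s["loc"] == "A", "Right"),
    (lambda s: s["loc"] == "B", "Left"),
]
agent = ModelBasedReflexAgent(model, rules)
print(agent(("A", "Dirty")))   # Suck
```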
The main difference between model-based reflex agents and goal-based agents is that the latter do not act on fixed condition-action rules, but on some sort of goal information that describes situations that are desirable (e.g., in the case of route-finding, the destination).
Goal-Based Agents
• Goals provide a reason to prefer one action over another. We need to predict the future: we need to plan and search.
The main difference between goal-based agents and utility-based agents is that the performance measure is more general. It does not only consider a binary distinction between "goal achieved" and "goal not achieved", but allows comparing different world states according to their relative utility or expected utility (i.e., how happy the agent is with the resulting state, based, for example, on speed and energy consumption in routing).
Utility-Based Agents
• Some solutions to goal states are better than others. Which one is best is given by a utility function.
• Which combination of goals is preferred?
Learning Agents
• How does an agent improve over time? By monitoring its performance and suggesting better modeling, new action rules, etc.
• In the standard learning-agent diagram: a critic evaluates the current world state; a learning element changes the action rules; the "old agent" (the performance element) models the world and decides on the actions to be taken; a problem generator suggests explorations.
References
1. Parag Kulkarni, Prachi Joshi, "Artificial Intelligence - Building Intelligent Systems", PHI Learning Private Ltd, 2015.
2. Kevin Knight, Elaine Rich, B. Nair, "Artificial Intelligence (SIE)", McGraw Hill, 2008.
3. Stuart Russell and Peter Norvig, "Artificial Intelligence - A Modern Approach", 2nd Edition, Pearson Education, 2007.
4. www.javatpoint.com
5. www.geeksforgeeks.org
6. www.mygreatlearning.com
7. www.tutorialspoint.com
8. https://www.youtube.com/watch?v=cS7Khd0qtVY
9. https://www.youtube.com/watch?v=R6S7UqkFg8E