AI Lecture 4


Debela Desalegn

April 06, 2025


Game Playing State-of-the-Art
 Checkers:
 1950: First computer player
 1959: Samuel’s self-taught program
 1994: First computer champion: Chinook ended the 40-year reign of human champion Marion Tinsley using a complete 8-piece endgame database.
 2007: Checkers solved!

 Chess:
 1945-1960: Zuse, Wiener, Shannon, Turing,
Newell&Simon, McCarthy
 1960-1996: gradual improvements
 1997: Deep Blue defeats human champion Garry Kasparov in a six-game match
 2024: Stockfish rating 3631 (vs 2847 for Magnus Carlsen)

 Go:
 1968: Zobrist’s program plays legal Go, barely (b>300!)
 1968-2005: various ad hoc approaches tried, novice level
 2005-2014: Monte Carlo tree search -> strong amateur
 2017-2017: Alphago defeats human world champion
 2022: human exploits NN weakness to defeat top Go
programs
Types of Games

 Zero-Sum Games
 Agents have opposite utilities (values on outcomes)
 Lets us think of a single value that one maximizes and the other minimizes
 Adversarial, pure competition

 General Games
 Agents have independent utilities (values on outcomes)
 Cooperation, indifference, competition, and more are all possible
 We don’t make AI to act in isolation, it should a) work around people and b) help people
 That means that every AI agent needs to solve a game
Types of Games
 Many different kinds of games!

 Axes:
 Zero sum?
 Deterministic or stochastic?
 One, two, or more players?
 Perfect information (can you see the state)?

 Want algorithms for calculating a strategy (policy) which recommends a move from each state --- i.e. not just a sequence of actions
Adversarial Games:
Deterministic, 2-player, zero-sum, perfect information
Formalization
Our formalization of adversarial games:
 States: S (start at s0)
 Players: P={MAX, MIN}
 Actions: A (may depend on player / state)
 Transition Function: S × A → S
 Terminal Test: S → {true, false}
 Terminal Utilities: S → R (R = ”Reward” ≈ score)
MAX maximizes R
MIN minimizes R

Solution for a player is a policy: S → A
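
As a concrete illustration, a minimal Python sketch of this formalization might look as follows (the Game class and its method names are our own illustrative choices, not notation from the lecture):

class Game:
    """Illustrative interface for a deterministic, 2-player, zero-sum game."""
    def initial_state(self):
        raise NotImplementedError          # s0
    def player(self, state):
        raise NotImplementedError          # whose turn: "MAX" or "MIN"
    def actions(self, state):
        raise NotImplementedError          # A (may depend on player / state)
    def result(self, state, action):
        raise NotImplementedError          # transition function: S x A -> S
    def is_terminal(self, state):
        raise NotImplementedError          # terminal test: S -> {true, false}
    def utility(self, state):
        raise NotImplementedError          # terminal utility: S -> R (MAX maximizes, MIN minimizes)

A policy is then any function that maps a state to one of its legal actions.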


Single-Agent Trees

Value of a State
 Value of a state: the best achievable outcome (utility) from that state
 Non-terminal states: V(s) = max over successors s' of V(s')
 Terminal states: V(s) is a known utility
Adversarial Game Trees

Minimax Values
 States under the agent’s control: V(s) = max over successors s' of V(s')
 States under the opponent’s control: V(s) = min over successors s' of V(s')
 Terminal states: V(s) is a known utility
Tic-Tac-Toe Game Tree
Adversarial Search (Minimax)
 Deterministic, zero-sum games:
 Tic-tac-toe, chess, checkers
 One player maximizes the result
 The other minimizes the result

 Minimax search:
 A state-space search tree
 Players alternate turns
 Compute each node’s minimax value: the best achievable utility against a rational (optimal) adversary
 Minimax values are computed recursively; terminal values are part of the game
Minimax Implementation

def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, min-value(successor))
    return v

def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, max-value(successor))
    return v
Minimax Implementation (Dispatch)
def value(state):
    if the state is a terminal state: return the state’s utility
    if the next agent is MAX: return max-value(state)
    if the next agent is MIN: return min-value(state)

def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor))
    return v

def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor))
    return v
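
A runnable Python version of this dispatch, written against the illustrative Game interface sketched earlier (the function names are ours, not the lecture’s):

import math

def value(game, state):
    """Minimax value of a state: dispatch on terminal / MAX-to-move / MIN-to-move."""
    if game.is_terminal(state):
        return game.utility(state)
    if game.player(state) == "MAX":
        return max_value(game, state)
    return min_value(game, state)

def max_value(game, state):
    v = -math.inf
    for action in game.actions(state):
        v = max(v, value(game, game.result(state, action)))
    return v

def min_value(game, state):
    v = math.inf
    for action in game.actions(state):
        v = min(v, value(game, game.result(state, action)))
    return v

def minimax_decision(game, state):
    """Best action for MAX at the root: the successor with the highest minimax value."""
    return max(game.actions(state),
               key=lambda a: value(game, game.result(state, a)))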
Minimax Example

(Worked example: a game tree with terminal values 3, 12, 8, 2, 4, 6, 14, 5, 2; the minimax value of the root is 3.)
Minimax Properties

(Example: a MAX root over two MIN nodes whose leaves are 10, 10 and 9, 100; minimax chooses the left branch, worth 10.)

Optimal against a perfect player. Otherwise?


Minimax Efficiency

 How efficient is minimax?


 Just like (exhaustive) DFS
 Time: O(b^m)
 Space: O(bm)

 Example: For chess, b =35, m=100


 Exact solution is completely infeasible
 But, do we need to explore the whole
tree?
Resource Limits
Game Tree Pruning
Minimax Example

(The same example tree, with terminal values 3, 12, 8, 2, 4, 6, 14, 5, 2.)
Minimax Pruning

(With pruning, the leaves 4 and 6 are never examined: once the second MIN node’s value drops to 2, it can no longer beat the 3 that MAX already has from the first branch.)
Alpha-Beta Pruning
 General configuration (MIN version)
 We’re computing the MIN-VALUE at some node n
 We’re looping over n’s children
 n’s estimate of the children’s min is dropping
 Who cares about n’s value? MAX
 Let α be the best value that MAX can get at any choice point along the current path from the root
 If n becomes worse than α, MAX will avoid it, so we can stop considering n’s other children (it’s already bad enough that it won’t be played)

 MAX version is symmetric


Alpha-Beta Implementation

α: MAX’s best option on path to root
β: MIN’s best option on path to root

def max-value(state, α, β):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor, α, β))
        if v ≥ β: return v
        α = max(α, v)
    return v

def min-value(state, α, β):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor, α, β))
        if v ≤ α: return v
        β = min(β, v)
    return v
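
The same pruning logic as a runnable Python sketch, again against the illustrative Game interface (names are ours):

import math

def alphabeta_value(game, state, alpha=-math.inf, beta=math.inf):
    """Minimax value with alpha-beta pruning.
    alpha: best value MAX can already guarantee on the path to the root.
    beta:  best value MIN can already guarantee on the path to the root."""
    if game.is_terminal(state):
        return game.utility(state)
    if game.player(state) == "MAX":
        v = -math.inf
        for action in game.actions(state):
            v = max(v, alphabeta_value(game, game.result(state, action), alpha, beta))
            if v >= beta:            # MIN above would never let play reach this node
                return v
            alpha = max(alpha, v)
        return v
    else:
        v = math.inf
        for action in game.actions(state):
            v = min(v, alphabeta_value(game, game.result(state, action), alpha, beta))
            if v <= alpha:           # MAX above already has a better option
                return v
            beta = min(beta, v)
        return v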
Alpha-Beta Pruning Properties
 This pruning has no effect on the minimax value computed for the root!
 Values of intermediate nodes might be wrong
 Important: children of the root may have the wrong value
 Important: tie-break action selection to favor the earlier node explored

 Good child ordering improves the effectiveness of pruning

 With “perfect ordering”:
 Time complexity drops to O(b^(m/2))
 Doubles the solvable depth!
 Full search of, e.g., chess is still hopeless…

 This is a simple example of metareasoning (computing about what to compute)
Alpha-Beta Quiz
Alpha-Beta Quiz 2
Resource Limits
 Problem: In realistic games, we cannot search to the leaves!

 Solution: Depth-limited search
 Instead, search only to a limited depth in the tree
 Replace terminal utilities with an evaluation function for non-terminal positions

 Example:
 Suppose we have 100 seconds and can explore 10K nodes/sec
 So we can check 1M nodes per move
 Alpha-beta reaches about depth 8 – a decent chess program

 Guarantee of optimal play is gone
 More plies makes a BIG difference
 Use iterative deepening for an anytime algorithm
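
A minimal sketch of depth-limited minimax in Python (the evaluate parameter stands in for the evaluation function discussed next; the function name is ours):

def depth_limited_value(game, state, depth, evaluate):
    """Minimax restricted to 'depth' plies; at the cutoff, non-terminal
    states are scored by the evaluation function instead of true utilities."""
    if game.is_terminal(state):
        return game.utility(state)
    if depth == 0:
        return evaluate(state)       # evaluation function replaces terminal utility
    values = [depth_limited_value(game, game.result(state, a), depth - 1, evaluate)
              for a in game.actions(state)]
    return max(values) if game.player(state) == "MAX" else min(values)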
Evaluation Functions
 Evaluation functions score non-terminals in depth-limited search

 Ideal function: returns the actual minimax value of the position


 In practice: typically a weighted linear sum of features:
   Eval(s) = w1·f1(s) + w2·f2(s) + … + wn·fn(s)
 e.g. f1(s) = (num white queens – num black queens), etc.
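
A weighted linear evaluation might be sketched like this; the features, weights, and state attributes below are made-up illustrations, not values from the lecture:

def evaluate(state, weights, features):
    """Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)."""
    return sum(w * f(state) for w, f in zip(weights, features))

# Hypothetical chess-style features with hand-picked (untuned) weights:
features = [lambda s: s.num_white_queens - s.num_black_queens,
            lambda s: s.num_white_pawns - s.num_black_pawns]
weights = [9.0, 1.0]
# score = evaluate(state, weights, features)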


Pitfall: Thrashing with Bad Evaluation Function

 A danger of depth-limited search with not-so-great evaluation functions


 Pacman knows his score will go up by eating the dot now (west, east)
 Pacman knows his score will go up just as much by eating the dot later (east, west)
 There are no point-scoring opportunities after eating the dot (within the horizon, two here)
 Therefore, waiting seems just as good as eating: he may go east, then back west in the next
round of replanning!
Depth Matters
 Evaluation functions are always imperfect
 The deeper in the tree the evaluation function is buried, the less the quality of the evaluation function matters
 An important example of the tradeoff between complexity of features and complexity of computation
Iterative Deepening
Iterative deepening using Minimax (or Alpha-Beta) as a subroutine. Until we run out of time:
1. Do a Minimax up to depth 1, using the evaluation function at depth 1
2. Do a Minimax up to depth 2, using the evaluation function at depth 2
3. Do a Minimax up to depth 3, using the evaluation function at depth 3
4. Do a Minimax up to depth 4, using the evaluation function at depth 4
…

When out of time: return the result from the deepest search that was fully completed.
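
A sketch of that anytime loop in Python, reusing the depth_limited_value sketch from the Resource Limits section (the time handling is our own simplification: a real implementation would also abort a search that is still running when the clock expires):

import time

def iterative_deepening_decision(game, state, evaluate, time_limit):
    """Search deeper and deeper until time runs out; return the best move
    found by the deepest search that completed."""
    deadline = time.monotonic() + time_limit
    best_action = None
    depth = 1
    while time.monotonic() < deadline:
        best_action = max(
            game.actions(state),
            key=lambda a: depth_limited_value(game, game.result(state, a),
                                              depth - 1, evaluate))
        depth += 1
    return best_action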
Synergies between Evaluation Function and Alpha-Beta?

 Alpha-Beta: amount of pruning depends on expansion ordering


 Evaluation function can provide guidance to expand most promising nodes first (which later
makes it more likely there is already a good alternative on the path to the root)
 (somewhat similar to role of A* heuristic, CSPs filtering)

 Alpha-Beta: (similar for roles of min-max swapped)


 Value at a min-node will only keep going down
 Once value of min-node lower than better option for max along path to root, can prune
 Hence: IF the evaluation function provides an upper bound on the value at a min-node, AND that upper bound is already lower than MAX’s better option along the path to the root, THEN we can prune
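
One way to exploit the first synergy is to sort children by the evaluation function before recursing, so the most promising successors are expanded first and cutoffs occur earlier. A sketch for a MAX node, reusing alphabeta_value from the earlier sketch (a full implementation would apply the same ordering inside alphabeta_value itself):

import math

def ordered_max_value(game, state, alpha, beta, evaluate):
    """MAX node that expands children in descending evaluation order,
    which tends to raise alpha quickly and trigger more pruning."""
    successors = [game.result(state, a) for a in game.actions(state)]
    successors.sort(key=evaluate, reverse=True)   # most promising first
    v = -math.inf
    for child in successors:
        v = max(v, alphabeta_value(game, child, alpha, beta))
        if v >= beta:
            return v
        alpha = max(alpha, v)
    return v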
MiniMiniMax and Emerging Coordination
 Minimax can be extended to more than 2 players
 e.g. 2 ghosts and 1 pacman

 Result: even though the 2 ghosts independently run their own MiniMiniMax search, they will naturally coordinate because:
 They optimize the same objective
 They know they optimize the same objective (i.e. they know the other ghost is also a minimizer)
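
A sketch of the generalization: the same dispatch as before, but game.player(state) may now name any of several agents, and every agent that is not MAX minimizes the shared value (interface and names are illustrative):

def multi_agent_value(game, state):
    """Minimax with one MAX agent (Pacman) and any number of MIN agents
    (ghosts): whoever moves optimizes the same single value."""
    if game.is_terminal(state):
        return game.utility(state)
    values = [multi_agent_value(game, game.result(state, a))
              for a in game.actions(state)]
    return max(values) if game.player(state) == "MAX" else min(values)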
Summary
 Games are decision problems with 2 or more agents
 Huge variety of issues and phenomena depending on details of interactions and payoffs
 For zero-sum games, optimal decisions defined by minimax
 Implementable as a depth-first traversal of the game tree
 Time complexity O(b^m), space complexity O(bm)
 Alpha-beta pruning
 Preserves optimal choice at the root
 alpha/beta values keep track of best obtainable values from any max/min nodes on path from root to current node
 Time complexity drops to O(b^(m/2)) with ideal node ordering
 Exact solution is infeasible even for “small” games like chess
 Evaluation function
 Iterative deepening (i.e. go as deep as time allows)
 Emergence of coordination:
 For 3 or more agents (each a MIN or MAX agent), coordination will naturally emerge from each agent independently optimizing its actions through search, as long as each knows, for every other agent, whether that agent is MIN or MAX
