05-games

The document discusses adversarial search in AI, particularly focusing on game playing as a reasoning problem that allows for direct comparisons between AI and human players. It outlines the characteristics of strategic games, the structure of game trees, and the minimax algorithm used to determine optimal moves against an opponent. The document also highlights the challenges of computational feasibility in complex games like chess, emphasizing the limitations of the minimax algorithm in practical applications.

Uploaded by

maliksourabh16

Adversarial Search

(Based on slides of Stuart Russell, Henry Kautz, Linda Shapiro & UW AI Faculty)
Game Playing

Why do AI researchers study game playing?

1. It’s a good reasoning problem, formal and nontrivial.

2. Direct comparison with humans and other computer programs is easy.

What Kinds of Games?
Mainly games of strategy with the following characteristics:

1. Sequence of moves to play
2. Rules that specify possible moves
3. Rules that specify a payment for each move
4. Objective is to maximize your payment
Games vs. Search Problems

• Unpredictable opponent ⇒ a solution must specify a move for every possible opponent reply

• Time limits ⇒ unlikely to find goal, must approximate

Two-Player Game
[Flowchart: Opponent’s Move → Generate New Position → Game Over? If yes, stop; if no → Generate Successors → Evaluate Successors → Move to Highest-Valued Successor → Game Over? If yes, stop; if no, return to Opponent’s Move.]
Games as Adversarial Search
• States:
– board configurations
• Initial state:
– the board position and which player will move
• Successor function:
– returns list of (move, state) pairs, each indicating a legal
move and the resulting state
• Terminal test:
– determines when the game is over
• Utility function:
– gives a numeric value in terminal states
(e.g., -1, 0, +1 for loss, tie, win)
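
The five components above can be sketched for a concrete game. The following is a minimal illustration for tic-tac-toe, not code from the slides: the board encoding (a tuple of nine cells), the function names, and the choice of X as the computer are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]      # 9 cells, each "X", "O", or " "
State = Tuple[Board, str]    # (board configuration, player to move)

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def initial_state() -> State:
    """Initial state: the empty board and which player will move."""
    return (" ",) * 9, "X"


def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player has three in a line, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def terminal(state: State) -> bool:
    """Terminal test: someone has won or the board is full."""
    board, _ = state
    return winner(board) is not None or " " not in board


def successors(state: State) -> List[Tuple[int, State]]:
    """Successor function: (move, state) pairs for every legal move."""
    board, player = state
    if terminal(state):
        return []
    other = "O" if player == "X" else "X"
    result = []
    for i, cell in enumerate(board):
        if cell == " ":
            new_board = board[:i] + (player,) + board[i + 1:]
            result.append((i, (new_board, other)))
    return result


def utility(state: State, computer: str = "X") -> int:
    """Utility in terminal states: +1 win, 0 tie, -1 loss for the computer."""
    w = winner(state[0])
    if w is None:
        return 0
    return 1 if w == computer else -1
```

A move here is just the index of the cell being filled, so `successors(initial_state())` yields nine `(move, state)` pairs.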
Game Tree (2-player, Deterministic, Turns)
[Figure: game tree whose levels alternate between the computer’s turn and the opponent’s turn, ending in leaf nodes that are evaluated.]
• The computer is Max. The opponent is Min.
• At the leaf nodes, the utility function is employed. Big value means good, small is bad.
Mini-Max Terminology
• move: a move by both players
• ply: a half-move
• utility function: the function applied to leaf nodes
• backed-up value
– of a max-position: the value of its largest successor
– of a min-position: the value of its smallest successor
• minimax procedure: search down several levels; at the bottom level apply the utility function, back up values all the way to the root node, and let that node select the move.
Minimax
• Perfect play for deterministic games
• Idea: choose move to position with highest minimax value
= best achievable payoff against best play
• E.g., 2-ply game:

[Figure sequence: minimax worked example on a binary tree whose levels alternate max, min, max, min above 16 leaves. Values are backed up one node at a time across the slides.]
Leaf values, left to right: 80 30 25 35 55 20 05 65 40 10 70 15 50 45 60 75
Backed-up values at the min level above the leaves: 30 25 20 05 10 15 45 60
Backed-up values at the max level: 30 20 15 60
Backed-up values at the min level below the root: 20 15
Backed-up value at the root (max): 20
Minimax Strategy
• Why do we take the min value every other level of the tree?

• These nodes represent the opponent’s choice of move.

• The computer assumes that the human will choose that move that is of least value to the computer.
Minimax algorithm
Adversarial analogue of DFS
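
As a rough sketch of that depth-first recursion, the snippet below runs plain minimax over the 16 leaf values used in the worked example. The nested-list tree encoding and the helper names `build_binary_tree`, `minimax`, and `best_move` are assumptions made for illustration, not part of the slides.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]

# Leaf utilities of the worked example, left to right.
LEAVES = [80, 30, 25, 35, 55, 20, 5, 65, 40, 10, 70, 15, 50, 45, 60, 75]


def build_binary_tree(leaves: List[int]) -> Tree:
    """Pair up nodes level by level until a single root remains."""
    level: List[Tree] = list(leaves)
    while len(level) > 1:
        level = [[level[i], level[i + 1]] for i in range(0, len(level), 2)]
    return level[0]


def minimax(node: Tree, is_max: bool) -> int:
    """Backed-up value of a node: max of children at max nodes, min at min nodes."""
    if isinstance(node, int):          # leaf: apply the utility function
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)


def best_move(root: List[Tree]) -> int:
    """Index of the root move leading to the highest minimax value."""
    values = [minimax(child, False) for child in root]
    return values.index(max(values))


TREE = build_binary_tree(LEAVES)
print(minimax(TREE, True))   # 20
print(best_move(TREE))       # 0 (the left move)
```

The recursion visits every leaf once, which is what makes the time cost exponential in the depth.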

Properties of Minimax
• Complete?
– Yes (if tree is finite)
• Optimal?
– Yes (against an optimal opponent)
– No (does not exploit opponent weakness against suboptimal opponent)
• Time complexity?
– O(b^m), for branching factor b and maximum depth m
• Space complexity?
– O(bm) (depth-first exploration)

Good Enough?
• Chess:
– branching factor b≈35
– game length m≈100
– search space b^m ≈ 35^{100} ≈ 10^{154}
• The Universe:
– number of atoms ≈ 10^{78}
– age ≈ 10^{18} seconds
– 10^8 moves/sec x 10^{78} x 10^{18} = 10^{104}
• Exact solution completely infeasible
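
The orders of magnitude above can be checked with a few lines of arithmetic. This is only a sanity check of the slide's figures, working in base-10 logarithms to avoid huge numbers; the variable names are chosen for this example.

```python
import math

b, m = 35, 100
log10_tree = m * math.log10(b)     # log10 of 35^100
log10_budget = 8 + 78 + 18         # log10 of 10^8 * 10^78 * 10^18

print(round(log10_tree, 1))                  # 154.4
print(log10_budget)                          # 104
print(round(log10_tree - log10_budget))      # 50
```

Even with every atom in the universe evaluating 10^8 moves per second for its whole lifetime, the search would fall short by roughly 50 orders of magnitude.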

[Figure sequence: the same 16-leaf tree evaluated left to right. The first min node backs up 30, so the max node above it already has 30. The second min node then reads the leaf 25.]
Do we need to check this node? (the unexamined leaf 35)
No - this branch is guaranteed to be worse than what max already has
[Figure sequence continues: the next max node already has 20 when its second min child reads 05.]
Do we need to check this node? (the unexamined leaf 65)
[Final panel: leaves 35 and 65 are both marked ?? and left unexamined.]
Alpha-Beta
• The alpha-beta procedure can speed up a depth-first minimax search.
• Alpha: a lower bound on the value that a max node may ultimately be assigned: v ≥ α
• Beta: an upper bound on the value that a minimizing node may ultimately be assigned: v ≤ β
Alpha-Beta
MinVal(state, alpha, beta){
    if (terminal(state))
        return utility(state);
    best = +infinity;
    for (s in children(state)){
        child = MaxVal(s, alpha, beta);
        best = min(best, child);
        beta = min(beta, child);
        if (alpha >= beta)
            return child;    // cutoff: MAX already has a better option
    }
    return best;             // best child (min)
}

alpha = the highest value for MAX along the path
beta = the lowest value for MIN along the path
Alpha-Beta
MaxVal(state, alpha, beta){
    if (terminal(state))
        return utility(state);
    best = -infinity;
    for (s in children(state)){
        child = MinVal(s, alpha, beta);
        best = max(best, child);
        alpha = max(alpha, child);
        if (alpha >= beta)
            return child;    // cutoff: MIN already has a better option
    }
    return best;             // best child (max)
}
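
A runnable version of the MaxVal/MinVal pair can be sketched as follows. It is an illustration under assumptions: the tree is the 16-leaf example encoded as nested lists, the names `max_val`, `min_val`, and `visited` are chosen here, and a cutoff returns the best value seen so far rather than the last child.

```python
import math
from typing import List

LEAVES = [80, 30, 25, 35, 55, 20, 5, 65, 40, 10, 70, 15, 50, 45, 60, 75]


def build_binary_tree(leaves):
    """Pair up nodes level by level until one root remains."""
    level = list(leaves)
    while len(level) > 1:
        level = [[level[i], level[i + 1]] for i in range(0, len(level), 2)]
    return level[0]


visited: List[int] = []   # leaves actually evaluated, in order


def max_val(node, alpha, beta):
    if isinstance(node, int):      # terminal(state)
        visited.append(node)
        return node                # utility(state)
    best = -math.inf
    for child in node:
        best = max(best, min_val(child, alpha, beta))
        alpha = max(alpha, best)
        if alpha >= beta:          # MIN above would never allow this line
            break
    return best


def min_val(node, alpha, beta):
    if isinstance(node, int):
        visited.append(node)
        return node
    best = math.inf
    for child in node:
        best = min(best, max_val(child, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:          # MAX above already has something better
            break
    return best


ROOT_VALUE = max_val(build_binary_tree(LEAVES), -math.inf, math.inf)
print(ROOT_VALUE)   # 20
print(visited)      # [80, 30, 25, 55, 20, 5, 40, 10, 70, 15]
```

Only 10 of the 16 leaves are evaluated: 35, 65, and the last four leaves are never touched, yet the root value is the same as with full minimax.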
[Figure sequence: alpha-beta trace on the 16-leaf example tree.]
α - the best value for max along the path
β - the best value for min along the path
Recoverable steps of the trace:
• All nodes start with α=-∞, β=∞.
• The first min node reads 80 (β=80), then 30 (β=30), and backs up 30; the max node above it sets α=30.
• The second min node (α=30) reads 25, so β=25. β≤α: prune! Leaf 35 is not examined.
• The max node backs up 30; the min node above it sets β=30.
• The next max node (α=-∞, β=30) receives 20 from its first min child and sets α=20. Its second min child reads 05, so β=05. β≤α: prune! Leaf 65 is not examined.
• The min node backs up 20 and the root sets α=20.
• In the right subtree (α=20, β=∞) the two min nodes back up 10 and 15, the max node above them backs up 15, and its min parent sets β=15. β≤α: prune! The last four leaves (50, 45, 60, 75) are not examined.
Bad and Good Cases for Alpha-Beta Pruning
• Bad: Worst moves encountered first
[Figure: MAX/MIN/MAX tree with root value 4 in which the worst moves are examined first.]
• Good: Good moves ordered first
[Figure: MAX/MIN/MAX tree with root value 4 and MIN values 4, 3, 2 in which good moves are examined first; nodes marked x are pruned.]
• If we can order moves, we can get more benefit from alpha-beta pruning
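
The effect of move ordering can be measured on the 16-leaf example tree. The sketch below is an illustration only: it orders children using their exact minimax values, which a real program would not know in advance, and the helper names (`reorder`, `leaves_examined`) are assumptions made for this example.

```python
import math

LEAVES = [80, 30, 25, 35, 55, 20, 5, 65, 40, 10, 70, 15, 50, 45, 60, 75]


def build_binary_tree(leaves):
    level = list(leaves)
    while len(level) > 1:
        level = [[level[i], level[i + 1]] for i in range(0, len(level), 2)]
    return level[0]


def minimax(node, is_max):
    if isinstance(node, int):
        return node
    values = [minimax(c, not is_max) for c in node]
    return max(values) if is_max else min(values)


def reorder(node, is_max, best_first):
    """Sort children by true minimax value, best or worst move first."""
    if isinstance(node, int):
        return node
    descending = (is_max == best_first)   # best for MAX is the largest value
    kids = sorted(node, key=lambda c: minimax(c, not is_max), reverse=descending)
    return [reorder(c, not is_max, best_first) for c in kids]


def alphabeta(node, is_max, alpha, beta, visited):
    if isinstance(node, int):
        visited.append(node)
        return node
    best = -math.inf if is_max else math.inf
    for child in node:
        value = alphabeta(child, not is_max, alpha, beta, visited)
        if is_max:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            break
    return best


def leaves_examined(tree):
    visited = []
    value = alphabeta(tree, True, -math.inf, math.inf, visited)
    return value, len(visited)


TREE = build_binary_tree(LEAVES)
print(leaves_examined(TREE))                        # (20, 10)
print(leaves_examined(reorder(TREE, True, True)))   # (20, 7)
print(leaves_examined(reorder(TREE, True, False)))  # (20, 16)
```

The root value is 20 in all three cases; only the number of leaves examined changes, from all 16 with the worst ordering down to 7 with the best.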
Properties of α-β
• Pruning does not affect final result. This means that it gets the exact same result as does full minimax.

• Good move ordering improves effectiveness of pruning

• With "perfect ordering," time complexity = O(b^{m/2})
⇒ doubles depth of search

• A simple example of reasoning about ‘which computations are relevant’ (a form of metareasoning)

Why O(b^{m/2})?
Let T(m) be time complexity of search for depth m

Normally:
T(m) = b·T(m-1) + c ⇒ T(m) = O(b^m)

With ideal α-β pruning:

T(m) = T(m-1) + (b-1)·T(m-2) + c ⇒ T(m) = O(b^{m/2})
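
The two recurrences can be iterated numerically to compare their growth per ply. This is a sketch under assumptions: the base cases T(0) = c and T(1) = b·c + c are chosen here, since the slide does not state them. Under this recurrence the per-ply growth factor tends to the positive root of x² = x + (b-1), which for b = 35 is about 6.35, close to but slightly above √35 ≈ 5.92, so the O(b^{m/2}) figure is best read as an approximation.

```python
import math


def plain_cost(b, m, c=1):
    """T(m) = b*T(m-1) + c, with T(0) = c (assumed base case)."""
    t = c
    for _ in range(m):
        t = b * t + c
    return t


def ideal_alphabeta_cost(b, m, c=1):
    """T(m) = T(m-1) + (b-1)*T(m-2) + c, with assumed T(0) = c, T(1) = b*c + c."""
    prev, cur = c, b * c + c
    if m == 0:
        return prev
    for _ in range(m - 1):
        prev, cur = cur, cur + (b - 1) * prev + c
    return cur


b = 35
growth_plain = plain_cost(b, 121) / plain_cost(b, 120)
growth_ab = ideal_alphabeta_cost(b, 121) / ideal_alphabeta_cost(b, 120)
root = (1 + math.sqrt(4 * b - 3)) / 2   # positive root of x^2 = x + (b - 1)

print(growth_plain)                  # growth per ply without pruning, about b
print(growth_ab, root, math.sqrt(b)) # growth per ply with ideal pruning
```

Each extra ply multiplies the work by about 35 without pruning but only by about 6 with ideal ordering, which is why the reachable depth roughly doubles.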

Node Ordering
Iterative deepening search

Use evaluations of the previous search to order moves

Also helps in returning a move in given time

Good Enough?
• Chess:
– branching factor b≈35
– game length m≈100
– search space b^{m/2} ≈ 35^{50} ≈ 10^{77}
• The Universe:
– number of atoms ≈ 10^{78}
– age ≈ 10^{18} seconds
– 10^8 moves/sec x 10^{78} x 10^{18} = 10^{104}
• The universe can play chess - can we?
Cutting off Search
MinimaxCutoff is identical to MinimaxValue except
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval

Does it work in practice?

b^m = 10^6, b=35 ⇒ m=4

4-ply lookahead is a hopeless chess player!
– 4-ply ≈ human novice
– 8-ply ≈ typical PC, human master
– 12-ply ≈ Deep Blue
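
The two substitutions (Cutoff? for Terminal?, Eval for Utility) can be sketched on the 16-leaf example tree. The evaluation function used here, the average of the leaf values below a node, is a hypothetical stand-in chosen only so the example runs; the slides do not specify one. The first lines also check the depth estimate for a budget of 10^6 nodes.

```python
import math

# Depth reachable with 10^6 node evaluations at branching factor 35.
print(round(math.log(10**6) / math.log(35), 2))   # 3.89, i.e. about 4 ply

LEAVES = [80, 30, 25, 35, 55, 20, 5, 65, 40, 10, 70, 15, 50, 45, 60, 75]


def build_binary_tree(leaves):
    level = list(leaves)
    while len(level) > 1:
        level = [[level[i], level[i + 1]] for i in range(0, len(level), 2)]
    return level[0]


def leaves_of(node):
    if isinstance(node, int):
        return [node]
    return [x for child in node for x in leaves_of(child)]


def evaluate(node):
    """Hypothetical Eval: mean of the leaf utilities below the node."""
    values = leaves_of(node)
    return sum(values) / len(values)


def cutoff(node, depth):
    """Cutoff? replaces Terminal?: stop at leaves or when depth runs out."""
    return isinstance(node, int) or depth == 0


def minimax_cutoff(node, depth, is_max):
    if cutoff(node, depth):
        return evaluate(node)          # Eval replaces Utility
    values = [minimax_cutoff(c, depth - 1, not is_max) for c in node]
    return max(values) if is_max else min(values)


TREE = build_binary_tree(LEAVES)
print(minimax_cutoff(TREE, 4, True))   # 20.0, full-depth search
print(minimax_cutoff(TREE, 2, True))   # 36.25, estimate from a 2-ply search
```

With the full depth the result matches exact minimax; with a shallower cutoff the returned value is only as good as the evaluation function.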
[Figure: the example tree (leaf values 80 30 25 35 55 20 05 65 40 10 70 15 50 45 60 75) with a Cutoff line drawn above the leaves; in the second panel the nodes above the cutoff are labelled 0.]
Evaluation Functions
Tic Tac Toe
• Let p be a position in the game
• Define the utility function f(p) by
– f(p) =
• largest positive number if p is a win for computer
• smallest negative number if p is a win for opponent
• RCDC – RCDO
– where RCDC is number of rows, columns and diagonals in
which computer could still win
– and RCDO is number of rows, columns and diagonals in
which opponent could still win.
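
This evaluation function is small enough to write out directly. The sketch below assumes a board encoded as a 9-character string of "X", "O", and spaces, with X as the computer, and uses `math.inf` as a stand-in for "largest positive number"; these choices are illustrative assumptions.

```python
import math

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def open_lines(board, player):
    """Rows, columns and diagonals in which `player` could still win."""
    other = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != other for i in line))


def is_win(board, player):
    return any(all(board[i] == player for i in line) for line in LINES)


def f(board, computer="X", opponent="O"):
    if is_win(board, computer):
        return math.inf
    if is_win(board, opponent):
        return -math.inf
    return open_lines(board, computer) - open_lines(board, opponent)  # RCDC - RCDO


print(f("         "))   # 0: empty board, 8 - 8
print(f("    X    "))   # 4: X in the centre, 8 - 4
print(f("X        "))   # 3: X in a corner, 8 - 5
print(f("O   X    "))   # 1: X centre, O corner, 5 - 4
```

The centre scores higher than a corner because it blocks four of the opponent's lines instead of three.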

Sample Evaluations
• X = Computer; O = Opponent
[Figure: two sample tic-tac-toe positions, each scored by counting the rows, cols, and diags still open to each player.]
Evaluation functions
• For chess/checkers, typically linear weighted sum of features
Eval(s) = w_1 f_1(s) + w_2 f_2(s) + … + w_m f_m(s)

e.g., w_1 = 9 with
f_1(s) = (number of white queens) – (number of black queens), etc.
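
A linear evaluation of this form is a dot product of weights and feature values. In the sketch below only w_1 = 9 for queens comes from the slide; the other weights (5, 3, 3, 1) are the conventional material values and are an assumption here, as is the dictionary-of-piece-counts state representation.

```python
from typing import Callable, Dict, Sequence

State = Dict[str, int]   # hypothetical state: piece counts such as {"white_queens": 1}


def material_diff(piece: str) -> Callable[[State], int]:
    """Feature: (number of white pieces) - (number of black pieces) of one kind."""
    return lambda s: s.get(f"white_{piece}", 0) - s.get(f"black_{piece}", 0)


def eval_linear(state: State,
                weights: Sequence[float],
                features: Sequence[Callable[[State], int]]) -> float:
    """Eval(s) = w_1 f_1(s) + w_2 f_2(s) + ... + w_m f_m(s)."""
    return sum(w * f(state) for w, f in zip(weights, features))


PIECES = ("queens", "rooks", "bishops", "knights", "pawns")
WEIGHTS = [9, 5, 3, 3, 1]                       # only the 9 is from the slide
FEATURES = [material_diff(p) for p in PIECES]

position = {"white_queens": 1, "black_queens": 0,
            "white_pawns": 6, "black_pawns": 8}
print(eval_linear(position, WEIGHTS, FEATURES))   # 7 = 9*1 + 1*(-2)
```

Keeping weights and features in separate lists makes it easy to tune or learn the weights without touching the feature code.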

Example: Samuel’s Checker-Playing Program
• It uses a linear evaluation function
f(n) = w_1 f_1(n) + w_2 f_2(n) + ... + w_m f_m(n)

For example: f = 6K + 4M + U
– K = King Advantage
– M = Man Advantage
– U = Undenied Mobility Advantage (number of moves that Max has where Min has no jump moves)

Samuel’s Checker Player
• In learning mode

– Computer acts as 2 players: A and B

– A adjusts its coefficients after every move
– B uses the static utility function
– If A wins, its function is given to B

Samuel’s Checker Player
• How does A change its function?
Coefficient replacement
Δ(node) = backed-up value(node) – initial value(node)
if Δ > 0 then terms that contributed positively are given more weight and terms that contributed negatively get less weight
if Δ < 0 then terms that contributed negatively are given more weight and terms that contributed positively get less weight

Chess: Rich history of cumulative ideas
Minimax search, evaluation function learning (1950).

Alpha-Beta search (1966).

Transposition Tables (1967).

Iterative deepening DFS (1975).

End game data bases, singular extensions (1977, 1980)

Parallel search and
Chess game tree

Problem with fixed depth Searches
If we only search n moves ahead, it may be possible that the catastrophe can be delayed by a sequence of moves that do not make any progress.
This also works in the other direction (good moves may not be found).

Problems with a fixed ply: The Horizon Effect

[Figure: two lines of play labelled "Lose queen" and "Lose pawn" above the “look ahead horizon”; below the horizon the outcome is "Lose queen!!!"]

• Inevitable losses are postponed
• Unachievable goals appear achievable
• Short-term gains mask unavoidable consequences (traps)
Solutions
• How to counter the horizon effect
– Feedover
• Do not cut off search at non-quiescent board positions
(dynamic positions)
• Example, king in danger
• Keep searching down that path until reach quiescent
(stable) nodes
– Secondary Search
• Search further down selected path to ensure this is the
best move
Quiescence Search
This involves searching past the terminal search nodes (depth of 0) and testing all the non-quiescent or 'violent' moves until the situation becomes calm, and only then apply the evaluator.

Enables programs to detect long capture sequences and calculate whether or not they are worth initiating.

Expand searches to avoid evaluating a position where tactical disruption is in progress.
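
The idea can be illustrated on a toy tree. This is a simplified sketch, not a chess implementation: each node carries a made-up static evaluation and a `quiet` flag, and at depth 0 the search keeps expanding every child of a non-quiet node, whereas a real program would extend only the 'violent' moves such as captures. The `Node` class and the example values are assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    static_eval: float                 # value the evaluator would return here
    quiet: bool = True                 # False while a tactical exchange is pending
    children: List["Node"] = field(default_factory=list)


def search(node: Node, depth: int, is_max: bool, use_quiescence: bool) -> float:
    if not node.children:
        return node.static_eval
    # At the depth limit, stop unless the position is non-quiescent.
    if depth <= 0 and (node.quiet or not use_quiescence):
        return node.static_eval
    values = [search(c, depth - 1, not is_max, use_quiescence)
              for c in node.children]
    return max(values) if is_max else min(values)


# Move 1 wins a pawn (+1) but allows a recapture that costs the queen (-8).
after_recapture = Node(static_eval=-8)
grab_pawn = Node(static_eval=1, quiet=False, children=[after_recapture])
# Move 2 is a calm move leading to an even position.
calm_move = Node(static_eval=0, children=[Node(static_eval=0)])
root = Node(static_eval=0, children=[grab_pawn, calm_move])

print(search(root, 1, True, use_quiescence=False))   # 1: the pawn grab looks good
print(search(root, 1, True, use_quiescence=True))    # 0: the recapture is seen
```

With the fixed 1-ply cutoff the program would take the pawn; searching on until the position is calm reveals the loss and the calm move is preferred.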

Additional Refinements

• Probabilistic Cut: cut branches probabilistically based on shallow search and global depth-level statistics (forward pruning)

• Openings/Endgames: for some parts of the game (especially initial and end moves), keep a catalog of best moves to make.

• Singular Extensions: find obviously good moves and try them at cutoff.

End-Game Databases

• Ken Thompson - all 5 piece end-games
• Lewis Stiller - all 6 piece end-games
– Refuted common chess wisdom: many positions thought to be ties were really forced wins -- 90% for white
– Is perfect chess a win for white?

The MONSTER

White wins in 255 moves (Stiller, 1991)
Deterministic Games in Practice
• Checkers: Chinook ended 40-year-reign of human world champion Marion
Tinsley in 1994. Used a precomputed endgame database defining perfect
play for all positions involving 8 or fewer pieces on the board, a total of
444 billion positions. Checkers is now solved!
• Chess: Deep Blue defeated human world champion Garry Kasparov in a
six-game match in 1997. Deep Blue searches 200 million positions per
second, uses very sophisticated evaluation, and undisclosed methods for
extending some lines of search up to 40 ply. Current programs are even
better, if less historic!
• Othello: human champions refuse to compete against computers, who are
too good.
• Go: until recently, human champions refused to compete against
computers, who were too bad. In Go, b > 300, so most programs use
pattern knowledge bases to suggest plausible moves, along with
aggressive pruning. In 2016, DeepMind’s AlphaGo defeated Lee Sedol 4-1
to end the human reign.
Game of Go
Human champions refused to compete against computers, because software used to be too bad.

                                 Chess    Go
Size of board                    8x8      19x19
Average no. of moves per game    100      300
Avg branching factor per turn    35       235
Additional complexity: Players can
AlphaGo (2016)
• Combination of
– Deep Neural Networks
– Monte Carlo Tree Search

• More details later.

Other Games
                         deterministic                   chance
perfect information      chess, checkers, go, othello    backgammon, monopoly
imperfect information    stratego                        bridge, poker, scrabble
Games of Chance
• What about games that involve chance, such as
– rolling dice
– picking a card
• Use three kinds of nodes:
– max nodes
– min nodes
– chance nodes
[Figure: tree fragment with min, chance, and max levels.]
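Games of Chance
Expectiminimax
[Figure: chance node c with max children; chance outcomes d_1 ... d_i ... d_k, with S(c,d_i) the set of successor positions of c under outcome d_i.]

expectimax(c) = \sum_i P(d_i) \max_{s \in S(c,d_i)} backed-up-value(s)

expectimin(c') = \sum_i P(d_i) \min_{s \in S(c',d_i)} backed-up-value(s)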
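
A generic version of this backup rule treats chance nodes as a third node type whose value is the probability-weighted sum of its children. The sketch below uses a tagged-tuple tree encoding chosen for this example; the small tree at the end reuses leaf values 3 5 1 4 1 2 4 5 and probabilities .4/.6, but its shape is an assumption, not a reconstruction of the slide's figure.

```python
import math


def expectiminimax(node):
    """Value of a node: a number (leaf), or a (kind, children) pair.

    kind is "max", "min", or "chance"; chance children are
    (probability, child) pairs whose probabilities sum to 1.
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind}")


left = ("chance", [(0.4, ("min", [3, 5])), (0.6, ("min", [1, 4]))])
right = ("chance", [(0.4, ("min", [1, 2])), (0.6, ("min", [4, 5]))])
root = ("max", [left, right])

print(expectiminimax(left))    # about 1.8 = 0.4*3 + 0.6*1
print(expectiminimax(right))   # about 2.8 = 0.4*1 + 0.6*4
print(expectiminimax(root))    # about 2.8, so MAX prefers the right move
```

Because chance values are averages, the exact utility scale matters here, unlike in plain minimax where only the ordering of leaf values matters.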

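Example Tree with Chance
[Figure: example tree with levels max, chance, min, chance, max, leaf; chance branches carry probabilities .4 and .6; leaf values 3 5 1 4 1 2 4 5; one backed-up value shown is 1.2.]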
Complexity
• Instead of O(b^m), it is O(b^m n^m) where n is the number of chance outcomes.

• Since the complexity is higher (both time and space), we cannot search as deeply.

• Pruning algorithms may be applied.

Imperfect Information
• E.g. card games, where opponents’ initial cards are unknown
• Idea: For all deals consistent with what you can see
– compute the minimax value of available actions for each of possible deals
– compute the expected value over all deals
Status of AI Game Players
• Tic Tac Toe
– Tied for best player in world
• Othello
– Computer better than any human
– Human champions now refuse to play computer
• Scrabble
– Maven beat world champions Joel Sherman and Matt Graham
• Backgammon
– 1992, Tesauro combines 3-ply search & neural networks (with 160 hidden units) yielding top-3 player
• Bridge
– Gib ranked among top players in the world
• Poker
– 2015, Heads-up limit hold'em poker is solved
• Checkers
– 1994, Chinook ended 40-year reign of human champion Marion Tinsley
• Chess
– 1997, Deep Blue beat human champion Garry Kasparov in six-game match
– Deep Blue searches 200M positions/second, up to 40 ply
– Now looking at other applications (molecular dynamics, drug synthesis)
• Go
– 2016, DeepMind’s AlphaGo defeated Lee Sedol & 2017 defeated Ke Jie
Summary
• Games are fun to work on!

• They illustrate several important points about AI.

• Perfection is unattainable ⇒ must approximate.

• Game playing programs have shown the world what AI can do.
