Lec7 LU Su20
[slides adapted from Nikita Kitaev, Dan Klein, Pieter Abbeel, Anca Dragan, et al of University of California (ai.berkeley.edu).]
Game Playing State-of-the-Art
Checkers: 1950: First computer player. 1994: First
computer champion: Chinook ended the 40-year reign
of human champion Marion Tinsley using complete
8-piece endgame. 2007: Checkers solved!
Chess: 1997: Deep Blue defeats human champion
Garry Kasparov in a six-game match. Deep Blue
examined 200M positions per second, used very
sophisticated evaluation and undisclosed methods
for extending some lines of search up to 40 ply.
Current programs are even better, if less historic.
Go: Human champions are now starting to be
challenged by machines. In Go, the branching factor b > 300! Classic
programs use pattern knowledge bases, but big
recent advances use Monte Carlo (randomized)
expansion methods.
Go: 2016: AlphaGo defeats human champion.
Uses Monte Carlo Tree Search, learned evaluation
function.
Pacman
Behavior from Computation
Video of Demo Mystery Pacman
Adversarial Games
Types of Games
Many different kinds of games!
Axes:
Deterministic or stochastic?
One, two, or more players?
Zero sum*?
Perfect information (can you see the state)?
Value of a State
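Value of a state: the best achievable outcome (utility) from that state
Non-Terminal States: V(s) = max over children s' of V(s')
Terminal States: V(s) = known
[Figure: single-agent game tree with terminal utilities 2 0 … 2 6 … 4 6]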
Adversarial Game Trees
Terminal States: utility is given by the game
[Figure: adversarial game tree with terminal utilities -8 -5 -10 +8]
Tic-Tac-Toe Game Tree
Adversarial Search (Minimax)
Deterministic, zero-sum games:
Tic-tac-toe, chess, checkers
One player maximizes result
The other minimizes result
Minimax search:
A state-space search tree
Players alternate turns
Compute each node’s minimax value: the best achievable utility against a rational (optimal) adversary
Minimax values: computed recursively
Terminal values: part of the game
[Figure: two-ply tree with max root value 5, min node values 2 and 5, and terminal values 8 2 5 6]
Minimax Implementation
[Figure: example game tree with terminal values 3 12 8 2 4 6 14 5 2]
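The recursive computation of minimax values can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the nested-list tree representation and the names `max_value` and `min_value` are assumptions made for the example, and the grouping of the nine terminal values into three MIN nodes of three children each is also assumed, since only the flat list of values survives here.

```python
# Minimal minimax sketch (hypothetical representation):
# a tree is either a number (terminal utility) or a list of child subtrees.

def max_value(node):
    """Minimax value of a node where the maximizing player moves."""
    if not isinstance(node, list):      # terminal state: utility is part of the game
        return node
    return max(min_value(child) for child in node)

def min_value(node):
    """Minimax value of a node where the minimizing player moves."""
    if not isinstance(node, list):      # terminal state
        return node
    return min(max_value(child) for child in node)

# Assumed grouping of the terminal values 3 12 8 2 4 6 14 5 2:
# a MAX root over three MIN nodes with three leaves each.
example_tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

if __name__ == "__main__":
    print(max_value(example_tree))  # MIN nodes evaluate to 3, 2, 2, so this prints 3
```

Under this assumed grouping, each MIN node takes the smallest of its leaves and the MAX root takes the largest of those results. Every terminal state is visited exactly once, which is why the cost grows with the full size of the tree.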
Minimax Properties
[Figure: max layer over min layer, with terminal values 10 10 9 100]
[Figure: example game tree with terminal values 3 12 8 2 4 6 14 5 2]
Minimax Pruning
[Figure: example game tree after pruning, with terminal values 3 12 8 2 14 5 2]
Alpha-Beta Pruning
General configuration (MIN version)
We’re computing the MIN-VALUE at some node n
We’re looping over n’s children
n’s estimate of the children’s min is dropping
Who cares about n’s value? MAX
Let α be the best value that MAX can get at any choice point along the current path from the root
If n becomes worse than α, MAX will avoid it, so we can stop considering n’s other children (it’s already bad enough that it won’t be played)
[Figure: path from the root alternating MAX and MIN layers, with α marked at a MAX choice point above the MIN node n]
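The pruning rule above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the lecture's implementation: the nested-list tree, the function name `alphabeta`, and the `visited` list (used only to record which terminal states are examined) are hypothetical. It also tracks the symmetric bound for the MAX version, called `beta` here, and prunes on ties (`<=` and `>=`), a common convention that leaves the root value unchanged.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node` with alpha-beta pruning.

    alpha: best value MAX can already get along the current path from the root.
    beta:  best value MIN can already get along the current path from the root.
    visited: optional list that collects the terminal values actually examined.
    """
    if not isinstance(node, list):          # terminal state
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            if value >= beta:               # MIN above will avoid this node
                return value
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        if value <= alpha:                  # MAX above will avoid this node
            return value
        beta = min(beta, value)
    return value

# Assumed example: a MAX root over three MIN nodes with three leaves each.
example_tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

if __name__ == "__main__":
    seen = []
    print(alphabeta(example_tree, True, visited=seen))  # prints 3
    print(seen)  # prints [3, 12, 8, 2, 14, 5, 2]
```

On this assumed tree, the second MIN node is abandoned as soon as its first child (2) falls to or below α = 3, so the leaves 4 and 6 are never examined. How much is pruned depends on child ordering: the third MIN node sees 14 and 5 before reaching 2, so all of its leaves are visited.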