Dr. Mark Humphrys

School of Computing. Dublin City University.

Home      Blog      Teaching      Research      Contact

Search:

CA216      CA249      CA318

CA400      CA651      CA668


Adversarial Search

Adversarial search

Adversarial game - Chess, business, trading, war.
You change state, but then don't control next state.
Opponent will change state in a way:

  1. unpredictable
  2. hostile to you
You only get to change (say) every alternate state.


4.4.1 Minimax (exhaustive)

Minimax


Example where we can do exhaustive search

The Game of Nim: At each move the player must divide a pile of tokens into 2 piles of different sizes. The first player who is unable to make a move loses the game.


The entire state space for a 7-token game.
Image courtesy of Ralph Morelli.
See Luger Fig 4.19


There are precisely 3 end states, 2 where person who starts wins, 1 where person who went 2nd wins:

Green - Person who starts wins.
Blue - Person who went 2nd wins.


Can exhaustively search the space.
Problem is figuring out what opponent will do.
MINIMAX - We are MAX - trying to maximise our score / move to best state.
Opponent is MIN - tries to minimise our score / move to worst state for us.


If opponent MIN goes first, then:
Green - MIN wins - state has value 0 for us.
Blue - MAX wins - state has value 1.



Propagate these values back up.
Image courtesy of Ralph Morelli.
Note 3-3-1 is missing its value, as we can see on Luger Fig 4.20
Note 5-1-1 has value 0 not 1, because it is MIN's turn, and they will take us to 4-1-1-1 not to 3-2-1-1.
Whereas 6-1 has value 1 not 0 because it is our turn.

If it is MIN's turn, parent value is the worst of its children.
If it is MAX's turn, parent value is the best of its children.

Note 2nd player always wins!
Person who goes first can never win unless 2nd player makes error.
This is different if no. of tokens different to 7. See discussion.




4.4.2 Minimax (to fixed depth)

If cannot do exhaustive search, do minimax to predefined depth ("minimax to fixed ply depth", "n-ply look-ahead")
Leaves of graph here will probably not terminate in a win/loss. So have to give them heuristic estimate of their value.

Heuristic applied to leaves only, not to higher levels.
Higher levels get their values by propagating leaf values upwards.


Image courtesy of Ralph Morelli.
See Luger Fig 4.21
At MAX level, we will always pick best.
At MIN level, opponent will always pick worst for you.
Here, we (MAX) start. From start state, best we can do in 3 plies is get to state with estimated value 3.



Horizon effect: A state may look promising according to the heuristic (e.g. I capture one of opponent's pieces) but actually lead to disaster later on, beyond the horizon.
e.g. Heuristic says leaf is 9. If you expand children of leaf, you find heuristic = 0 for all of them.
(Or, more likely, heuristic = 0 for one of them, but it is opponent's go, so parent must be 0.)

In general, cannot avoid such effects and such imperfect heuristics (a perfect heuristic would be an oracle telling you exactly how to win the game).
One strategy: Do some special extra deep searching for states that look exceptionally good.



4.4.3 Alpha-beta pruning

Alpha-beta pruning

Way of pruning the space.
Rather than search entire space to fixed depth, we can get rid of branches if we know they won't be picked.



Image courtesy of Ralph Morelli.
See Luger Fig 4.26

  1. See state B: It's our turn, so B will have value at least 5. But this is more than 3, so we know at state A, opponent will never pick B. B sub-tree can be pruned.

    Q. So we prune one child of B? So what? (Are all prunings of just one child?)

  2. At D, opponent's turn. They will choose 0 or something worse. But that means at C we will never choose D over A. D tree can be pruned.
  3. Likewise, E will be 2 or worse. So at C we will never choose E over A. E tree can be pruned.


Pruning is good:
Calculating heuristic evaluation of a state takes time (consider chess).
Good if we can avoid evaluating many states.

It depends on the ordering how much can be pruned. If we search B after searching its sibling (of value 3), we can prune it, but not if it happened to come before.
A lucky ordering and you could prune a lot.
An unlucky ordering and you have to visit all states.

Q. Reverse all orderings.
What prunings will then be done?



Feeds      w2mind.org

On Internet since 1987.