[Figure: tic-tac-toe search tree showing board positions at three levels. Level 0: O to move; Level 1: X to move; Level 2: O to move.]
Figure 2.8: Alpha-beta algorithm applied to part of a game of tic-tac-toe
in Figure 2.8? Simple: we continue to search by making each of X’s possible moves and storing each possible board position for level 2. We keep recursively applying this algorithm until we either reach a maximum search depth or detect a win, loss, or draw in a generated move. We assume that there is a fitness function available that rates a given board position relative to either side. Note that the value of any board position for X is the negative of the value for O.
To make the search more efficient, we maintain values for alpha and beta for each search level. Alpha and beta determine the best possible/worst possible move available at a given level. If we reach a situation like the second position in level 2 where X has won, then we can immediately determine that O’s last move in level 1, which produced this position and allowed X an instant win, is a low-valued move for O but a high-valued move for X. This allows us to immediately “prune” the search tree by ignoring all other possible positions arising from the first O move in level 1. This alpha-beta cutoff, or tree pruning, procedure can save a large percentage of search time, especially if we can set the search order at each level with “probably best” moves considered first.
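The pruning idea described above can be sketched in a few lines. The following is a minimal illustration, not the framework code developed in this chapter: it uses the common textbook "negamax" convention in which alpha is the lower bound and beta the upper bound of the search window, and that convention may differ from the one used in the book's own classes. The class name AlphaBetaSketch, the prune flag, and the nodesVisited counter are all hypothetical names introduced only for this example. The board is assumed to be an array of nine integers holding 1, -1, or 0 for a blank square.

```java
/** Hypothetical illustration: negamax search with alpha-beta cutoffs on tic-tac-toe. */
class AlphaBetaSketch {
    // The eight winning lines on a 3x3 board stored as a 9-element array.
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8}, {0, 3, 6},
        {1, 4, 7}, {2, 5, 8}, {0, 4, 8}, {2, 4, 6}};

    // Counts visited positions; used only to show the effect of pruning.
    static int nodesVisited = 0;

    /** True if 'side' (1 or -1) occupies a complete line. */
    static boolean won(int[] board, int side) {
        for (int[] line : LINES) {
            if (board[line[0]] == side && board[line[1]] == side && board[line[2]] == side) {
                return true;
            }
        }
        return false;
    }

    static boolean boardFull(int[] board) {
        for (int cell : board) {
            if (cell == 0) return false;
        }
        return true;
    }

    /**
     * Value of 'board' for 'side', the player about to move:
     * 1 = forced win, 0 = draw, -1 = forced loss. The result is exact
     * when called with the full window (alpha = -2, beta = 2).
     * The value for one side is the negative of the value for the other.
     */
    static int search(int[] board, int side, int alpha, int beta, boolean prune) {
        nodesVisited++;
        if (won(board, -side)) return -1;  // the opponent's last move won
        if (boardFull(board)) return 0;    // draw
        int best = -2;                     // lower than any real value
        for (int i = 0; i < 9; i++) {
            if (board[i] != 0) continue;
            board[i] = side;               // make the move
            int value = -search(board, -side, -beta, -alpha, prune);
            board[i] = 0;                  // undo the move
            if (value > best) best = value;
            if (best > alpha) alpha = best;
            if (prune && alpha >= beta) break;  // cutoff: ignore remaining moves
        }
        return best;
    }
}
```

Calling AlphaBetaSketch.search(new int[9], 1, -2, 2, true) on an empty board returns 0, since tic-tac-toe is a draw with best play. Running the same call with prune set to false returns the same value but visits more positions, which is the saving that the cutoff provides.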
While tree diagrams as seen in Figure 2.8 quickly get complicated, it is easy for a computer program to generate possible moves, calculate new possible board positions and temporarily store them, and recursively apply the same procedure to the next search level, switching min-max “sides” in the board evaluation. We will see in the next section that it only requires about 100 lines of Java code to implement an abstract class framework for handling the details of performing an alpha-beta enhanced search. The game-specific classes for tic-tac-toe require about 150 additional lines of code to implement; chess requires an additional 450 lines of code.
2.5.2 A Java Framework for Search and Game Playing
The general interface for the Java classes that we will develop in this section was inspired by the Common LISP game-playing framework written by Kevin Knight and described in (Rich, Knight 1991). The abstract class GameSearch contains the code for running a two-player game and performing an alpha-beta search. This class needs to be subclassed to provide the following eight methods:
    public abstract boolean drawnPosition(Position p)
    public abstract boolean wonPosition(Position p, boolean player)
    public abstract float positionEvaluation(Position p, boolean player)
    public abstract void printPosition(Position p)
    public abstract Position[] possibleMoves(Position p, boolean player)
    public abstract Position makeMove(Position p, boolean player, Move move)
    public abstract boolean reachedMaxDepth(Position p, int depth)
    public abstract Move getMove()
The method drawnPosition should return a Boolean true value if the given position evaluates to a draw situation. The method wonPosition should return a true value if the input position is won for the indicated player. By convention, I use a Boolean true value to represent the computer and a Boolean false value to represent the human opponent. The method positionEvaluation returns a position evaluation for a specified board position and player. Note that if we call positionEvaluation switching the player for the same board position, then the value returned is the negative of the value calculated for the opposing player. The method possibleMoves returns an array of objects belonging to the class Position. In an actual game like chess, the position objects will actually belong to a chess-specific refinement of the Position class (e.g., for the chess program developed later in this chapter, the method possibleMoves will return an array of ChessPosition objects). The method makeMove will return a new position object for a specified board position, side to move, and move. The method reachedMaxDepth returns a Boolean true value if the search process has reached a satisfactory depth. For the tic-tac-toe program, the method reachedMaxDepth does not return true unless either side has won the game or the board is full; for the chess program, the method reachedMaxDepth returns true if the search has reached a depth of 4 half moves (this is not the best strategy, but it has the advantage of making the example program short and easy to understand). The method getMove returns an object of a class derived from the class Move (e.g., TicTacToeMove or ChessMove).
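To make the roles of these methods concrete, here is a rough sketch of how a few of them might look for tic-tac-toe. It is not the implementation developed later in the chapter: to keep it self-contained, the methods are written as static methods over a plain int[] board instead of as overrides in a real GameSearch subclass using Position objects. The class name TicTacToeMethodsSketch, the marker helper, and the marker encoding (0 for blank, 1 for the human, -1 for the program) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tic-tac-toe versions of some game-specific methods (not the book's code). */
class TicTacToeMethodsSketch {
    // Assumed marker encoding: 0 = blank, 1 = human, -1 = program.
    static final int BLANK = 0, HUMAN = 1, PROGRAM = -1;
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8}, {0, 3, 6},
        {1, 4, 7}, {2, 5, 8}, {0, 4, 8}, {2, 4, 6}};

    /** true = program (computer), false = human, following the text's convention. */
    static int marker(boolean player) {
        return player ? PROGRAM : HUMAN;
    }

    /** True if 'player' has three markers in a row. */
    static boolean wonPosition(int[] board, boolean player) {
        int m = marker(player);
        for (int[] line : LINES) {
            if (board[line[0]] == m && board[line[1]] == m && board[line[2]] == m) {
                return true;
            }
        }
        return false;
    }

    /** Draw: the board is full and neither side has three in a row. */
    static boolean drawnPosition(int[] board) {
        for (int cell : board) {
            if (cell == BLANK) return false;
        }
        return !wonPosition(board, true) && !wonPosition(board, false);
    }

    /** Crude evaluation: 1 if 'player' has won, -1 if the opponent has, else 0. */
    static float positionEvaluation(int[] board, boolean player) {
        if (wonPosition(board, player)) return 1.0f;
        if (wonPosition(board, !player)) return -1.0f;
        return 0.0f;
    }

    /** One new board per blank square, with the mover's marker placed there. */
    static List<int[]> possibleMoves(int[] board, boolean player) {
        List<int[]> moves = new ArrayList<>();
        for (int i = 0; i < board.length; i++) {
            if (board[i] == BLANK) {
                int[] next = board.clone();
                next[i] = marker(player);
                moves.add(next);
            }
        }
        return moves;
    }

    /** Stop only when the game is over; the depth argument is ignored here. */
    static boolean reachedMaxDepth(int[] board, int depth) {
        return wonPosition(board, true) || wonPosition(board, false) || drawnPosition(board);
    }
}
```

Because positionEvaluation is defined symmetrically, calling it for the same board with the player flag switched yields the negated value, which is the property the search relies on when it alternates sides.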
The GameSearch class implements the following methods to perform game search:
    protected Vector alphaBeta(int depth, Position p, boolean player)
    protected Vector alphaBetaHelper(int depth, Position p, boolean player, float alpha, float beta)
    public void playGame(Position startingPosition, boolean humanPlayFirst)
The method alphaBeta is simple; it calls the helper method alphaBetaHelper with initial search conditions; the method alphaBetaHelper then calls itself recursively. The code for alphaBeta is:
    protected Vector alphaBeta(int depth, Position p, boolean player) {
        // Start the recursive search with the initial alpha and beta values.
        Vector v = alphaBetaHelper(depth, p, player, 1000000.0f, -1000000.0f);
        return v;
    }
It is important to understand what is in the vector returned by the methods alphaBeta and alphaBetaHelper. The first element is a floating point position evaluation from the point of view of the player whose turn it is to move; the remaining values are the “best move” for each side to the last search depth. As an example, if I let the tic-tac-toe program play first, it places a marker at square index 0, then I place my marker in the center of the board at index 4. At this point, to calculate the next computer move, alphaBeta is called and returns the following elements in a vector:
    next element: 0.0
    next element: [-1,0,0,0,1,0,0,0,0,]
    next element: [-1,1,0,0,1,0,0,0,0,]
    next element: [-1,1,0,0,1,0,0,-1,0,]
    next element: [-1,1,0,1,1,0,0,-1,0,]
    next element: [-1,1,0,1,1,-1,0,-1,0,]
    next element: [-1,1,1,1,1,-1,0,-1,0,]
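One way to read such a result is to treat element 0 as the evaluation and then compare each pair of consecutive boards to see which square was filled at each step. The sketch below rebuilds the sample values printed above by hand and unpacks them. It is only an illustration: the class ResultVectorSketch and its methods are hypothetical, and plain int[] arrays stand in for the framework's Position objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

/** Hypothetical helper for reading a result vector shaped like the sample output. */
class ResultVectorSketch {
    /** Element 0 holds the position evaluation as a Float. */
    static float evaluation(Vector<Object> result) {
        return ((Float) result.elementAt(0)).floatValue();
    }

    /** Index of the first square that differs between two boards, or -1 if none. */
    static int changedSquare(int[] before, int[] after) {
        for (int i = 0; i < before.length; i++) {
            if (before[i] != after[i]) return i;
        }
        return -1;
    }

    /** Squares filled in turn along the sequence of boards stored after element 0. */
    static List<Integer> lineOfPlay(Vector<Object> result) {
        List<Integer> squares = new ArrayList<>();
        for (int i = 2; i < result.size(); i++) {
            int[] before = (int[]) result.elementAt(i - 1);
            int[] after = (int[]) result.elementAt(i);
            squares.add(changedSquare(before, after));
        }
        return squares;
    }

    /** The sample values from the text, rebuilt by hand (int[] stands in for Position). */
    static Vector<Object> sampleResult() {
        Vector<Object> v = new Vector<>();
        v.addElement(Float.valueOf(0.0f));
        v.addElement(new int[] {-1, 0, 0, 0, 1, 0, 0, 0, 0});
        v.addElement(new int[] {-1, 1, 0, 0, 1, 0, 0, 0, 0});
        v.addElement(new int[] {-1, 1, 0, 0, 1, 0, 0, -1, 0});
        v.addElement(new int[] {-1, 1, 0, 1, 1, 0, 0, -1, 0});
        v.addElement(new int[] {-1, 1, 0, 1, 1, -1, 0, -1, 0});
        v.addElement(new int[] {-1, 1, 1, 1, 1, -1, 0, -1, 0});
        return v;
    }
}
```

For the sample data, evaluation returns 0.0 and lineOfPlay returns the squares 1, 7, 3, 5, 2, which are the indices that change from one listed board to the next.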