
12.5 ANALYSIS AND EXAMPLES

As is the case with most MIMD algorithms, the running time of procedure MIMD ALPHA BETA is best analyzed empirically. In this section we examine two other aspects of the procedure's performance.

1. One of the design objectives stated in section 12.3.3 is to increase the number of cutoffs as much as possible. How does the parallel implementation perform in this respect compared with the sequential version?

2. What amount of shared memory is needed by the algorithm?

In answering these two questions, we also present some examples that illustrate the behavior of procedure MIMD ALPHA BETA.

12.5.1 Parallel Cutoffs

In order to answer the first question, we shall invoke the distinction made in section 12.2 between shallow and deep cutoffs. In the following discussion we use "sequential search" and "parallel search" to refer to the sequential alpha-beta algorithm and procedure MIMD ALPHA BETA, respectively.

Shallow Cutoffs

1. All shallow cutoffs that would occur in a sequential search due to the (temporary) score backed up to a node from its left offspring are also caused by procedure MIMD ALPHA BETA. This is because all (temporary) scores obtained for the right offspring of the node are compared to the score backed up from its left offspring for a cutoff check before the right subtree traversal continues. An example illustrating this situation is shown in Fig. 12.11. During stage 1 of the parallel algorithm, (i) the left subtree of the root is searched exhaustively, resulting in the root being assigned (temporarily) the final score of its left offspring (that is, 8), and (ii) the two right subtrees are partially searched, resulting in temporary scores of 3 and 5 being assigned to the first and second right offspring of the root, respectively.

Traversing Combinatorial Spaces Chap. 12

Figure 12.11 Shallow cutoff detected by both sequential search and procedure MIMD ALPHA BETA. (Plies are labeled MAXIMIZING and MINIMIZING.)

At the beginning of stage 2 it is determined that the circled sections of the two right subtrees are cut off in exactly the same way as in sequential traversal.

A right subtree that is exhaustively searched during stage 2 without cutoff compares its final score to the temporary score of the parent and changes the parent's score if necessary. Consequently, any cutoff that would have occurred in other right subtrees due to the score originally backed up to the parent from its left offspring will also occur with the new score backed up to the parent from a right offspring.

2. Some shallow cutoffs that would occur in a sequential search can be missed by procedure MIMD ALPHA BETA due to the way in which processes are generated. In the example of Fig. 12.12, a sequential search would cut off the circled portion of the tree. Parallel search misses the cutoff since a process is created to search that subtree before the right subtree of the root completes its search and updates the root's score to 7.

3. Some cutoffs that are missed in a sequential search may occur in procedure MIMD ALPHA BETA due to the way in which processes are generated. A right subtree search that terminates early and causes a change in the parent's score may cause cutoffs in other right subtrees that would not occur in a sequential search. This situation is illustrated in Fig. 12.13, where both right offspring of the root compare their initial scores of 6 and 7, respectively, to the final score of the left offspring, that is, 5. Neither right subtree search is cut off, so processes are generated to continue that search. But since the second right offspring of the root has no further offspring of its own to be examined, its score of 7 is final, and because 7 > 5, that score is backed up to the root. Now, when the terminal node labeled 8 has been scored and the process at the first right offspring of the root performs a cutoff check before proceeding, this time a cutoff occurs. The portion of the tree that is cut off is shown circled in Fig. 12.13; this portion is not cut off during a sequential search.
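The cutoff checks described in situations 1 and 3 can be replayed with the scores quoted from Figs. 12.11 and 12.13. The following is only a minimal sketch: it assumes that the root is a maximizing node, that its offspring are minimizing nodes, and that a tie counts as a cutoff. The function name `cutoff_at_min_node` is hypothetical and is not part of procedure MIMD ALPHA BETA.

```python
def cutoff_at_min_node(temp_score, parent_score):
    """Shallow cutoff test at a minimizing node under a maximizing parent.

    A minimizing node can only lower its score by searching further, so once
    its temporary score is no better than the parent's current score, the
    rest of its subtree cannot change the parent (assumption: ties cut off).
    """
    return temp_score <= parent_score


# Fig. 12.11: the left subtree gives the root a score of 8; after stage 1 the
# two right offspring hold temporary scores of 3 and 5.
fig_12_11_cutoffs = [cutoff_at_min_node(t, 8) for t in (3, 5)]

# Fig. 12.13: the root first holds 5, the final score of its left offspring.
root_score = 5
first_check = [cutoff_at_min_node(t, root_score) for t in (6, 7)]

# The second right offspring is final at 7 and improves the root's score.
root_score = max(root_score, 7)

# The first right offspring scores the terminal node labeled 8, keeps its
# temporary score min(6, 8), and performs the cutoff check again.
second_check = cutoff_at_min_node(min(6, 8), root_score)

print(fig_12_11_cutoffs, first_check, second_check)
```

Under these assumptions both right subtrees of Fig. 12.11 are cut off, while in Fig. 12.13 the first check fails for both right offspring and the later check at the first right offspring succeeds.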


Figure 12.12 Shallow cutoff detected by sequential search and missed by procedure MIMD ALPHA BETA.

Figure 12.13 Shallow cutoff missed in sequential search and discovered by procedure MIMD ALPHA BETA.

Deep Cutoffs. In order for deep cutoffs to occur at a node, scores from searches of other parts of the tree must be available. In a sequential search the scores at each ply are known to every node and are stored in a single global score table. In procedure MIMD ALPHA BETA this is impossible, as stated in the previous section. We now show briefly why this is the case. Assume that a single global score table was used. In Fig. 12.14(a) nodes 1 and 2 are scored simultaneously. Suppose that node 2 receives its score first, as shown in Fig. 12.14(c). This means that the right offspring of the root is backed up the score 9 at ply 1 and then the left offspring is backed up the score 6 (overwriting the score table value of 9 at ply 1). Now when node 3 is scored, the value 8 will not be recorded in the table at ply 1 (since 8 > 6 and we are minimizing at ply 1). Therefore, the value of 8 will not be backed up to the root as it would be in the sequential search. As a result, the best sequence of moves from the root is not returned; another sequence is returned in its place.

Figure 12.14 Using single score table in parallel search leads to incorrect results. Panels: (b) initially; (c) after node 2 is scored; (d) after node 1 is scored; (e) after node 3 is scored.

We conclude from the discussion that having a single score table is impossible in parallel search as it would lead to incorrect results. The alternative adopted by procedure MIMD ALPHA BETA is to assign to each node created its own score table; this, however, means that the information necessary for a deep cutoff to occur is not available in general, as shown in the following example.
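The failure described for Fig. 12.14 can be replayed in a few lines. This is a sketch under stated assumptions, since the figure itself is not reproduced here: the tree shape is inferred from the text (node 1, scoring 6, lies under the root's left offspring; nodes 2 and 3, scoring 9 and 8, lie under its right offspring), the root is maximizing and ply 1 is minimizing. The helper `record_min` and the dictionary `shared_table` are hypothetical names.

```python
import math


def record_min(table, ply, score):
    """Record a backed-up score at a minimizing ply, keeping the smaller one.

    Returns True if the table entry changed.
    """
    if score < table[ply]:
        table[ply] = score
        return True
    return False


# Sequential search: each ply-1 node is finished before the next one starts,
# so the ply-1 entry effectively belongs to one node at a time.
left_offspring = min([6])       # node 1
right_offspring = min([9, 8])   # nodes 2 and 3
sequential_root = max(left_offspring, right_offspring)

# Parallel search with ONE shared table entry for ply 1.
shared_table = {1: math.inf}
record_min(shared_table, 1, 9)                  # node 2 is scored first
record_min(shared_table, 1, 6)                  # node 1 overwrites the 9
node3_recorded = record_min(shared_table, 1, 8) # 8 > 6, so it is dropped
parallel_root = shared_table[1]

print(sequential_root, parallel_root, node3_recorded)
```

With these inferred scores the sequential search backs 8 up to the root, whereas the shared entry leaves the root with 6, which is the kind of incorrect result the discussion describes.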

Example 12.2

Figure 12.15 illustrates a deep cutoff occurring in a sequential search: The circled portion is cut off due to the score of the root's left subtree being available in the score table, while the root's right subtree is searched.

This deep cutoff cannot occur in procedure MIMD ALPHA BETA, as shown in Fig. 12.16: Each node of the right subtree has a score table initialized to the score table of its parent and not containing the score of the root's left offspring.

Figure 12.15 Deep cutoff in sequential search.

12.5.2 Storage Requirements

This section presents an analysis of the storage requirements of procedure MIMD ALPHA BETA. We begin by deriving an upper bound on the amount of storage needed by the procedure under the assumption that an infinite number of processors is available. A more realistic estimate of the storage requirements is then derived by fixing the number of processors used during the search.

Unlimited Processors. Recall that the procedure makes a crucial distinction between the leftmost offspring of a node and the remaining offspring of that node. During stage 1, knowledge about the behavior of the sequential version is used to explore several paths in parallel. During each iteration of stage 2, several subtrees are searched in parallel, each subtree, however, being searched sequentially. This is illustrated in Figs. 12.17 and 12.18.

Figure 12.17 Subtrees traversed during stage 1 and first iteration of stage 2.

Figure 12.18 Subtrees traversed during second iteration of stage 2.

In Fig. 12.17 a uniform tree is shown whose depth and fan-out are both equal to 3. The paths explored in parallel during stage 1 are indicated by bold lines. Calling the root a left node, it is clear that left nodes and their right offspring are given priority by the procedure. Nodes explored during stage 1 will therefore be known as primary nodes, that is, nodes at which a process is created during stage 1 to do the search. Formally:

1. The root is a primary left offspring,

2. a primary left offspring at ply k is the left offspring of a primary (left or right) offspring at ply k - 1, and

3. a primary right offspring at ply k is a right offspring of a primary left offspring at ply k - 1.
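The three rules above can be applied mechanically, ply by ply, to count primary nodes in a uniform tree. The sketch below is illustrative only: the function `count_primary` and the labels `'L'` and `'R'` are hypothetical names, and the tree is represented just by the kind of each primary node, not by actual game positions.

```python
def count_primary(fan_out, depth):
    """Return [(left, right), ...]: primary left/right counts for each ply."""
    counts = []
    frontier = ['L']  # rule 1: the root is a primary left offspring
    for _ply in range(depth + 1):
        counts.append((frontier.count('L'), frontier.count('R')))
        next_frontier = []
        for kind in frontier:
            # Rule 2: every primary node (left or right) contributes its
            # left offspring as a primary left offspring at the next ply.
            next_frontier.append('L')
            # Rule 3: only a primary LEFT offspring contributes its
            # remaining fan_out - 1 offspring as primary right offspring.
            if kind == 'L':
                next_frontier.extend(['R'] * (fan_out - 1))
        frontier = next_frontier
    return counts


# Uniform tree with depth 3 and fan-out 3, as described for Fig. 12.17.
print(count_primary(3, 3))  # [(1, 0), (1, 2), (3, 2), (5, 6)]
```

For depth and fan-out both equal to 3 this gives 5 primary left offspring and 6 primary right offspring at ply 3.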

Following stage 1, the temporary score backed up at node 1 is compared with the ones at nodes 2 and 3; if the former is smaller, then the unexplored portions of the subtrees rooted at 2 and 3 need not be considered at all. Otherwise, one or both of these two portions, shown circled in Fig. 12.17, are searched simultaneously (each sequentially) during the first iteration of stage 2.

When the subtrees rooted at nodes 2 and 3 have been fully searched, the final score backed up at node 1 is compared with the temporary scores at nodes 4 and 5 for a cutoff. If the former is larger, the cutoff check is successful and the unexplored subtrees of 4 and 5 need not be considered. Otherwise, one or both of the subtrees shown circled in Fig. 12.18 are searched simultaneously (each sequentially) during the second iteration of stage 2, and so on.

To study the storage requirements of the procedure, we note that for every node being explored during the search at least one storage location is needed to hold the temporary score of that node. When an explored node is discarded from further consideration, its storage locations are reallocated to another unexplored node that the procedure decides to examine. Therefore, in order to determine how much storage is needed, it is necessary to derive the maximum number of nodes simultaneously explored at any time during the search. This number is precisely the number of primary nodes (during stage 1 where the maximum degree of parallelism occurs).

To see this, note that any tree searched sequentially during stage 2 is rooted at a node that was primary, that is, explored during stage 1. This tree is isomorphic to the subtree rooted at the same primary node. The latter has at least as many primary nodes as a subtree searched in stage 2. Therefore, the number of nodes searched in parallel during stage 2 cannot exceed the number of primary nodes. This latter number is now derived (keeping in mind that an infinite number of processors is available and therefore no bound exists on the number of processes to be created). Let

$L(k)$ = number of primary left offspring at ply $k$ and $R(k)$ = number of primary right offspring at ply $k$.


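In Fig. 12.17, $L(3) = 5$ and $R(3) = 6$. From our definition of primary nodes it follows that for a uniform tree with fan-out $f$ we have

$$L(0) = 1, \quad R(0) = 0,$$

$$L(k) = L(k-1) + R(k-1), \quad R(k) = (f-1)\,L(k-1), \quad k \ge 1.$$

For a uniform tree of depth $d$, the total number of primary nodes is therefore given by

$$S(d) = \sum_{k=0}^{d} \bigl[L(k) + R(k)\bigr],$$

and the storage requirements of the algorithm are clearly of $O(S(d))$. Solving the preceding recurrence, we get

$$L(k) = \frac{x_1^{k+1} - x_2^{k+1}}{\sqrt{4f-3}}$$

and

$$R(k) = \frac{(f-1)\,(x_1^{k} - x_2^{k})}{\sqrt{4f-3}},$$

where

$$x_1 = \frac{1 + \sqrt{4f-3}}{2}, \quad x_2 = \frac{1 - \sqrt{4f-3}}{2}.$$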
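As a numerical check, the counts can be computed two ways. The sketch assumes the recurrence implied by the definition of primary nodes, namely L(k) = L(k-1) + R(k-1) and R(k) = (f-1) L(k-1) with L(0) = 1 and R(0) = 0, and compares it with the closed form obtained from the characteristic equation x^2 - x - (f-1) = 0. The function names are hypothetical.

```python
import math


def primary_counts(f, d):
    """Lists L[0..d] and R[0..d] computed directly from the recurrence."""
    L, R = [1], [0]  # the root is the only primary node at ply 0
    for k in range(1, d + 1):
        L.append(L[k - 1] + R[k - 1])   # one left offspring per primary node
        R.append((f - 1) * L[k - 1])    # f - 1 right offspring per left node
    return L, R


def closed_form_left(f, k):
    """Closed form for L(k) using the two roots of x^2 - x - (f - 1) = 0."""
    root = math.sqrt(4 * f - 3)
    x1, x2 = (1 + root) / 2, (1 - root) / 2
    return (x1 ** (k + 1) - x2 ** (k + 1)) / root


L, R = primary_counts(3, 3)
total = sum(l + r for l, r in zip(L, R))  # total number of primary nodes
print(L, R, total)
print(round(closed_form_left(3, 3)))
```

For fan-out 3 and depth 3 the recurrence yields 5 left and 6 right primary offspring at ply 3, and the closed form agrees after rounding away floating-point error.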

Limited Processors. It is already clear that our assumption about the availability of an unlimited number of processors can now be somewhat relaxed. Indeed, the maximum number of processors the algorithm will ever need to search a uniform tree of depth $d$ will be

$$P(d) = L(d) + R(d).$$

In Fig. 12.17, $P(3) = 11$. Even though $P(d)$ establishes an upper bound on the number of processors that will ever be needed by the algorithm to search a uniform tree, it is still a very large number, exponential in $d$, as one should have expected. In practice, however, only a small number of processors is available and we are led to reconsider our definition of primary nodes. The actual number of primary nodes is in fact determined by the number of processors available. If $N$ processors are used to search a uniform tree of fan-out $f$, then the actual number of primary nodes at level $k$ is equal to

$$\min\{L(k) + R(k),\; N\},$$

and the total number of primary nodes for a tree of depth $d$ is given by the function

$$S_N(d) = \sum_{k=0}^{d} \min\{L(k) + R(k),\; N\}.$$

Under these conditions the storage requirements of the algorithm are clearly of $O(S_N(d))$. Note that $S_N(d)$ equals the unlimited-processor total $S(d)$ when $N = P(d)$, and that for $N \le f$ we have

$$S_N(d) = 1 + Nd.$$
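The effect of capping the number of primary nodes per level at N can be checked numerically. This is a minimal sketch, assuming the recurrence L(k) = L(k-1) + R(k-1), R(k) = (f-1) L(k-1) with L(0) = 1 and R(0) = 0; the function name `primary_total` is hypothetical.

```python
def primary_total(f, d, n):
    """Sum over plies 0..d of min(L(k) + R(k), n) for a uniform tree."""
    total = 0
    left, right = 1, 0  # ply 0: the root is the only primary node
    for _k in range(d + 1):
        total += min(left + right, n)
        # Advance one ply; the right-hand side uses the old value of `left`.
        left, right = left + right, (f - 1) * left
    return total


# With at most f processors the total collapses to 1 + N*d.
print(primary_total(3, 3, 2))   # 7
print(primary_total(3, 3, 3))   # 10
# With N equal to the number of primary nodes at the deepest ply (11 here)
# the cap never binds and the unlimited-processor total is recovered.
print(primary_total(3, 3, 11))  # 20
```

The first two values match 1 + Nd for N = 2 and N = 3, and the last one matches the total of 1 + 3 + 5 + 11 primary nodes for depth and fan-out 3.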