2.4 DESIRABLE PROPERTIES FOR PARALLEL ALGORITHMS

Before we embark on our study of a parallel algorithm for the selection problem, it may be worthwhile to set ourselves some design goals. A number of criteria were described in section 1.3 for evaluating parallel algorithms. In light of these criteria, five important properties that we desire a parallel algorithm to possess are now defined.

2.4.1 Number of Processors

The first two properties concern the number of processors to be used by the algorithm. Let n be the size of the problem to be solved:

(i) p(n) must be smaller than n: No matter how inexpensive computers become, it is unrealistic when designing a parallel algorithm to assume that we have at our disposal more (or even as many) processors as there are items of data. This is particularly true when n is very large. It is therefore important that p(n) be expressible as a sublinear function of n, that is, p(n) = O(n^x) for some constant 0 < x < 1.

(ii) p(n) must be adaptive: In computing in general, and in parallel computing in particular, "appetite comes with eating." The availability of additional computing power always means that larger and more complex problems will be attacked than was possible before. Users of parallel computers will want to push their machines to their limits and beyond. Even if one could afford to have as many processors as data for a particular problem size, it may not be desirable to design an algorithm based on that assumption: A larger problem would render the algorithm totally useless.

Algorithms using a fixed number of processors that is a sublinear function of n [and hence satisfying property (i)], such as log n or n^(1/2), would not be acceptable either, due to their inflexibility. What we need are algorithms that possess the "intelligence" to adapt to the actual number of processors available on the computer being used.
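As a toy illustration of this adaptivity (our own sketch, not from the text), an adaptive algorithm takes the number of processors actually available as a parameter instead of being written for one fixed machine size. The helper name `partition_work` is hypothetical:

```python
# Hypothetical helper (illustration only): divide n items as evenly as
# possible among however many processors are actually available.
def partition_work(data, processors_available):
    n = len(data)
    p = max(1, min(processors_available, n))  # never more chunks than items
    size = -(-n // p)                         # ceiling division: items per chunk
    return [data[i:i + size] for i in range(0, n, size)]

# The same code serves a 3-processor machine and a 64-processor machine;
# only the argument changes.
chunks = partition_work(list(range(10)), 3)   # three chunks of sizes 4, 4, 2
```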

2.4.2 Running Time

The next two properties concern the worst-case running time of the parallel algorithm:

(i) t(n) must be small: Our primary motive for building parallel computers is to speed up the computation process. It is therefore important that the parallel algorithms we design be fast. To be useful, a parallel algorithm should be significantly faster than the best sequential algorithm for the problem at hand.

(ii) t(n) must be adaptive: Ideally, one hopes to have an algorithm whose running time decreases as more processors are used. In practice, it is usually the case that a limit is eventually reached beyond which no speedup is possible regardless of the number of processors used. Nevertheless, it is desirable that t(n) vary inversely with p(n) within the bounds set for p(n).

2.4.3 Cost

Ultimately, we wish to have parallel algorithms for which c(n) = p(n) x t(n) always matches a known lower bound on the number of sequential operations required in the worst case to solve the problem. In other words, a parallel algorithm should be cost optimal.

In subsequent chapters we shall see that meeting the preceding objectives is usually difficult and sometimes impossible. In particular, when a set of processors are linked by an interconnection network, the geometry of the network often imposes limits on what can be accomplished by a parallel algorithm. It is a different story when the algorithm is to run on a shared-memory parallel computer. Here, it is not at all unreasonable to insist on these properties, given how powerful and flexible the model is.

In section 2.6 we describe a parallel algorithm for selecting the kth smallest element of a sequence S = {s_1, s_2, ..., s_n}. The algorithm runs on an EREW SM SIMD computer with N processors, where N < n. The algorithm enjoys all the desirable properties formulated in this section:

(i) It uses p(n) = n^(1-x) processors, where 0 < x < 1. The value of x is obtained from N = n^(1-x). Thus p(n) is sublinear and adaptive.

(ii) It runs in t(n) = O(n^x) time, where x depends on the number of processors available on the parallel computer. The value of x is obtained in (i). Thus t(n) is smaller than the running time of the optimal sequential algorithm described in section 2.3. It is also adaptive: The larger p(n) is, the smaller t(n) is, and vice versa.

(iii) It has a cost of c(n) = n^(1-x) x O(n^x) = O(n), which is optimal in view of the lower bound derived in section 2.2.4.
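The relations in (i)-(iii) can be checked numerically. The following sketch (our own, in Python, purely for illustration) recovers x from N = n^(1-x) and confirms that the cost p(n) x t(n) is linear in n:

```python
import math

# From N = n^(1-x) it follows that x = 1 - log(N) / log(n).
def exponent_from_processors(n, N):
    return 1.0 - math.log(N) / math.log(n)

n = 10**6          # problem size
N = 10**3          # processors available; here N = n^(1/2), so x = 1/2
x = exponent_from_processors(n, N)

p = n ** (1 - x)   # p(n) = n^(1-x), equals N by construction
t = n ** x         # t(n) is O(n^x); we take n^x as the representative term
cost = p * t       # c(n) = n^(1-x) * n^x = n, matching the O(n) lower bound
```

Fewer processors (a smaller N) give a larger x and hence a larger running-time bound n^x; this is exactly the inverse relationship that property (ii) of section 2.4.2 asks for.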

In closing this section we note that all real quantities of the kind just described (namely, n^(1-x) and n^x) should in practice be rounded to a convenient integer, according to our assumption in chapter 1. When dealing with numbers of processors and running times, though, it is important that this rounding be done pessimistically. Thus, the real n^(1-x) representing the number of processors used by an algorithm should be interpreted as ⌊n^(1-x)⌋, that is, rounded down. This ensures that the resulting integer does not exceed the actual number of processors. Conversely, the real n^x representing the worst-case running time of an algorithm should be interpreted as ⌈n^x⌉, that is, rounded up. This guarantees that the resulting integer is not smaller than the true worst-case running time.
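The pessimistic-rounding rule amounts to one floor and one ceiling; the values of n and x below are our own illustrative choices:

```python
import math

n, x = 1000, 0.4
p_real = n ** (1 - x)   # n^(1-x) = 1000^0.6, about 63.1 processors
t_real = n ** x         # n^x    = 1000^0.4, about 15.8 time steps

p = math.floor(p_real)  # round DOWN: never claim more processors than exist
t = math.ceil(t_real)   # round UP: never under-report the worst-case time
```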