1.3.4 Other Measures

A digital computer can be viewed as a large collection of interconnected logical gates. These gates are built using transistors, resistors, and capacitors. In today's computers, gates come in packages called chips: tiny pieces of semiconductor material on which the logical gates and the wires connecting them are fabricated. The number of gates on a chip determines the level of integration used to build the circuit. One particular technology that appears to be linked to future successes in parallel computing is Very Large Scale Integration (VLSI), in which nearly a million logical gates can be located on a single chip. The chip is thus able to house a number of processors, and several such chips may be assembled to build a powerful parallel computer. When evaluating parallel algorithms for VLSI, the following criteria are often used: processor area, wire length, and period of the circuit.

1.3.4.1 Area. If several processors are going to share the "real estate" on a chip, the area needed by the processors and the wires connecting them, as well as the interconnection geometry, determines how many processors the chip will hold. Alternatively, if the number of processors per chip is fixed in advance, then the size of the chip itself is dictated by the total area the processors require. If two algorithms take the same amount of time to solve a problem, then the one occupying less area when implemented as a VLSI circuit is usually preferred. Note that when using area as a measure of the goodness of a parallel algorithm, we are in fact using the criterion in section 1.3.2, namely, the number of processors needed by the algorithm. This is because the area occupied by each processor is normally a constant quantity.
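To make this last observation concrete, here is a minimal sketch, assuming a fixed area per processor and a wiring overhead proportional to the processor count; the function estimate_chip_area and its constants are hypothetical illustrations, not figures from any actual design.

    # A minimal sketch, assuming each processor occupies one unit of area
    # and wiring adds a fixed fraction of overhead per processor.
    # All names and constants here are hypothetical illustrations.
    def estimate_chip_area(num_processors, processor_area=1.0, wire_overhead=0.25):
        """Estimate total chip area in arbitrary units."""
        processor_total = num_processors * processor_area
        wire_total = num_processors * processor_area * wire_overhead
        return processor_total + wire_total

    print(estimate_chip_area(64))   # 80.0
    print(estimate_chip_area(128))  # 160.0 -- area grows linearly with processors

Under these assumptions the area estimate is simply proportional to the number of processors, which is why the two criteria coincide.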

1.3.4.2 Length. This refers to the length of the wires connecting the processors in a given architecture. If the wires have constant length, then it usually means that the architecture is

(i) regular, that is, has a pattern that repeats everywhere, and

(ii) modular, that is, can be built of one (or just a few) repeated modules.

With these properties, extension of the design becomes easy, and the size of a parallel computer can be increased by simply adding more modules. The linear and two-dimensional arrays of section 1.2.3.2 enjoy this property. Also, fixed wire length means that the time taken by a signal to propagate from one processor to another is always constant. If, on the other hand, wire length varies from one section of the network to another, then propagation time becomes a function of that length. The tree, perfect shuffle, and cube interconnections in section 1.2.3.2 are examples of such networks. Again, this measure is not unrelated to the criterion in section 1.3.1, namely, running time, since the duration of a routing step (and hence the algorithm's performance) depends on wire length.
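The contrast between constant and growing wire lengths can be seen in a small sketch. Here processors are placed at integer positions on a line, and each internal node of a binary adder tree is assumed to sit at the midpoint of the leaves below it; this one-dimensional layout is an illustrative assumption, not an optimal VLSI embedding.

    # A minimal sketch, assuming a one-dimensional layout.
    # In a linear array, neighbours sit one unit apart, so every wire
    # has length 1 regardless of the number of processors.
    def linear_array_max_wire(n):
        return 1

    # In a binary tree over n leaves placed on a line (n a power of 2),
    # a node at the midpoint of its subtree connects to children whose
    # midpoints lie a quarter of the subtree's width away, so wire
    # length doubles at every level up the tree.
    def tree_wire_lengths(n):
        lengths = []
        width = n
        while width >= 2:
            lengths.append(width / 4.0)  # root-level wire comes first
            width //= 2
        return lengths

    print(linear_array_max_wire(16))  # 1
    print(tree_wire_lengths(16))      # [4.0, 2.0, 1.0, 0.5]

The longest tree wire grows like n/4 under this layout, so signal propagation time along it grows with the size of the network, whereas the array's propagation time stays fixed.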

1.3.4.3 Period. Assume that several sets of inputs are available and queued for processing by a circuit in a pipeline fashion. Let A_1, A_2, ..., A_n be a sequence of such inputs such that the time to process A_i is the same for all i, 1 ≤ i ≤ n. The period of the circuit is the time elapsed between the moments when the processing of A_i and of A_(i+1) begins, which should be the same for all i, 1 ≤ i < n.

Example 1.20

In example 1.5, several sums were to be computed on a tree-connected SIMD computer. We saw that once the leaves had processed one set of numbers to be added and sent it to their parents for further processing, they were ready to receive the next set. The period of this circuit is therefore 1: one time unit (the time for one addition) separates two consecutive inputs.

Evidently, a small period is a desirable property of a parallel algorithm.

In general, the period is significantly smaller than the time required to completely process one input set. In example 1.20, the period is not only significantly smaller than the O(log n) time units required to compute the sum of n numbers, but also happens to be constant.
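The distinction between the period and the total processing time can be made concrete with a small simulation. The following is a minimal sketch of the pipelined tree addition of examples 1.5 and 1.20, assuming one time unit per level of additions and one new input set entering the leaves at every step; the function pipelined_tree_sums and its scheduling details are illustrative, not taken from the text.

    # A minimal sketch of pipelined addition on a binary tree.
    # Each step, every in-flight input set climbs one level (adjacent
    # partial sums are added pairwise) and a new set enters the leaves.
    def pipelined_tree_sums(input_sets):
        """Return (results, finish) mapping each set's index to its sum
        and to the step at which that sum emerges from the root.
        Assumes all sets share the same power-of-two length, at least 2."""
        in_flight = []          # (index, partial sums at current level)
        results, finish = {}, {}
        step, next_in = 0, 0
        while next_in < len(input_sets) or in_flight:
            if next_in < len(input_sets):
                in_flight.append((next_in, list(input_sets[next_in])))
                next_in += 1
            advanced = []
            for idx, vals in in_flight:
                vals = [vals[j] + vals[j + 1] for j in range(0, len(vals), 2)]
                if len(vals) == 1:
                    results[idx], finish[idx] = vals[0], step
                else:
                    advanced.append((idx, vals))
            in_flight = advanced
            step += 1
        return results, finish

    sets = [[1, 2, 3, 4], [10, 20, 30, 40], [5, 5, 5, 5]]
    results, finish = pipelined_tree_sums(sets)
    print([results[i] for i in range(3)])  # [10, 100, 20]
    print([finish[i] for i in range(3)])   # [1, 2, 3]: outputs one step apart

Consecutive sums emerge one step apart (a period of 1), even though each individual set needs log n steps of additions to travel from the leaves to the root.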

We conclude this section with a remark concerning the time taken by a parallel algorithm to receive its input and, once finished computing, to return its output. Our assumption throughout this book is that all the processors of a parallel computer are capable of reading the available input and producing the available output in parallel. Therefore, such simultaneous input or output operations will be regarded as requiring constant time.

1.4 Expressing Algorithms