1. Given an m-permutation (p_1 p_2 ... p_m), the procedure first checks whether it is updatable.
2. If the permutation is updatable, then its rightmost element is checked first to determine whether it can be incremented; if it can, then the procedure increments it and terminates.
6.3 Generating Permutations in Parallel 151
3. Determining whether p_m can be incremented requires scanning no more than m positions of array u, whose entries indicate which of the integers {1, 2, ..., n} currently appear in (p_1 p_2 ... p_m) and which do not. This scanning also yields the new value of p_m in case the latter can be incremented.
4. If the rightmost element cannot be incremented, then the procedure finds the first element to the left of p_m that is smaller than its right neighbor. This element, call it p_k, is incremented by the procedure and all elements to its right are updated.
5. Determining the new value of p_k requires scanning no more than m positions of u.
6. Updating all positions to the right of p_k requires scanning no more than the first m positions of u.
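The six observations can be made concrete with a short sequential sketch. It is not procedure SEQUENTIAL PERMUTATIONS itself: the function names `next_m_permutation` and `all_m_permutations` are hypothetical, and the sketch simply tries each position from right to left, which finds the same position k as the test described in observation 4. Each scan is restricted to at most m positions of u, as the observations state.

```python
def next_m_permutation(p, n):
    """Return the lexicographic successor of the m-permutation p of {1, ..., n},
    or None if p is the last one. (Hypothetical helper, for illustration only.)"""
    p = list(p)
    m = len(p)
    # u[v] is True when integer v is available, i.e. not in the permutation (index 0 unused).
    u = [True] * (n + 1)
    for v in p:
        u[v] = False
    # Try the rightmost element first, then move left (observations 2 and 4).
    for k in range(m - 1, -1, -1):
        u[p[k]] = True  # the old value at position k becomes available again
        # Observations 3 and 5: scan at most m positions of u for the smallest
        # available integer larger than p[k].
        upper = min(p[k] + m, n)
        candidate = next((v for v in range(p[k] + 1, upper + 1) if u[v]), None)
        if candidate is not None:
            u[candidate] = False
            # Observation 6: refill the positions to the right with the smallest
            # available integers, found among the first m positions of u.
            refill = [v for v in range(1, m + 1) if u[v]][: m - 1 - k]
            return p[:k] + [candidate] + refill
    return None  # no position can be incremented: p is not updatable


def all_m_permutations(n, m):
    """Yield every m-permutation of {1, ..., n} in lexicographic order."""
    p = list(range(1, m + 1))
    while p is not None:
        yield tuple(p)
        p = next_m_permutation(p, n)
```

For instance, `next_m_permutation([5, 1, 4, 3], 5)` yields `[5, 2, 1, 3]`, the update traced in Example 6.1.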
These observations indicate that the algorithm in section 6.2.1 lends itself quite naturally to parallel implementation. Assume that m processors are available on an EREW SM SIMD computer. We give our first parallel m-permutation generator as procedure PARALLEL PERMUTATIONS. The procedure takes n and m as input and produces all ^nP_m m-permutations of {1, 2, ..., n}. It assumes that processor P_i has access to position i of an output register where each successive permutation is produced. There are three arrays in shared memory:
1. p = p_1, p_2, ..., p_m, which stores the current permutation.
2. u = u_1, u_2, ..., u_n, where u_i = 0 if i is in the current permutation (p_1 p_2 ... p_m); otherwise u_i = 1. Initially, u_i = 1 for 1 ≤ i ≤ n.
3. x = x_1, x_2, ..., x_m, used to store intermediate results.
Procedure PARALLEL PERMUTATIONS also invokes the following five procedures for EREW SM SIMD computers:
1. Procedure BROADCAST (a, m, x) studied in chapter 2, which uses an array x_1, x_2, ..., x_m to distribute the value of a to m processors P_1, P_2, ..., P_m.
2. The prefix-sums procedure of chapter 2, applied to (x_1, x_2, ..., x_m), which uses m processors to compute the prefix sums of the array x_1, x_2, ..., x_m and replace x_i with x_1 + x_2 + ... + x_i for 1 ≤ i ≤ m.
3. Procedure MINIMUM (x_1, x_2, ..., x_m) given in what follows, which uses m processors to find the smallest element in the array x_1, x_2, ..., x_m and return it in x_1:
   procedure MINIMUM (x_1, x_2, ..., x_m)
     for j = 0 to (log m − 1) do
       for i = 1 to m in steps of 2^{j+1} do in parallel
         (1) P_i obtains x_{i+2^j} through shared memory
         (2) if x_{i+2^j} < x_i then x_i ← x_{i+2^j}
             end if
       end for
     end for.
1 52 Generating Permutations and Combinations Chap. 6
4. Procedure MAXIMUM (x_1, x_2, ..., x_m), which uses m processors to find the largest element in the array x_1, x_2, ..., x_m and return it in x_1. This procedure is identical to procedure MINIMUM, except that step 2 now reads
         if x_{i+2^j} > x_i then x_i ← x_{i+2^j}
         end if.
5. Procedure PARALLEL SCAN (p_r, n), which is helpful in searching for the next available integer to increment a given element p_r of an m-permutation (p_1 p_2 ... p_m) of {1, 2, ..., n}. Given p_r and n, array u in shared memory is used to determine which of the m integers p_r + 1, p_r + 2, ..., p_r + m satisfy the two conditions of (i) being smaller than or equal to n and (ii) being not present in (p_1 p_2 ... p_m) and are therefore available for incrementing p_r. Array x in shared memory is used to keep track of these integers.
   procedure PARALLEL SCAN (p_r, n)
     for i = 1 to m do in parallel
       if p_r + i ≤ n and u_{p_r+i} = 1
         then x_i ← p_r + i
         else x_i ← ∞
       end if
     end for.
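The helper procedures can be simulated sequentially to see how the rounds unfold. The following is a minimal sketch under stated assumptions: the names `tree_reduce`, `minimum`, `maximum`, and `parallel_scan` are ours, lists are 0-indexed for x and 1-indexed for u (index 0 unused), unavailable integers are marked with infinity, and a bounds check lets m be any positive integer rather than a power of 2. Within one round the positions that are written and those that are read are disjoint, so running the inner loop sequentially gives the same result as running it in parallel.

```python
import math

INF = math.inf  # assumed marker for "no integer available at this offset"


def tree_reduce(x, better):
    """Combine x in place in about log2(m) rounds; the winning entry ends up in x[0]."""
    m, stride = len(x), 1            # stride plays the role of 2^j
    while stride < m:
        # One round: positions 0, 2*stride, 4*stride, ... act in parallel.
        for i in range(0, m, 2 * stride):
            j = i + stride
            if j < m and better(x[j], x[i]):
                x[i] = x[j]
        stride *= 2
    return x[0]


def minimum(x):
    """Sketch of MINIMUM: the smallest entry of x is returned in x[0]."""
    return tree_reduce(x, lambda a, b: a < b)


def maximum(x):
    """Sketch of MAXIMUM: identical except for the direction of the comparison."""
    return tree_reduce(x, lambda a, b: a > b)


def parallel_scan(p_r, n, u, m):
    """Sketch of PARALLEL SCAN: entry i-1 is p_r + i if that integer is at most n
    and available (u[p_r + i] == 1), and INF otherwise."""
    return [p_r + i if p_r + i <= n and u[p_r + i] == 1 else INF
            for i in range(1, m + 1)]
```

With u = (1, 1, 1, 1, 0) and p_r = 1, the scan marks 2, 3, and 4 as available, and `minimum` then selects 2.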
From chapter 2 we know that procedure BROADCAST and the prefix-sums procedure run in O(log m) time. Procedures MINIMUM and MAXIMUM clearly require O(log m) time as well. Procedure PARALLEL SCAN takes constant time. We are now ready to state procedure PARALLEL PERMUTATIONS:
procedure PARALLEL PERMUTATIONS (n, m)
Step 1:
  (1.1) for i = 1 to m do in parallel
          (i) p_i ← i
          (ii) produce p_i as output
        end for
  (1.2) {Initialize array u}
        for i = 1 to ⌈n/m⌉ do
          for j = 1 to m do in parallel
            (i) k ← (i − 1)m + j
            (ii) if k ≤ n then u_k ← 1
                 end if
          end for
        end for.
Step 2:
for t = 1 to (^nP_m − 1) do
  (2.1) for i = 1 to m do in parallel
          u_{p_i} ← 0
        end for
  (2.2) {Check whether rightmost element of (p_1 p_2 ... p_m) can be incremented; that is, if there is a j, p_m < j ≤ n, such that j ≠ p_i for 1 ≤ i ≤ m − 1}
        (i) BROADCAST (p_m, m, x)
        (ii) PARALLEL SCAN (p_m, n)
  (2.3) {If several j satisfying the condition in (2.2) are found, the smallest is assigned to p_m}
        (i) {The smallest of the x_i is found and placed in x_1}
            MINIMUM (x_1, x_2, ..., x_m)
        (ii) if x_1 ≠ ∞
             then (a) u_{p_m} ← 1
                  (b) p_m ← x_1
                  (c) k ← m
                  (d) go to step (2.7)
             end if
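  (2.4) {Rightmost element cannot be incremented; find rightmost element p_k such that p_k < p_{k+1}}
        (i) for i = 1 to m − 1 do in parallel
              if p_i < p_{i+1} then x_i ← i
                               else x_i ← −1
              end if
            end for
        (ii) {The largest of the x_i is found and placed in x_1}
             MAXIMUM (x_1, x_2, ..., x_{m−1})
        (iii) k ← x_1
        (iv) BROADCAST (k, m, x)
        (v) BROADCAST (p_k, m, x)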
  (2.5) {Increment p_k: the smallest available integer larger than p_k is assigned to p_k}
        (i) for i = k to m do in parallel
              u_{p_i} ← 1
            end for
        (ii) PARALLEL SCAN (p_k, n)
        (iii) MINIMUM (x_1, x_2, ..., x_m)
        (iv) p_k ← x_1
        (v) u_{p_k} ← 0
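  (2.6) {Find the smallest m − k integers that are available and assign their values to p_{k+1}, p_{k+2}, ..., p_m, respectively. This reduces to finding the first m − k positions of u that are equal to 1}
        (i) for i = 1 to m do in parallel
              x_i ← u_i
            end for
        (ii) {Compute the prefix sums of x_1, x_2, ..., x_m using the prefix-sums procedure of chapter 2}
        (iii) for i = 1 to m do in parallel
                if x_i ≤ (m − k) and u_i = 1
                then p_{k+x_i} ← i
                end if
              end for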
  (2.7) {Clean up array u and output current m-permutation}
        (i) for i = 1 to k do in parallel
              u_{p_i} ← 1
            end for
        (ii) for i = 1 to m do in parallel
               produce p_i as output
             end for
end for.

Analysis. Step 1 takes O(n/m) time. There are ^nP_m − 1 iterations of step 2, each requiring O(log m) time, as can be easily verified. The overall running time of PARALLEL PERMUTATIONS is therefore O(^nP_m log m). Since m processors are used, the procedure's cost is O(^nP_m m log m).
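The logic of one iteration of step 2 can be checked with a sequential simulation. This is a sketch under stated assumptions, not a parallel implementation: the names `scan`, `update`, and `parallel_permutations` are ours, Python's built-in `min` and `max` stand in for MINIMUM and MAXIMUM, `itertools.accumulate` stands in for the prefix-sums procedure, and array u is rebuilt on every call, which replaces the clean-up of step 2.7.

```python
import math
from itertools import accumulate

INF = math.inf  # assumed marker for "not available"


def scan(p_r, n, u, m):
    """x_i = p_r + i if that integer is at most n and available, else INF (i = 1..m)."""
    return [p_r + i if p_r + i <= n and u[p_r + i] == 1 else INF
            for i in range(1, m + 1)]


def update(p, n):
    """Simulate one iteration of step 2; return the next m-permutation as a list."""
    m = len(p)
    P = [None] + list(p)              # 1-indexed copy: P[i] is p_i
    u = [None] + [1] * n              # u[i] = 1 means integer i is available
    for i in range(1, m + 1):         # (2.1) mark current entries as unavailable
        u[P[i]] = 0
    x = scan(P[m], n, u, m)           # (2.2) candidates for the rightmost element
    if min(x) != INF:                 # (2.3) rightmost element can be incremented
        P[m] = min(x)
        return P[1:]
    # (2.4) rightmost k with p_k < p_{k+1}
    k = max((i for i in range(1, m) if P[i] < P[i + 1]), default=None)
    if k is None:
        raise ValueError("p is the last m-permutation")
    for i in range(k, m + 1):         # (2.5) free p_k, ..., p_m, then increment p_k
        u[P[i]] = 1
    P[k] = min(scan(P[k], n, u, m))
    u[P[k]] = 0
    x = list(accumulate(u[1:m + 1]))  # (2.6) prefix sums of u_1, ..., u_m
    for i in range(1, m + 1):
        if u[i] == 1 and x[i - 1] <= m - k:
            P[k + x[i - 1]] = i
    return P[1:]


def parallel_permutations(n, m):
    """Return all m-permutations of {1, ..., n} in the order they are produced."""
    p = list(range(1, m + 1))         # step 1: the first permutation is (1 2 ... m)
    out = [tuple(p)]
    for _ in range(math.perm(n, m) - 1):
        p = update(p, n)
        out.append(tuple(p))
    return out
```

Calling `parallel_permutations(5, 4)` produces 120 tuples, starting with (1, 2, 3, 4) and ending with (5, 4, 3, 2).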
Example 6.1
We illustrate the working of procedure PARALLEL PERMUTATIONS by showing how a permutation is updated. Let S = {1, 2, 3, 4, 5} and let (p_1 p_2 p_3 p_4) = (5 1 4 3) be a 4-permutation to be updated during an iteration of step 2. In step 2.1 array u is set up as shown in Fig. 6.1. In step 2.2, p_4 = 3 is broadcast to all four processors to check whether any of the integers p_4 + 1, p_4 + 2, p_4 + 3, and p_4 + 4 is available. The processors assign values to array x as shown in Fig. 6.1. This leads to the discovery in step 2.3 that p_4 cannot be incremented. In step 2.4 the processors assign values to array x to indicate the positions of those elements in the permutation that are smaller than their right neighbor, as shown in Fig. 6.1. The largest entry in x is determined to be 2; this means that p_2 is to be incremented and all the positions to its right are to be updated. Now 2 and p_2 are broadcast to the four processors. In step 2.5 array u is updated to indicate that the old values of p_2, p_3, and p_4 are now available, as shown in Fig. 6.1. The processors now check whether any of the integers p_2 + 1, p_2 + 2, p_2 + 3, and p_2 + 4 is available and indicate their findings by setting up array x as shown in Fig. 6.1. The smallest entry in x is found to be 2: p_2 is assigned the value 2 and u_2 is set to 0 as shown in Fig. 6.1. In step 2.6 the smallest two available integers are found by setting array x equal to the first four positions of array u. Now the prefix-sums procedure is applied to array x with the result shown in Fig. 6.1. Since x_1 = 1 ≤ 4 − 2 and u_1 = 1, p_{2+x_1} = p_3 is assigned the value 1. Similarly, since x_3 = 2 ≤ 4 − 2 and u_3 = 1, p_{2+x_3} = p_4 is assigned the value 3. Finally, in step 2.7 positions 2 and 5 of array u are set to 1 and the 4-permutation (p_1 p_2 p_3 p_4) = (5 2 1 3) is produced as output.
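The refilling performed in step 2.6 of the example can be reproduced in a few lines. This is a minimal sketch assuming 1-indexed lists with index 0 unused; the function name `refill_right` is ours, and `itertools.accumulate` stands in for the prefix-sums procedure.

```python
from itertools import accumulate


def refill_right(p, k, u):
    """Sketch of step 2.6: overwrite p_{k+1}, ..., p_m with the smallest available
    integers. Returns the prefix-sum array x (1-indexed, x[0] = 0)."""
    m = len(p) - 1
    x = [0] + list(accumulate(u[1:m + 1]))  # x[i] = u_1 + u_2 + ... + u_i
    for i in range(1, m + 1):               # iteration i corresponds to processor P_i
        if u[i] == 1 and x[i] <= m - k:
            p[k + x[i]] = i                 # i is the x[i]-th smallest available integer
    return x


# State after step 2.5 in Example 6.1: p_2 has been set to 2, while p_3 and p_4
# still hold their old values; u marks 1, 3, and 4 as available.
p = [None, 5, 2, 4, 3]
u = [None, 1, 0, 1, 1, 0]
x = refill_right(p, 2, u)
```

After the call, x holds the prefix sums (1, 1, 2, 3) and p holds (5 2 1 3). The test u_i = 1 matters: x_2 = 1 also satisfies x_2 ≤ 4 − 2, but integer 2 is not available, so processor P_2 assigns nothing.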
Discussion. We conclude this section with two remarks on procedure PARALLEL PERMUTATIONS.
1. The procedure has a cost of O(^nP_m m log m), which is not optimal in view of the O(^nP_m m) operations sufficient to generate all m-permutations of n items by procedure SEQUENTIAL PERMUTATIONS.
2. The procedure is not adaptive as it requires the presence of m processors in order to function properly. As pointed out earlier, it is usually reasonable to assume that the number of processors on a shared memory parallel computer is not only fixed but also smaller than the size of the typical problem.
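To get a feel for the gap mentioned in the first remark, the two growth terms can be evaluated for small inputs. The sketch below drops the constants hidden by the O-notation and assumes logarithms to the base 2; the function name `cost_terms` is ours.

```python
import math


def cost_terms(n, m):
    """Return (number of m-permutations, parallel cost term, sequential operations term)."""
    perms = math.perm(n, m)                # n! / (n - m)!
    parallel = perms * m * math.log2(m)    # growth term of the cost of PARALLEL PERMUTATIONS
    sequential = perms * m                 # growth term of the sequential operation count
    return perms, parallel, sequential
```

For n = 5 and m = 4 this gives 120 permutations and terms of 960 and 480, so the two differ by the factor log m = 2.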
Figure 6.1 Updating permutation using procedure PARALLEL PERMUTATIONS.
The preceding remarks lead naturally to the following questions: