§1.2 Algorithm Details 16
procedure merge(list dest, list source1, list source2)
  while (source1 not empty) and (source2 not empty)
    if (top_of_source1 < top_of_source2)
      put top_of_source1 into dest
    else
      put top_of_source2 into dest
    endif
  endwhile
  while (source1 not empty)
    put top_of_source1 into dest
  endwhile
  while (source2 not empty)
    put top_of_source2 into dest
  endwhile
end
Figure 1.4: Pseudo-code for a simple merge
does not require this destination array. In order to achieve this, it is clear that the algorithm must re-use the space that is
freed by moving elements from the two source lists. We now describe how this can be done. The algorithm has several parts, each of which is described separately.
The principle of infinity padding is used to determine how many elements will be required in each of the nodes at the completion of the merge operation. If the
complete balance operation has been performed at the start of the whole algorithm then the result of this operation must be that the nodes end up with the same number
of elements after the merge-exchange as before. We assume that infinity padding tells us that we require $N'_1$ and $N'_2$ elements to be in nodes $p_1$ and $p_2$ respectively after the merge.
1.2.9 Find-Exact Algorithm
When a node takes part in a merge-exchange with another node, it will need to be able to access the other node’s elements as well as its own. The simplest method for
doing this is for each node to receive a copy of all of the other node’s elements before the merge begins.
A much better approach is to first determine exactly how many elements from each
node will be required to complete the merge, and to transfer only those elements. This reduces the communications cost by minimizing the number of elements transferred,
and at the same time reduces the memory overhead of the merge.
The find-exact algorithm allows each node to determine exactly how many elements are required from another node in order to produce the correct number of elements in a merged list.
When a comparison is made between element $E_{1,A-1}$ and $E_{2,N'_1-A}$ then the result of the comparison determines whether node $p_1$ will require more or less than $A$ of its own elements in the merge. If $E_{1,A-1}$ is greater than $E_{2,N'_1-A}$ then the maximum number of elements that could be required to be kept by node $p_1$ is $A-1$, otherwise the minimum number of elements that could be required to be kept by node $p_1$ is $A$.
The proof that this is correct relies on counting the number of elements that could be less than $E_{1,A-1}$. If $E_{1,A-1}$ is greater than $E_{2,N'_1-A}$ then we know that there are at least $N'_1-A+1$ elements in node $p_2$ that are less than $E_{1,A-1}$. If these are combined with the $A-1$ elements in node $p_1$ that are less than $E_{1,A-1}$, then we have at least $N'_1$ elements less than $E_{1,A-1}$. This means that the number of elements that must be kept by node $p_1$ must be at most $A-1$.
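As a concrete check of this counting argument, the following worked example uses small illustrative values that are not taken from the original text. It assumes elements are indexed from zero within each sorted node (as the subscript $A-1$ suggests) and that node $p_1$ is the node that ends up with the $N'_1$ smallest elements.

```latex
\begin{align*}
p_1 &: (1, 4, 6, 9), \qquad p_2 : (2, 3, 5, 8), \qquad N'_1 = 4, \qquad A = 3 \\
E_{1,A-1} &= E_{1,2} = 6 \;>\; E_{2,N'_1-A} = E_{2,1} = 3 \\
\#\{\text{elements of } p_2 \text{ less than } 6\} &\ge N'_1 - A + 1 = 2 \qquad (\text{namely } 2, 3) \\
\#\{\text{elements of } p_1 \text{ preceding } 6\} &= A - 1 = 2 \qquad (\text{namely } 1, 4) \\
2 + 2 &= 4 = N'_1
\end{align*}
```

At least $N'_1$ elements are smaller than 6, so 6 cannot be among the $N'_1$ smallest, and node $p_1$ keeps at most $A-1 = 2$ of its own elements. In this example the four smallest elements are 1, 2, 3, 4, so node $p_1$ keeps exactly two (1 and 4).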
A similar argument can be used to show that if $E_{1,A-1} \le E_{2,N'_1-A}$ then the number of elements to be kept by node $p_1$ must be at least $A$. Combining these two results leads to an algorithm that can find the exact number of elements required in at most $\log N_1$ steps by successively halving the range of possible values for the number of elements required to be kept by node $p_1$. Once this result is determined it is a simple matter to derive from this the number of elements that must be sent from node $p_1$ to node $p_2$ and from node $p_2$ to node $p_1$.
On a machine with a high message latency, this algorithm could be costly, as a relatively large number of small messages are transferred. The cost of the algorithm can be reduced, but with a penalty of increased message size and algorithm complexity.
To do this the nodes must exchange more than a single element at each step, sending a tree of elements with each leaf of the tree corresponding to a result of the next several possible comparison operations [Zhou et al. 1993]. This method has not been implemented as the practical cost of the find-exact algorithm was found to be very small on
the CM5 and AP1000.
We assume for the remainder of the discussion on the merge-exchange algorithm that after the find-exact algorithm has completed it has been determined that node $p_1$ must retain $L_1$ elements and must transfer $L_2$ elements from node $p_2$.
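The halving search described in this section can be sketched as follows. This is a minimal single-process illustration, not the implementation used on the CM5 or AP1000: both sorted lists are held locally, whereas in the distributed setting each comparison would cost one small message exchange. The function name `find_exact` and its parameters are hypothetical. The sketch assumes zero-based indexing, that node $p_1$ ends up with the `n1_target` ($N'_1$) smallest elements, that `0 <= n1_target <= len(e1) + len(e2)`, and that every element ends up on one of the two nodes.

```python
def find_exact(e1, e2, n1_target):
    """Return (keep, receive, send) from the point of view of node p1.

    e1, e2     -- sorted element lists held by p1 and p2 (zero-indexed)
    n1_target  -- N'_1, the number of elements p1 must hold after the merge
    Assumption: p1 ends up with the n1_target smallest elements overall.
    """
    # Range of possible values for the number of its own elements p1 keeps.
    lo = max(0, n1_target - len(e2))
    hi = min(len(e1), n1_target)
    while lo < hi:
        a = (lo + hi + 1) // 2  # candidate A, always in lo+1 .. hi
        # Compare E_{1,A-1} with E_{2,N'_1-A}. On a real machine the second
        # operand lives on the other node and must be fetched by message.
        if e1[a - 1] > e2[n1_target - a]:
            hi = a - 1          # p1 keeps at most A - 1 of its own elements
        else:
            lo = a              # p1 keeps at least A of its own elements
    keep = lo
    receive = n1_target - keep  # smallest elements of p2 that move to p1
    send = len(e1) - keep       # largest elements of p1 that move to p2
    return keep, receive, send


e1, e2 = [1, 4, 6, 9], [2, 3, 5, 8]
keep, receive, send = find_exact(e1, e2, 4)
merged = sorted(e1[:keep] + e2[:receive])
print(keep, receive, send, merged)  # 2 2 2 [1, 2, 3, 4]
```

Each pass through the loop halves the range `lo..hi`, which matches the logarithmic step bound stated above. Ties are resolved in favour of node $p_1$ keeping its own element, following the "at least $A$" case for $E_{1,A-1} \le E_{2,N'_1-A}$.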
1.2.10 Transferring Elements