18.5.2.2 Fragment-and-Replicate Join

Partitioning is not applicable to all types of joins. For instance, if the join condition is an inequality, such as $r \bowtie_{r.a < s.b} s$, it is possible that all tuples in r join with some tuple in s (and vice versa). Thus, there may be no easy way of partitioning r and s so that tuples in partition $r_i$ join with only tuples in partition $s_i$.
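To make the problem concrete, the following minimal sketch range-partitions two tiny relations and joins only matching partitions under the condition r.a < s.b. The helper `range_partition`, the sample values, and the partition boundaries are all assumptions chosen for illustration; the point is only that pairing the i-th partition of r with the i-th partition of s can miss result tuples.

```python
def range_partition(tuples, key, boundaries):
    """Split tuples into len(boundaries) + 1 ranges on key (hypothetical helper)."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for t in tuples:
        # Index of the range = number of boundaries the key has reached.
        idx = sum(key(t) >= b for b in boundaries)
        parts[idx].append(t)
    return parts

r = [{"a": v} for v in (1, 4, 7)]
s = [{"b": v} for v in (2, 5, 8)]

# Reference result: every pair satisfying r.a < s.b.
full = {(x["a"], y["b"]) for x in r for y in s if x["a"] < y["b"]}

# Naive partitioned join: partition i of r is joined only with partition i of s.
r_parts = range_partition(r, lambda t: t["a"], [3, 6])
s_parts = range_partition(s, lambda t: t["b"], [3, 6])
partitioned = {
    (x["a"], y["b"])
    for r_i, s_i in zip(r_parts, s_parts)
    for x in r_i
    for y in s_i
    if x["a"] < y["b"]
}

# Pairs that span two different partitions are lost.
missed = full - partitioned
```

Here `missed` is non-empty because a tuple of r in a low range also joins with tuples of s in every higher range.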

We can parallelize such joins by using a technique called fragment and replicate.

We first consider a special case of fragment and replicate, asymmetric fragment-and-replicate join, which works as follows:

1. The system partitions one of the relations, say, r. Any partitioning technique can be used on r, including round-robin partitioning.

2. The system replicates the other relation, s, across all the processors.

3. Processor $P_i$ then locally computes the join of $r_i$ with all of s, using any join technique.

The asymmetric fragment-and-replicate scheme appears in Figure 18.3a. If r is already stored by partitioning, there is no need to partition it further in step 1. All that is required is to replicate s across all processors.
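The three steps can be sketched as a single-process simulation. This is only an illustration under stated assumptions: the function names are hypothetical, processors are modeled as loop iterations rather than real parallel workers, round-robin is chosen as the partitioning technique, and a nested-loop join stands in for "any join technique".

```python
def round_robin(tuples, n):
    """Step 1: split a relation into n partitions in round-robin fashion."""
    return [tuples[i::n] for i in range(n)]

def nested_loop_join(r_part, s_all, predicate):
    """Local join at one processor; any join technique could be used instead."""
    return [(x, y) for x in r_part for y in s_all if predicate(x, y)]

def asymmetric_fr_join(r, s, n, predicate):
    r_parts = round_robin(r, n)              # step 1: partition r
    s_copies = [list(s) for _ in range(n)]   # step 2: replicate s to every processor
    result = []
    # Step 3: each simulated processor joins its partition of r with all of s.
    for r_i, s_copy in zip(r_parts, s_copies):
        result.extend(nested_loop_join(r_i, s_copy, predicate))
    return result

r = [1, 4, 7, 10]
s = [2, 5, 8]
out = asymmetric_fr_join(r, s, 3, lambda a, b: a < b)
```

Because every partition of r sees a full copy of s, the union of the local results equals the full join regardless of the join condition.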

The general case of fragment-and-replicate join appears in Figure 18.3b; it works this way: The system partitions relation r into n partitions, $r_0, r_1, \ldots, r_{n-1}$, and partitions s into m partitions, $s_0, s_1, \ldots, s_{m-1}$. As before, any partitioning technique may be used on r and on s. The values of m and n do not need to be equal, but they must be chosen so that there are at least $m \times n$ processors. Asymmetric fragment and replicate is simply a special case of general fragment and replicate, where m = 1. Fragment and replicate reduces the sizes of the relations at each processor, compared to asymmetric fragment and replicate.
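The size reduction can be illustrated with a back-of-the-envelope calculation. The sketch below assumes evenly sized partitions (no skew), and the relation sizes and processor counts are made-up values; `per_processor_tuples` is a hypothetical helper, not part of any system.

```python
def per_processor_tuples(r_size, s_size, n, m=1):
    """Tuples held by one processor, assuming evenly sized partitions.

    Each processor stores one of n partitions of r and one of m partitions
    of s; m = 1 corresponds to the asymmetric scheme, where s is fully replicated.
    """
    return r_size / n + s_size / m

# Both configurations use 16 processors on two relations of 1,000,000 tuples.
asym = per_processor_tuples(1_000_000, 1_000_000, n=16)          # m = 1
general = per_processor_tuples(1_000_000, 1_000_000, n=4, m=4)   # 4 x 4 grid
```

Under these assumptions the asymmetric scheme leaves 1,062,500 tuples at each processor, while the 4 × 4 grid leaves 500,000.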

Figure 18.3 Fragment-and-replicate schemes: (a) asymmetric fragment and replicate; (b) fragment and replicate.

Let the processors be $P_{0,0}, P_{0,1}, \ldots, P_{0,m-1}, P_{1,0}, \ldots, P_{n-1,m-1}$. Processor $P_{i,j}$ computes the join of $r_i$ with $s_j$. Each processor must get those tuples in the partitions on which it works. To accomplish this, the system replicates $r_i$ to processors $P_{i,0}, P_{i,1}, \ldots, P_{i,m-1}$ (which form a row in Figure 18.3b), and replicates $s_j$ to processors $P_{0,j}, P_{1,j}, \ldots, P_{n-1,j}$ (which form a column in Figure 18.3b). Any join technique can be used at each processor $P_{i,j}$.
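The grid assignment can be sketched as follows. This is a sequential simulation under assumptions: the function name is hypothetical, round-robin partitioning is used for both relations, and a nested loop serves as the local join. The dictionary `grid` records which partitions are shipped to each processor, keyed by its (row, column) position.

```python
def fragment_and_replicate_join(r, s, n, m, predicate):
    # Partition r into n pieces and s into m pieces (round-robin here).
    r_parts = [r[i::n] for i in range(n)]
    s_parts = [s[j::m] for j in range(m)]

    # Processor (i, j) receives partition i of r and partition j of s:
    # each r partition is replicated along a row, each s partition along a column.
    grid = {(i, j): (r_parts[i], s_parts[j]) for i in range(n) for j in range(m)}

    result = []
    for (i, j), (r_i, s_j) in grid.items():
        # Local join at processor (i, j); any join technique could be used.
        result.extend((x, y) for x in r_i for y in s_j if predicate(x, y))
    return result, grid

r = [1, 4, 7, 10, 13]
s = [2, 5, 8, 11]
out, grid = fragment_and_replicate_join(r, s, n=3, m=2, predicate=lambda a, b: a < b)
```

Every pair of one tuple from r and one tuple from s meets at exactly one processor, so no result tuple is missed or produced twice.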

Fragment and replicate works with any join condition, since every tuple in r can be tested with every tuple in s. Thus, it can be used where partitioning cannot be.

Fragment and replicate usually has a higher cost than partitioning when both relations are of roughly the same size, since at least one of the relations has to be replicated. However, if one of the relations—say, s —is small, it may be cheaper to replicate s across all processors, rather than to repartition r and s on the join attributes. In such a case, asymmetric fragment and replicate is preferable, even though partitioning could be used.
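The trade-off can be illustrated with a deliberately crude cost model that counts only tuples sent over the network. The model is an assumption made for this sketch, not a formula from the text: it supposes r is already partitioned (so the asymmetric scheme ships only copies of s), that repartitioning moves every tuple of both relations once, and it ignores local join cost. All function names and sizes are hypothetical.

```python
def tuples_shipped_asymmetric(s_size, p):
    """Replicate s to all p processors; r is assumed to stay where it is."""
    return s_size * p

def tuples_shipped_repartition(r_size, s_size):
    """Assumed upper bound: every tuple of r and s moves once to its new partition."""
    return r_size + s_size

def cheaper_strategy(r_size, s_size, p):
    """Pick the scheme that ships fewer tuples under this simplified model."""
    replicate = tuples_shipped_asymmetric(s_size, p)
    repartition = tuples_shipped_repartition(r_size, s_size)
    return "asymmetric fragment and replicate" if replicate < repartition else "partitioning"

small_s = cheaper_strategy(r_size=10_000_000, s_size=1_000, p=32)
equal_sizes = cheaper_strategy(r_size=1_000_000, s_size=1_000_000, p=32)
```

With a small s, replication ships far fewer tuples than repartitioning both relations; with relations of similar size, the comparison reverses. The partitioning option exists only when the join condition allows it, such as an equi-join.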