18.5.2.1 Partitioned Join

For certain kinds of joins, such as equi-joins and natural joins, it is possible to partition the two input relations across the processors and to compute the join locally at each processor. Suppose that we are using n processors and that the relations to be joined are r and s. Partitioned join then works this way: The system partitions the relations r and s each into n partitions, denoted r0, r1, ..., rn−1 and s0, s1, ..., sn−1. The system sends partitions ri and si to processor Pi, where their join is computed locally.

The partitioned join technique works correctly only if the join is an equi-join (for example, r ✶r.A=s.B s) and if we partition r and s by the same partitioning function on their join attributes. The idea of partitioning is exactly the same as that behind the partitioning step of hash join. In a partitioned join, however, there are two different ways of partitioning r and s:

• Range partitioning on the join attributes.
• Hash partitioning on the join attributes.

In either case, the same partitioning function must be used for both relations. For range partitioning, the same partition vector must be used for both relations. For hash partitioning, the same hash function must be used on both relations. Figure 18.2 depicts the partitioning in a partitioned parallel join.
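The requirement that both relations use the same partitioning function can be made concrete with a small sketch. The helper names and the partition vector below are illustrative assumptions, not part of the text; the point is that a tuple of r and a tuple of s with equal join-attribute values must map to the same partition index.

```python
N = 4  # number of processors/partitions (illustrative)

def hash_partition(join_key, n=N):
    """Hash partitioning: partition index computed by hashing the join attribute."""
    return hash(join_key) % n

def range_partition(join_key, vector=(10, 20, 30)):
    """Range partitioning: the SAME partition vector must be used for both
    relations. A vector (v0, v1, v2) defines n = 4 ranges:
    (-inf, v0], (v0, v1], (v1, v2], (v2, +inf)."""
    for i, v in enumerate(vector):
        if join_key <= v:
            return i
    return len(vector)

# A tuple of r with r.A = 17 and a tuple of s with s.B = 17 must agree on
# their destination processor under either scheme:
assert hash_partition(17) == hash_partition(17)
assert range_partition(17) == range_partition(17)  # both land in range (10, 20]
```

If r were partitioned with one vector and s with another, tuples with equal join-attribute values could land on different processors, and the local joins would silently miss results.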

Once the relations are partitioned, we can use any join technique locally at each processor Pi to compute the join of ri and si. For example, hash join, merge join, or nested-loop join could be used. Thus, we can use partitioning to parallelize any join technique.
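The two steps above (partition, then join locally) can be sketched end to end. This is a single-process simulation, with the n processors modeled as n buckets and a simple in-memory hash join standing in for whichever local join technique is chosen; the relation layouts are illustrative assumptions.

```python
from collections import defaultdict

def partitioned_join(r, s, n=3):
    """Simulated partitioned equi-join r joined with s on r.A = s.B."""
    # Step 1: partition both relations with the SAME function on the join keys.
    r_parts = defaultdict(list)
    s_parts = defaultdict(list)
    for a, payload in r:                      # r tuples: (A, payload)
        r_parts[hash(a) % n].append((a, payload))
    for b, payload in s:                      # s tuples: (B, payload)
        s_parts[hash(b) % n].append((b, payload))

    # Step 2: each "processor" Pi joins ri with si locally; here a basic
    # in-memory hash join plays the role of the local join technique.
    result = []
    for i in range(n):
        index = defaultdict(list)
        for a, ra in r_parts[i]:
            index[a].append(ra)
        for b, sb in s_parts[i]:
            for ra in index.get(b, []):
                result.append((b, ra, sb))
    return result

r = [(1, "r1"), (2, "r2"), (2, "r2b")]
s = [(2, "s1"), (3, "s2"), (1, "s3")]
out = sorted(partitioned_join(r, s))
# The union of the local results equals the join computed directly:
assert out == sorted((b, ra, sb) for a, ra in r for b, sb in s if a == b)
```

Because matching tuples always share a partition, the union of the n local join results is exactly the global join; no cross-partition comparisons are ever needed.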

Figure 18.2 Partitioned parallel join.

If one or both of the relations r and s are already partitioned on the join attributes (by either hash partitioning or range partitioning), the work needed for partitioning is reduced greatly. If the relations are not partitioned, or are partitioned on attributes other than the join attributes, then the tuples need to be repartitioned. Each processor Pi reads in the tuples on disk Di, computes for each tuple t the partition j to which t belongs, and sends tuple t to processor Pj. Processor Pj stores the tuples on disk Dj.
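The repartitioning step can be sketched as follows. Each processor's local disk is modeled as a list, and "sending" a tuple to Pj is modeled as appending it to Pj's bucket; the tuple values are illustrative assumptions.

```python
n = 3
local_disks = {                     # tuples currently held by each processor,
    0: [(7, "t1"), (4, "t2")],      # partitioned on some non-join attribute
    1: [(7, "t3")],
    2: [(1, "t4")],
}

# Each processor Pi scans its disk Di, computes the destination partition j
# for each tuple t from the JOIN attribute, and sends t to Pj, which stores
# it on disk Dj (modeled here as an in-memory bucket).
repartitioned = {j: [] for j in range(n)}
for i, tuples in local_disks.items():
    for t in tuples:
        j = hash(t[0]) % n          # partitioning function on the join attribute
        repartitioned[j].append(t)

# After repartitioning, all tuples sharing a join-attribute value (e.g., the
# two tuples with key 7) reside at a single processor.
```

In a real system this step is a shuffle over the interconnect, and its communication cost is a major component of the total join cost, which is why avoiding it when a relation is already suitably partitioned matters.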

We can optimize the join algorithm used locally at each processor to reduce I/O by buffering some of the tuples in memory instead of writing them to disk. We describe such optimizations in Section 18.5.2.3.

Skew presents a special problem when range partitioning is used, since a partition vector that splits one relation of the join into equal-sized partitions may split the other relation into partitions of widely varying size. The partition vector should be such that |ri| + |si| (that is, the sum of the sizes of ri and si) is roughly equal for all i = 0, 1, ..., n − 1. With a good hash function, hash partitioning is likely to have a smaller skew, except when there are many tuples with the same values for the join attributes.
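The balance criterion |ri| + |si| can be illustrated with a small sketch. The key distributions and partition vectors below are contrived assumptions: r is uniform over its join-attribute values, while s is concentrated on the larger values, so a vector that splits r evenly skews the combined load badly.

```python
r_keys = [1, 1, 2, 2, 3, 3, 4, 4]   # r: uniform over join-key values 1..4
s_keys = [3] * 6 + [4] * 6          # s: heavily skewed toward keys 3 and 4

def sizes(vector, keys):
    """Partition sizes under a range-partition vector (keys <= vector[i] -> i)."""
    counts = [0] * (len(vector) + 1)
    for k in keys:
        i = 0
        while i < len(vector) and k > vector[i]:
            i += 1
        counts[i] += 1
    return counts

def combined(vector):
    """|ri| + |si| for each partition i."""
    return [a + b for a, b in zip(sizes(vector, r_keys), sizes(vector, s_keys))]

# Vector (2,) splits r evenly (4 | 4), but the combined load is 4 vs. 16.
# Vector (3,) splits r unevenly (6 | 2), yet balances |ri| + |si|: 12 vs. 8.
assert combined((2,)) == [4, 16]
assert combined((3,)) == [12, 8]
```

The better vector deliberately gives r unequal shares so that the sum of work per processor, not the share of either relation alone, is balanced.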