
To give a fairer comparison I again replaced the quicksort algorithm in my implementation with the American flag in-place radix sort and scaled for AP1000 to CM5 differences. This gave an equivalent sorting time of 3.9 seconds.

1.5 Conclusions

The parallel sorting algorithm presented in this chapter arises from a very simple analysis following a decision to split the algorithm into a local sorting phase followed by a parallel merge. The use of a fast but incorrect initial merge phase followed by a simple cleanup algorithm leads to a very high degree of parallel efficiency. As far as I am aware this is the only efficient, comparison-based distributed memory parallel sorting algorithm that requires less than order N temporary storage. These features, combined with the competitive sorting performance, make the algorithm suitable for a parallel algorithms library.

Chapter 2 External Parallel Sorting

This chapter considers the problem of external parallel sorting. External sorting involves sorting more data than can fit in the combined memory of all the processors on the machine. This involves using disk as a form of secondary memory, and it presents some very interesting challenges because of the huge difference between the bandwidths and latencies of memory and disk systems.

2.1 Parallel Disk Systems

There is a large variety of different disk IO systems available for parallel computers. In some systems each CPU has its own private disk and in others there is a central bank of disks accessible from all CPUs.

For this work I assumed the following setup:

• a distributed memory parallel computer
• a parallel filesystem where all CPUs can access all the data

The particular setup I used was a 128 cell AP1000 with 32 disks and the HiDIOS parallel filesystem [Tridgell and Walsh 1996].

The requirement of a filesystem where all CPUs can access all the data isn't as much of an impediment as it might first seem. Almost any distributed filesystem can easily be given this property using a remote access protocol coupled to the message passing library¹.

¹ The AP1000 has only local disk access in hardware. Remote disk access is provided by the operating system.
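To illustrate how a remote access protocol can give every CPU access to all the data, the following is a minimal sketch that simulates the idea within a single process. It is not the HiDIOS implementation. The class names (Cell, SharedView), the round-robin striping of blocks over disks, and the placement of one disk on every fourth cell are all assumptions made for illustration, and a direct method call stands in for the request and reply messages that a real message passing library would carry.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """One processor. Only cells with a disk ever hold blocks (hypothetical model)."""
    rank: int
    local_blocks: dict = field(default_factory=dict)  # block number -> bytes
    requests_served: int = 0

    def handle_read(self, block_no):
        # Runs on the owning cell when a remote read request arrives.
        self.requests_served += 1
        return self.local_blocks[block_no]


class SharedView:
    """Lets any cell read any block by forwarding requests to the owning cell."""

    def __init__(self, num_cells, num_disks):
        self.cells = [Cell(rank) for rank in range(num_cells)]
        self.num_disks = num_disks
        # Assumption: disks are attached to evenly spaced cells.
        self.stride = num_cells // num_disks

    def owner(self, block_no):
        # Assumption: blocks are striped round-robin over the disks.
        disk = block_no % self.num_disks
        return disk * self.stride

    def write(self, block_no, data):
        self.cells[self.owner(block_no)].local_blocks[block_no] = data

    def read(self, from_rank, block_no):
        owner = self.owner(block_no)
        if owner == from_rank:
            # The block is on this cell's own disk: plain local access.
            return self.cells[owner].local_blocks[block_no]
        # Remote case: this call stands in for sending a request message
        # to the owner and receiving the block in the reply.
        return self.cells[owner].handle_read(block_no)


# Usage: 128 cells and 32 disks, matching the machine described above.
fs = SharedView(num_cells=128, num_disks=32)
fs.write(5, b"block five")
data_seen_by_cell_0 = fs.read(from_rank=0, block_no=5)
```

In this sketch the caller of read never needs to know where a block lives, which is the property the sorting algorithm relies on. The cost of a remote read is hidden inside the forwarding step.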