    if (arr[lo].order > middle.order) ...
This last quicksort gives a further improvement in time (see Table 9-1). Overall, this tuning example shows that by avoiding casts, that is, by implementing a standard sort algorithm and comparison method specifically for a particular class, you can more than double the speed of the sort with little effort. For comparison, I have included in Table 9-1 the timings for using the Arrays.sort() method, applied to the same randomized list of Sortable objects used in the example. The Arrays.sort() method uses a merge sort that performs better on a partially sorted list. Merge sort was chosen for Arrays.sort() because, although quicksort provides better performance on average, the merge sort provides sort stability. A stable sort does not alter the order of elements that are equal based on the comparison method used.[1]

[1] The standard quicksort algorithm also has very bad worst-case performance. There are quicksort variations that improve the worst-case performance.
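To make the stability property concrete, here is a minimal sketch that sorts records on one field and shows that records with equal keys keep their original relative order. The StableSortDemo and Rec names are hypothetical, introduced only for this illustration; the sketch relies on Arrays.sort(Object[], Comparator) being documented as a stable sort.

```java
import java.util.Arrays;
import java.util.Comparator;

class StableSortDemo {
    // Hypothetical record: 'order' is the sort key, 'label' marks the input position.
    static final class Rec {
        final int order;
        final String label;
        Rec(int order, String label) { this.order = order; this.label = label; }
    }

    // Sorts a copy of the input on 'order' only and returns the labels in sorted sequence.
    static String sortedLabels(Rec[] input) {
        Rec[] copy = input.clone();
        Arrays.sort(copy, new Comparator<Rec>() {
            public int compare(Rec a, Rec b) {
                return Integer.compare(a.order, b.order);
            }
        });
        StringBuilder sb = new StringBuilder();
        for (Rec r : copy) sb.append(r.label);
        return sb.toString();
    }

    public static void main(String[] args) {
        Rec[] recs = {
            new Rec(2, "a"), new Rec(1, "b"), new Rec(2, "c"), new Rec(1, "d")
        };
        // Equal keys keep their input order: b stays before d, a stays before c.
        System.out.println(sortedLabels(recs)); // prints "bdac"
    }
}
```

An unstable sort would be free to return "dbac" or "bdca" for the same input, which matters whenever the array has already been ordered on a secondary field.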
For more specialized and optimized sorts, there are books (including Java-specific ones) covering various sort algorithms, and a variety of sort implementations are available on the Web. The computer literature is full of articles providing improved sorting algorithms for specific types of data, and you may need to run a search to find specialized sorts for your particular application. A good place to start is with the classic reference The Art of Computer Programming by Donald Knuth.
In the case of nonarray elements such as linked-list structures, a recursive merge sort is the best sorting algorithm and can be faster than a quicksort on arrays with the same dataset. Note that the JDK Collections.sort() methods are suboptimal for linked lists. The Collections.sort(List) method converts the list into an array before sorting it, which is the wrong strategy to sort linked lists, as shown in an article by John Boyer.[2] Boyer also shows that a binary search on a linked list is significantly better than a linear search if the cost of comparisons is more than about two or three node traversals, as is typically the case.

[2] "Sorting and Searching Linked Lists in Java," Dr. Dobb's Journal, May 1998.
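As a rough illustration of the recursive merge sort mentioned for linked lists, the following sketch sorts a singly linked list by relinking nodes, without copying the elements into an array. It is not Boyer's code; the LinkedMergeSort and Node names and the int payload are assumptions made to keep the example short.

```java
class LinkedMergeSort {
    // Hypothetical minimal singly linked node, used only for this illustration.
    static final class Node {
        final int value;
        Node next;
        Node(int value, Node next) { this.value = value; this.next = next; }
    }

    // Recursively sorts the list starting at head and returns the new head.
    static Node mergeSort(Node head) {
        if (head == null || head.next == null) return head;
        // Find the midpoint with slow/fast pointers, then cut the list in two.
        Node slow = head, fast = head.next;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
        }
        Node second = slow.next;
        slow.next = null;
        return merge(mergeSort(head), mergeSort(second));
    }

    // Merges two sorted lists by relinking; taking from 'a' on ties keeps the sort stable.
    static Node merge(Node a, Node b) {
        Node dummy = new Node(0, null);
        Node tail = dummy;
        while (a != null && b != null) {
            if (a.value <= b.value) { tail.next = a; a = a.next; }
            else { tail.next = b; b = b.next; }
            tail = tail.next;
        }
        tail.next = (a != null) ? a : b;
        return dummy.next;
    }

    // Helper: builds a list from an array, preserving the array order.
    static Node fromArray(int[] values) {
        Node head = null;
        for (int i = values.length - 1; i >= 0; i--) head = new Node(values[i], head);
        return head;
    }

    // Helper: renders the list values separated by single spaces.
    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(n.value);
        }
        return sb.toString();
    }
}
```

Every step here is a sequential traversal or a pointer update, which is why the approach suits linked lists: no step needs indexed access.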
If you need your sort algorithm to run faster, optimizing the comparisons in the sort method is a good place to start. This can be done in several ways:
• Eliminating casts by specifying data types more precisely.
• Modifying the comparison algorithm to be quicker.
• Replacing the objects with wrappers that compare faster (e.g., java.text.CollationKeys). These are best used when the comparison method requires a calculation for each object being compared, and that calculation can be cached.
• Eliminating method calls by accessing fields directly.
• Partially presorting the array with a faster partial sort, followed by the full sort.
Only when the performance is still short of your target do you need to start looking for alternatives. Several of the techniques listed here have been applied in the earlier example, and also in the internationalized string sorting example in Section 5.6.
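The wrapper technique from the list above can be sketched with java.text.CollationKey: each string is converted to a key once, the keys are sorted, and the original strings are recovered afterwards. The class and method names below are hypothetical; the Collator and CollationKey calls are the standard JDK API.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class CollationKeySortDemo {
    // Sorts strings in locale-sensitive order, computing each collation key only once.
    static String[] sortWithKeys(String[] words, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        CollationKey[] keys = new CollationKey[words.length];
        for (int i = 0; i < words.length; i++) {
            keys[i] = collator.getCollationKey(words[i]); // the cached calculation
        }
        // CollationKey is Comparable, so the keys can be sorted directly.
        Arrays.sort(keys);
        String[] sorted = new String[words.length];
        for (int i = 0; i < keys.length; i++) {
            sorted[i] = keys[i].getSourceString(); // unwrap back to the original string
        }
        return sorted;
    }

    public static void main(String[] args) {
        String[] result = sortWithKeys(new String[]{"pear", "Apple", "banana"}, Locale.ENGLISH);
        System.out.println(Arrays.toString(result)); // prints "[Apple, banana, pear]"
    }
}
```

The trade-off is one extra object per element, so the approach pays off mainly when each element takes part in many comparisons.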
9.2 An Efficient Sorting Framework
The sorting methods provided by the JDK are perfectly adequate for most situations, and when they fall short, the techniques illustrated in the previous section often speed things up as much as is required. However, if you work on a project where varied and flexible sorting capabilities are needed, sorting is one area of performance tuning where it is sensible to create a framework early during the development cycle. A good sorting framework should allow you to change sorting-algorithm and comparison-ordering methods in a generic way, without having to change too much in the application.
Providing support for arbitrary sorting algorithms is straightforward: just use sorting interfaces. There needs to be a sorting interface for each type of object that can be sorted. Arrays and
collection objects should be supported by any sorting framework, along with any other objects that are specific to your application. Here are two interfaces that define sorting objects for arrays and
collections:
    import java.util.Comparator;
    import java.util.Collection;

    public interface ArraySorter
    {
      public void sort(Comparator comparator, Object[] arr);
      public void sort(Comparator comparator, Object[] arr,
                       int startIndex, int length);
      public void sortInto(Comparator comparator, Object[] source,
                       int sourceStartIndex, int length,
                       Object[] target, int targetStartIndex);
    }

    public interface CollectionSorter
    {
      public Object[] sort(Comparator comparator, Collection c);
      public void sortInto(Comparator comparator, Collection c,
                       Object[] target, int targetStartIndex);
    }
Individual classes that implement the interfaces are normally stateless, and hence implicitly thread-safe. This allows you to specify singleton sorting objects for use by other objects. For example:
    public class ArrayQuickSorter implements ArraySorter
    {
      public static final ArrayQuickSorter SINGLETON = new ArrayQuickSorter();

      //protect the constructor so that external classes are forced
      //to use the singleton
      protected ArrayQuickSorter(){}

      public void sortInto(Comparator comparator, Object[] source,
                       int sourceStartIndex, int length,
                       Object[] target, int targetStartIndex)
      {
        //Only need the target - quicksort sorts in place.
        if ( !(source == target && sourceStartIndex == targetStartIndex) )
          System.arraycopy(source, sourceStartIndex, target,
                           targetStartIndex, length);
        this.sort(comparator, target, targetStartIndex, length);
      }

      public void sort(Comparator comparator, Object[] arr)
      {
        this.sort(comparator, arr, 0, arr.length);
      }

      public void sort(Comparator comparator, Object[] arr,
                       int startIndex, int length)
      {
        //quicksort algorithm implementation using
        //Comparator.compare(Object, Object)
        ...
      }
    }
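To show how a second algorithm might plug into the same structure, here is a minimal sketch of an insertion sorter written in the singleton style used above. MiniArraySorter is a hypothetical, cut-down stand-in for ArraySorter (one method only) so that the block is self-contained, and the body of ArrayInsertionSorter is my assumption of what such a class could look like, not a listing from the framework.

```java
import java.util.Comparator;

// Hypothetical one-method stand-in for the ArraySorter interface.
interface MiniArraySorter {
    void sort(Comparator<Object> comparator, Object[] arr, int startIndex, int length);
}

// Assumed implementation: an insertion sort, which does well on nearly sorted input.
class ArrayInsertionSorter implements MiniArraySorter {
    static final ArrayInsertionSorter SINGLETON = new ArrayInsertionSorter();

    // Protected constructor so that callers go through the singleton.
    protected ArrayInsertionSorter() {}

    public void sort(Comparator<Object> comparator, Object[] arr, int startIndex, int length) {
        for (int i = startIndex + 1; i < startIndex + length; i++) {
            Object current = arr[i];
            int j = i - 1;
            // Shift larger elements one slot right until current fits.
            while (j >= startIndex && comparator.compare(arr[j], current) > 0) {
                arr[j + 1] = arr[j];
                j--;
            }
            arr[j + 1] = current;
        }
    }
}

class SorterSwapDemo {
    // Client code depends only on the interface; changing algorithm means changing this line.
    static MiniArraySorter sorter = ArrayInsertionSorter.SINGLETON;

    public static void main(String[] args) {
        Object[] data = {3, 1, 2};
        sorter.sort((a, b) -> ((Integer) a).compareTo((Integer) b), data, 0, data.length);
        System.out.println(java.util.Arrays.toString(data)); // prints "[1, 2, 3]"
    }
}
```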
This framework allows you to change the sort algorithm simply by changing the sort object you use. For example, if you use a quicksort but realize that your array is already partially sorted, simply change the sorter instance from ArrayQuickSorter.SINGLETON to ArrayInsertionSorter.SINGLETON.

However, we are only halfway to an efficient framework. Although the overall sorting structure is here, you have not supported generic optimizations such as optimized comparison wrappers (e.g., as with java.text.CollationKey). For generic support, you need the Comparator interface to have an additional method that checks whether it supports optimized comparison wrappers (which I will now call ComparisonKeys). Unfortunately, you cannot add a method to the Comparator interface, so you have to use the following subinterface:
    public interface KeyedComparator extends Comparator
    {
      public boolean hasComparisonKeys();
      public ComparisonKey getComparisonKey(Object o);
    }

    public interface ComparisonKey
    {
      public int compareTo(ComparisonKey target);
      public Object getSource();
    }
Now you need to support this addition to the framework in each sorter object. Since you don't want to change all your sorter-object implementations again and again, it's better to find any further optimizations now. One optimization is a sort that avoids any method call for the comparison. You can support that with a specific ComparisonKey class:
    public class IntegerComparisonKey implements ComparisonKey
    {
      public Object source;
      public int order;
      public IntegerComparisonKey(Object source, int order) {
        this.source = source;
        this.order = order;
      }
      public int compareTo(ComparisonKey target){
        return order - ((IntegerComparisonKey) target).order;
      }
      public Object getSource() {return source;}
    }
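One point to keep in mind with the subtraction idiom in compareTo: it is only safe when the two order values cannot differ by more than Integer.MAX_VALUE, because otherwise the subtraction overflows and reports the wrong sign. The following sketch (the class and method names are hypothetical) contrasts the idiom with Integer.compare, which is available from Java 7 onward.

```java
class OrderCompareDemo {
    // The subtraction idiom: one arithmetic operation, but it can overflow.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    // Overflow-safe alternative based on explicit comparison.
    static int byComparison(int a, int b) {
        return Integer.compare(a, b);
    }

    public static void main(String[] args) {
        int small = Integer.MIN_VALUE;
        int large = 1;
        // small < large, so a correct comparison must be negative.
        System.out.println(bySubtraction(small, large) > 0); // prints "true": wrong sign after overflow
        System.out.println(byComparison(small, large) < 0);  // prints "true": correct ordering
        // For small non-negative orders the two agree in sign.
        System.out.println(bySubtraction(3, 7) < 0 && byComparison(3, 7) < 0); // prints "true"
    }
}
```

If your ordering values are known to stay within a modest range (array indexes, small ranks), the subtraction remains a reasonable choice.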
Now you can reimplement your sorter class to handle these special optimized cases. Only the method that actually implemented the sort needs to change:
    public class ArrayQuickSorter implements ArraySorter
    {
      //everything else as previously
      ...

      public void sort(Comparator comparator, Object[] arr,
                       int startIndex, int length)
      {
        //If the comparator is part of the extended framework, handle
        //the special case where it recommends using comparison keys
        if (comparator instanceof KeyedComparator &&
              ((KeyedComparator) comparator).hasComparisonKeys())
        {
          //wrap the objects in the ComparisonKeys
          //but if the ComparisonKey is the special case of
          //IntegerComparisonKey, handle that specially
          KeyedComparator comparer = (KeyedComparator) comparator;
          ComparisonKey first = comparer.getComparisonKey(arr[startIndex]);
          if (first instanceof IntegerComparisonKey)
          {
            //wrap in IntegerComparisonKeys
            IntegerComparisonKey[] iarr = new IntegerComparisonKey[length];
            iarr[0] = (IntegerComparisonKey) first;
            for(int j = length-1, i = startIndex+length-1; j > 0; i--, j--)
              iarr[j] = (IntegerComparisonKey) comparer.getComparisonKey(arr[i]);
            //sort using the optimized sort for IntegerComparisonKeys
            sort_intkeys(iarr, 0, length);
            //and unwrap
            for(int j = length-1, i = startIndex+length-1; j >= 0; i--, j--)
              arr[i] = iarr[j].source;
          }
          else
          {
            //wrap in ComparisonKeys
            ComparisonKey[] karr = new ComparisonKey[length];
            karr[0] = first;
            for(int j = length-1, i = startIndex+length-1; j > 0; i--, j--)
              karr[j] = comparer.getComparisonKey(arr[i]);
            //sort using the optimized sort for ComparisonKeys
            sort_keys(karr, 0, length);
            //and unwrap
            for(int j = length-1, i = startIndex+length-1; j >= 0; i--, j--)
              arr[i] = karr[j].getSource();
          }
        }
        else
          //just use the original algorithm
          sort_comparator(comparator, arr, startIndex, length);
      }

      public void sort_comparator(Comparator comparator, Object[] arr,
                       int startIndex, int length)
      {
        //quicksort algorithm implementation using
        //Comparator.compare(Object, Object)
        ...
      }

      public void sort_keys(ComparisonKey[] arr, int startIndex, int length)
      {
        //quicksort algorithm implementation using
        //ComparisonKey.compareTo(ComparisonKey)
        ...
      }

      public void sort_intkeys(IntegerComparisonKey[] arr,
                       int startIndex, int length)
      {
        //quicksort algorithm implementation comparing key order directly
        //using access to the IntegerComparisonKey.order field
        //i.e., if (arr[i].order > arr[j].order)
        ...
      }
    }
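The wrap, sort, and unwrap sequence at the heart of this method can be tried in isolation with the following sketch. It assumes a nested copy of the ComparisonKey interface and a hypothetical LowerCaseKey class that caches a case-folded string; Arrays.sort stands in for the framework's sort_keys quicksort, whose body is elided above.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Locale;

class KeyedSortDemo {
    // Mirrors the ComparisonKey interface from the text.
    interface ComparisonKey {
        int compareTo(ComparisonKey target);
        Object getSource();
    }

    // Hypothetical key: the folded string is computed once per object,
    // not once per comparison.
    static final class LowerCaseKey implements ComparisonKey {
        final String source;
        final String folded;
        LowerCaseKey(String source) {
            this.source = source;
            this.folded = source.toLowerCase(Locale.ROOT);
        }
        public int compareTo(ComparisonKey target) {
            return folded.compareTo(((LowerCaseKey) target).folded);
        }
        public Object getSource() { return source; }
    }

    // Wraps the range in keys, sorts the keys, then unwraps into the same range.
    static void sortIgnoringCase(String[] arr, int startIndex, int length) {
        ComparisonKey[] keys = new ComparisonKey[length];
        for (int j = 0, i = startIndex; j < length; i++, j++) {
            keys[j] = new LowerCaseKey(arr[i]);
        }
        // Stand-in for sort_keys: any sort driven by ComparisonKey.compareTo would do.
        Arrays.sort(keys, new Comparator<ComparisonKey>() {
            public int compare(ComparisonKey a, ComparisonKey b) {
                return a.compareTo(b);
            }
        });
        for (int j = 0, i = startIndex; j < length; i++, j++) {
            arr[i] = (String) keys[j].getSource();
        }
    }

    public static void main(String[] args) {
        String[] words = {"zeta", "Beta", "alpha", "Gamma", "OMEGA"};
        sortIgnoringCase(words, 1, 3); // sorts only indexes 1..3
        System.out.println(Arrays.toString(words)); // prints "[zeta, alpha, Beta, Gamma, OMEGA]"
    }
}
```

Note that the key array is indexed from 0 while the source array is indexed from startIndex; keeping the two index variables separate avoids off-by-startIndex mistakes.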
Although the special cases mean that you have to implement the same algorithm three times with slight changes to data type and comparison method, this is the kind of tradeoff you often have to make for performance optimizations. The maintenance impact is limited by having all implementations in one class, and once you've debugged the sort algorithm, you are unlikely to need to change it.
This framework now supports:
• An easy way to change the sorting algorithm being used at any specific point of the application.
• An easy way to change the pair-wise comparison method, by changing the Comparator object.
• Automatic support for comparison key objects. Comparison keys are optimal to use in sorts where the comparison method requires a calculation for each object being compared, and that calculation could be cached.
• An optimized integer key comparison class, which doesn't require method calls when used for sorting.
This outline should provide a good start to building an efficient sorting framework. Many further generic optimizations are possible, such as supporting a LongComparisonKey class and other special classes appropriate to your application. The point is that the framework should handle optimizations automatically. The most the application builder should do is decide on the appropriate Comparator or ComparisonKey class to build for the object to be sorted.

The last version of our framework supports the fastest sorting implementation from the previous section (the last implementation with no casts and direct access to the ordering field). Unfortunately, the cost of creating an IntegerComparisonKey object for each object being sorted is significant enough to eliminate the speedup from getting rid of the casts. It's worth looking at ways to reduce the cost of object creations for comparison keys. This cost can be reduced using the object-to-array mapping technique from Chapter 4: the array of IntegerComparisonKeys is changed to a pair of Object and int arrays. By adding another interface you can support the needed mapping:
    interface RawIntComparator
      //extends not actually necessary, but logically applies
      extends KeyedComparator
    {
      public void getComparisonKey(Object o, int[] orders, int idx);
    }
For the example Sortable class that was defined earlier, you can implement a Comparator class:
    public class SortableComparator implements RawIntComparator
    {
      //Required for Comparator interface
      public int compare(Object o1, Object o2){
        return ((Sortable) o1).order - ((Sortable) o2).order;}

      //Required for KeyedComparator interface
      public boolean hasComparisonKeys(){return true;}
      public ComparisonKey getComparisonKey(Object o){
        return new IntegerComparisonKey(o, ((Sortable) o).order);}

      //Required for RawIntComparator interface
      public void getComparisonKey(Object s, int[] orders, int index){
        orders[index] = ((Sortable) s).order;}
    }
Then the logic to support the RawIntComparator in the sorting class is:
    public class ArrayQuickSorter implements ArraySorter
    {
      //everything else as previously except rename the
      //previously defined sort(Comparator, Object[], int, int)
      //method as previous_sort
      ...

      public void sort(Comparator comparator, Object[] arr,
                       int startIndex, int length)
      {
        //support RawIntComparator types
        if (comparator instanceof RawIntComparator)
        {
          RawIntComparator comparer = (RawIntComparator) comparator;
          Object[] sources = new Object[length];
          int[] orders = new int[length];
          for(int j = length-1, i = startIndex+length-1; j >= 0; i--, j--)
          {
            comparer.getComparisonKey(arr[i], orders, j);
            sources[j] = arr[i];
          }
          //sort using the optimized sort with no casts
          sort_intkeys(sources, orders, 0, length);
          //and unwrap
          for(int j = length-1, i = startIndex+length-1; j >= 0; i--, j--)
            arr[i] = sources[j];
        }
        else
          previous_sort(comparator, arr, startIndex, length);
      }

      public void sort_intkeys(Object[] sources, int[] orders,
                       int startIndex, int length)
      {
        quicksort(sources, orders, startIndex, startIndex+length-1);
      }

      public static void quicksort(Object[] sources, int[] orders,
                       int lo, int hi)
      {
        //quicksort algorithm implementation with a pair of
        //synchronized arrays. 'orders' is the array used to
        //compare ordering. 'sources' is the array holding the
        //source objects, which needs to be altered in synchrony
        //with 'orders'
        if (lo >= hi)
          return;

        int mid = (lo + hi) / 2;
        Object tmp_o;
        int tmp_i;
        int middle = orders[mid];

        //order the lo, mid, and hi elements (median of three)
        if (orders[lo] > middle)
        {
          orders[mid] = orders[lo];
          orders[lo] = middle;
          middle = orders[mid];
          tmp_o = sources[mid];
          sources[mid] = sources[lo];
          sources[lo] = tmp_o;
        }

        if (middle > orders[hi])
        {
          orders[mid] = orders[hi];
          orders[hi] = middle;
          middle = orders[mid];
          tmp_o = sources[mid];
          sources[mid] = sources[hi];
          sources[hi] = tmp_o;

          if (orders[lo] > middle)
          {
            orders[mid] = orders[lo];
            orders[lo] = middle;
            middle = orders[mid];
            tmp_o = sources[mid];
            sources[mid] = sources[lo];
            sources[lo] = tmp_o;
          }
        }

        int left = lo + 1;
        int right = hi - 1;

        if (left >= right)
          return;

        //partition around the middle value
        for ( ;; )
        {
          while (orders[right] > middle)
          {
            right--;
          }

          while (left < right && orders[left] <= middle)
          {
            left++;
          }

          if (left < right)
          {
            tmp_i = orders[left];
            orders[left] = orders[right];
            orders[right] = tmp_i;
            tmp_o = sources[left];
            sources[left] = sources[right];
            sources[right] = tmp_o;
            right--;
          }
          else
          {
            break;
          }
        }

        quicksort(sources, orders, lo, left);
        quicksort(sources, orders, left + 1, hi);
      }
    }
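The essential invariant of the paired-array technique is that every move made in orders is mirrored in sources, so that sources[k] always belongs with orders[k]. The sketch below demonstrates that invariant with an insertion sort rather than the quicksort above, purely to keep the example short; the class and method names are hypothetical.

```java
class ParallelArraySortDemo {
    // Sorts 'orders' ascending and applies every move to 'sources' as well,
    // so each source object stays paired with its ordering value.
    static void sortInStep(Object[] sources, int[] orders) {
        for (int i = 1; i < orders.length; i++) {
            int key = orders[i];
            Object src = sources[i];
            int j = i - 1;
            // Comparisons touch only the int array: no casts, no method calls.
            while (j >= 0 && orders[j] > key) {
                orders[j + 1] = orders[j];
                sources[j + 1] = sources[j];
                j--;
            }
            orders[j + 1] = key;
            sources[j + 1] = src;
        }
    }

    public static void main(String[] args) {
        Object[] sources = {"c", "a", "b"};
        int[] orders = {3, 1, 2};
        sortInStep(sources, orders);
        System.out.println(java.util.Arrays.toString(sources)); // prints "[a, b, c]"
        System.out.println(java.util.Arrays.toString(orders));  // prints "[1, 2, 3]"
    }
}
```

Compared with an array of key objects, the pair of arrays needs only two allocations regardless of how many elements are sorted.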
With this optimization, the framework quicksort is now as fast as the fastest handcrafted quicksort from the previous section see