Java 2 Collections Appropriate Data Structures and Algorithms

    integerArrayList.set(someIndex, new Integer(someNum));
    int num = ((Integer) integerArrayList.get(someIndex)).intValue();

For this example, the cost of creating a new Integer object to wrap the int makes setting values take more than ten times longer when using the generalized array. Accessing is not as bad, taking only twice as long after including the extra cast and method access to get to the int.

11.2 Java 2 Collections

The collections framework introduced with Java 2 comes with a set of collection classes in the JDK. Each class has its own performance strengths and weaknesses, which I cover here. The collection implementations use the synchronized-wrapper framework to provide synchronized classes; the implementations themselves are unsynchronized, with two exceptions noted shortly. Collection classes wrapped in synchronized wrappers are always slower than unwrapped, unsynchronized classes. Nevertheless, my recommendation is generally to use objects within synchronized wrappers. You can selectively unwrap objects when they have been identified as part of a bottleneck and when the synchronization is not necessary. The performance aspects of thread-safe collections are discussed in detail in Chapter 10. Synchronized wrappers are also discussed in that chapter, in Section 10.4.1. Table 11-1 summarizes the performance attributes of the collection classes.

Table 11-1, Performance Attributes of Java 2 Collection Classes

Interface  Class       Synchronized?  Performance attributes
Set        HashSet     No             Fastest Set; slower than HashMap, but implements the Set interface (HashMap does not)
           TreeSet     No             Slower than HashSet; provides iteration of keys in order
Map        HashMap     No             Fastest Map
           Hashtable   Yes            Slower than HashMap, but faster than synchronized HashMap
           TreeMap     No             Slower than Hashtable and HashMap; provides iteration of keys in order
List       ArrayList   No             Fastest List
           Vector      Yes            Slower than ArrayList, but faster than synchronized ArrayList
           Stack       Yes            Same speed as Vector; provides LIFO queue functionality
           LinkedList  No             Slower than other Lists, but may be faster for some types of queue

Implementations of Set are slower to update than most other collection objects and should be avoided unless you need Set functionality. Of the two available Set implementations, HashSet is definitely faster than TreeSet. HashSet uses an underlying HashMap, so the way HashSet maintains uniqueness is extremely straightforward.
Objects are added to the set as the keys to the HashMap, so there is no need to search the set for the elements. This optimizes unique element addition. If you need Set functionality but not specifically a Set implementation, it is faster to use a HashMap directly.

Map has three general-purpose implementations: Hashtable, HashMap, and TreeMap. In addition, there are several specialized implementations that do not provide any performance improvements. [6] TreeMap is significantly slower than the other two Maps, and should not be used unless you need the extra functionality of iterating ordered keys. Hashtable is a synchronized Map, and HashMap is an unsynchronized Map. Hashtable is present for backward compatibility with earlier versions of the JDK. Nevertheless, if you need to use a synchronized Map, a Hashtable is faster than using a HashMap in a synchronized wrapper.

[6] Attributes simply wraps a HashMap, and restricts the keys to be ASCII-character alphanumeric Strings, and values to be Strings. WeakHashMap can maintain a cache of elements that are automatically garbage-collected once their keys are no longer referenced elsewhere. RenderingHints is specialized for use within the AWT packages. Properties is a Hashtable subclass specialized for maintaining key/value string pairs in files. UIDefaults is specialized for use within the Swing packages.

Hashtable, HashMap, and HashSet are all O(1) for access and update, so they should scale nicely if you have the available memory space.

List has four general-purpose implementations: Vector, Stack, ArrayList, and LinkedList. Vector, Stack, and ArrayList have underlying implementations based on arrays. LinkedList has an underlying implementation consisting of a doubly linked list. As such, LinkedList's performance is worse than any of the other three Lists for most operations.
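
Returning to the earlier point about Sets: the idea of getting uniqueness from the keys of a HashMap, without going through a Set implementation, can be sketched as follows. This is only an illustration under stated assumptions. UniqueTracker, its methods, and the PRESENT placeholder are hypothetical names, not JDK classes, and the sketch uses generics from later Java versions for brevity. Whether the direct approach is measurably faster on a given JVM is something to test rather than assume.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not a JDK class): tracks unique elements as HashMap keys.
final class UniqueTracker {
    // Placeholder value stored against every key; it is never read back.
    private static final Object PRESENT = new Object();
    private final Map<String, Object> seen = new HashMap<>();

    /** Returns true if the element was not already present. */
    boolean add(String element) {
        // put() returns the previous value for the key, or null if it was absent.
        return seen.put(element, PRESENT) == null;
    }

    boolean contains(String element) {
        return seen.containsKey(element);
    }

    int size() {
        return seen.size();
    }

    public static void main(String[] args) {
        UniqueTracker tracker = new UniqueTracker();
        System.out.println(tracker.add("alpha")); // prints true
        System.out.println(tracker.add("alpha")); // prints false (duplicate key)
        System.out.println(tracker.size());       // prints 1
    }
}
```

No search of existing elements is needed to reject a duplicate: the hash lookup performed by put() does that work.
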
For very large collections that you cannot presize to be large enough, LinkedList provides better performance when adding or deleting elements towards the middle of the list, if the array-copying overhead of the other Lists is higher than the linear access time of the LinkedList. Otherwise, LinkedList's only likely performance advantage is as a first-in-first-out queue or double-ended queue. A circular array-list implementation provides better performance for a FIFO queue.

Vector is a synchronized List, and ArrayList is an unsynchronized List. Vector is present for backward compatibility with earlier versions of the JDK. Nevertheless, if you need to use a synchronized List, a Vector is faster than using an ArrayList in a synchronized wrapper. See the comparison test at the end of Section 10.4.1. Stack is a subclass of Vector with the same performance characteristics, but with additional functionality as a last-in-first-out queue.
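
The circular array-list idea mentioned above for FIFO queues can be sketched roughly as follows. This is a minimal illustration, not an implementation from the JDK classes described in this section. The class name CircularArrayQueue, its method names, and the doubling growth policy are all assumptions made for the example, and it uses generics from later Java versions.

```java
import java.util.NoSuchElementException;

// Hypothetical class for illustration only: a FIFO queue on a circular array.
final class CircularArrayQueue<E> {
    private Object[] elements;
    private int head;   // index of the oldest element
    private int size;   // number of stored elements

    CircularArrayQueue(int initialCapacity) {
        elements = new Object[Math.max(1, initialCapacity)];
    }

    /** Adds to the tail; the array is copied only when it is full. */
    void enqueue(E item) {
        if (size == elements.length) {
            grow();
        }
        // The tail position wraps around to the start of the array.
        elements[(head + size) % elements.length] = item;
        size++;
    }

    /** Removes from the head without shifting the remaining elements. */
    @SuppressWarnings("unchecked")
    E dequeue() {
        if (size == 0) {
            throw new NoSuchElementException("queue is empty");
        }
        E item = (E) elements[head];
        elements[head] = null;                 // drop the reference so it can be collected
        head = (head + 1) % elements.length;   // advance head, wrapping if needed
        size--;
        return item;
    }

    int size() {
        return size;
    }

    // Copies elements into a larger array in FIFO order and resets head to 0.
    // Doubling is an assumed growth policy chosen for this sketch.
    private void grow() {
        Object[] bigger = new Object[elements.length * 2];
        for (int i = 0; i < size; i++) {
            bigger[i] = elements[(head + i) % elements.length];
        }
        elements = bigger;
        head = 0;
    }
}
```

Removing from the front of an ArrayList shifts every remaining element, whereas here both ends move by index arithmetic alone. Later JDK versions ship java.util.ArrayDeque, which is built on a similar resizable circular array.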

11.3 Hashtables and HashMaps