Replacing JDK Classes Underlying JDK Improvements

Sophisticated generational garbage collectors, which smooth out the impact of garbage collection, are now being used; HotSpot uses a state-of-the-art generational garbage collector. Analysis of object-oriented programs has shown that most objects are short-lived, fewer have medium lifespans, and very few objects are long-lived. Generational garbage collectors move objects through multiple spaces, each time copying live objects from one space to the next and reclaiming the space used by objects that are no longer alive. By concentrating on short-lived objects (the early spaces) and spending less time recycling space where older objects live, the garbage collector frees the maximum amount of space for the lowest impact. [1]

[1] One book giving more details on garbage collection is Inside the Java 2 Virtual Machine by Bill Venners (McGraw-Hill). The garbage collection chapter is also available online at http://www.artima.com.

Because the garbage collector is different in different VM versions, the output from the -verbosegc option is also likely to change across versions, making it difficult to compare the effects of the garbage collectors across versions (not to mention between different vendors' VMs). But you should still attempt this comparison, as the effect of the garbage collector can make a difference to the application. Looking at garbage-collection output can tell you that parts of your application are causing significantly more work for the garbage collector, suggesting you may want to alter the flow of objects in those parts of the application.

Garbage collection is also affected by the number of threads and whether objects are shared across threads. Expect to see improvements in threaded garbage collection over different VM versions. A JDK bug seems to prevent the garbage collection of threads until the Thread.stop() method has been called on the terminated thread (this is true even though the Thread.stop() method has been deprecated in Java 2).
This affects performance because the resources used by the thread are not released until the thread is garbage-collected. Ultimately, if you use many short-lived threads in your application, the system will run out of resources and will not supply any further threads. See Alan Williamson's article in the Java Developer's Journal, July 1999 and November 1999.

Garbage-collection times may be affected by the size of the VM memory. A larger memory implies there will be more objects in the heap space before the garbage collector needs to kick in. This in turn means that the process of sweeping dead objects takes longer, as does the process of running through a larger object table. Different VMs have optimal performance at different sizes of the VM, and the optimal size for any particular application-VM pairing must unfortunately be determined by trial and error.
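As a rough complement to reading -verbosegc output, the collector's workload around a suspect part of the application can also be sampled from inside the program. The following is only a sketch under stated assumptions: it relies on the java.lang.management API, which exists only in Java 5 and later (newer than the VMs discussed in this section), and the class GcProbe and its method names are hypothetical names made up for illustration. The numbers it reports will differ between VMs, collectors, and heap sizes, so treat them as relative indicators only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (not from the text): samples cumulative collector
// activity so that two samples can be compared around a block of code.
class GcProbe {

    /** Total collections reported so far, summed over all collectors. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds, over all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Stand-in for application code: creates many short-lived objects. */
    static long churn(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] temp = new byte[1024]; // becomes garbage after each iteration
            temp[0] = (byte) i;
            checksum += temp[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        churn(2000000);

        System.out.println("collections during churn: " + (totalCollections() - countBefore));
        System.out.println("gc millis during churn:   " + (totalCollectionMillis() - millisBefore));
    }
}
```

Running the same program with several different maximum heap settings, and comparing the reported figures, is one way to carry out the trial-and-error sizing described above.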

3.2 Replacing JDK Classes

It is possible for you to replace JDK classes directly. Unfortunately, you can't distribute these altered classes with any application or applet unless you have complete control of the target environment. Although you often do have this control with in-house and enterprise-developed applications, most enterprises prefer not to deploy alterations to externally built classes. The alterations then would not be supported by the vendor (Sun in this case) and may violate the license, so contact the vendor if you need to do this. In addition, altering classes in this way can be a significant maintenance problem. [2]

[2] If your application has its classes localized in one place on one machine, for example with servlets, you might consider deploying changes to the core classes.

The upshot is that you can easily alter JDK-supplied classes for development purposes, which can be useful for various reasons including debugging and tuning. But if you need the functionality in your deployed application, you need to provide classes that are used instead of the JDK classes by redirecting method calls into your own classes.

Replacing JDK classes indirectly in this way is a valid tuning technique. Some JDK classes, such as StreamTokenizer (see Section 5.4), are inefficient and can be replaced quite easily since you normally use them in small, well-defined parts of a program. Other JDK classes, like Date, BigDecimal, and String, are used all over the place, and it can take a large effort to replace references with your own versions of these classes. The best way to replace these classes is to start from the design stage, so that you can consistently use your own versions throughout the application.

In Version 1.3 of the JDK, many of the java.lang.Math methods were changed from native implementations to calls to the corresponding methods in java.lang.StrictMath.
StrictMath provides bitwise consistency across platforms; earlier versions of Math used the platform-specific native functions that were not identical across all platforms. Unfortunately, StrictMath calculations are somewhat slower than the corresponding native functions. My colleague Kirk Pepperdine, who first pointed out the performance problem to me, puts it this way: "I've now got a bitwise-correct but excruciatingly slow program." The potential workarounds to this performance issue are all ugly: using an earlier JDK version, replacing the JDK class with an earlier version, or writing your own class to manage faster alternative floating-point calculations.

For optimal performance, I recommend developing with your own versions of classes rather than the JDK versions whenever possible. This gives maximum tuning flexibility. However, this recommendation is clearly impractical in most cases. Given that, perhaps the single most significant class to replace with your own version is the String class. Most other classes can be replaced inside identified bottlenecks when required during tuning, without affecting other parts of the application. But String is used so extensively that replacing String references in one location tends to have widespread consequences, requiring extensive rewriting in many parts of the application. In fact, this observation also applies to other data type classes you use extensively (Integer, Date, etc.). But the String class tends to be the most often used of these classes. See Chapter 5 for details on why the String class can be a performance problem, and why you might need to replace it.

It is often impractical to replace the String class where its internationalization capabilities are required. Because of this, you should logically partition the application's use of Strings to identify those aspects that require internationalization and those aspects that are really character processing, independent of language dependencies.
The latter usage of Strings can be replaced more easily than the former. Internationalization-dependent String manipulation is difficult to tune, because you are dependent on internationalization libraries that are difficult to replace.

Many JDK classes provide generic capabilities (as you would expect from library classes), and so they are frequently more generic than what is required for your particular application. These generic capabilities often come at the expense of performance. For example, Vector is fine for generic Objects, but if you are using a Vector for only one type of object, then a custom version with an array and accessors of that type is faster, as you can avoid all the casts required to convert the generic Object back into your own type. Using Vector for basic data types (e.g., longs) is even worse, requiring the data type to be wrapped by an object to get it into the Vector. For example, building and using a LongVector class improves performance and readability by avoiding casts, Long wrappers, unwrapping, etc.:

    public class LongVector {
        long[] internalArray;
        int arraySize;
        ...
        public void addElement(long l) {
        ...
        public long elementAt(int i) {
        ...

If you are using your own classes, you can extend them to have the specific functionality you require, with direct access to the internals of the class. Again using Vector as an example, if you want to iterate over the collection (e.g., to select a particular subset based on some criteria), you need to access the elements through the get method for each element, with the significant overhead that that implies. If you are using your own (possibly derived) class, you can implement the specific action you want in the class, allowing your loop to access the internal array directly with the consequent speedup:

    public class QueryVector extends MyVector {
        public Object[] getTheBitsIWant() {
            // Access the internal array directly rather than going through
            // the method accessors. This makes the search much faster.
            Object[] results = new Object[10];
            for (int i = arraySize-1; i >= 0; i--)
                if (internalArray[i] ....

Finally, there are often many places where objects (especially collection objects) are used initially for convenience (e.g., Vector, because you did not know the size of the array you would need, etc.), and in a final version of the application can be replaced completely with presized arrays. A known-sized array (not a collection object) is the fastest way in Java to store and access elements of a collection.
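To make the LongVector idea concrete, here is one minimal way the class could be filled in. It is a sketch, not the definitive implementation: the constructor, the doubling growth policy, the size() method, and the sumGreaterThan() query are all assumptions added for illustration, and the last of these simply stands in for the kind of special-purpose method that scans the internal array directly. The sketch also uses java.util.Arrays.copyOf, which requires Java 6 or later; on older VMs System.arraycopy would do the same job.

```java
import java.util.Arrays;

// A sketch of a long-specific vector: no Long wrappers and no casts.
// Everything beyond the two fields, addElement, and elementAt is an assumption.
class LongVector {
    private long[] internalArray;
    private int arraySize; // number of elements actually stored

    public LongVector(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("negative capacity: " + initialCapacity);
        }
        internalArray = new long[Math.max(1, initialCapacity)];
    }

    public void addElement(long l) {
        if (arraySize == internalArray.length) {
            // Assumed growth policy: double the backing array when it is full.
            internalArray = Arrays.copyOf(internalArray, internalArray.length * 2);
        }
        internalArray[arraySize++] = l;
    }

    public long elementAt(int i) {
        if (i < 0 || i >= arraySize) {
            throw new ArrayIndexOutOfBoundsException(i);
        }
        return internalArray[i];
    }

    public int size() {
        return arraySize;
    }

    /**
     * Hypothetical special-purpose query: sums the stored values greater than
     * the threshold by scanning the internal array directly, with no accessor
     * call per element.
     */
    public long sumGreaterThan(long threshold) {
        long sum = 0;
        for (int i = arraySize - 1; i >= 0; i--) {
            if (internalArray[i] > threshold) {
                sum += internalArray[i];
            }
        }
        return sum;
    }
}

class LongVectorDemo {
    public static void main(String[] args) {
        LongVector squares = new LongVector(4);
        for (long n = 1; n <= 10; n++) {
            squares.addElement(n * n); // stores 1, 4, 9, ..., 100
        }
        System.out.println(squares.size());             // prints 10
        System.out.println(squares.elementAt(3));       // prints 16
        System.out.println(squares.sumGreaterThan(50)); // prints 245 (64 + 81 + 100)
    }
}
```

If the number of elements is known in advance, passing it as the initial capacity avoids all regrowth, which brings the class close to the presized-array case described above.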

3.3 Faster VMs