Sophisticated generational garbage collectors, which smooth out the impact of the garbage
collector, are now being used; HotSpot uses a state-of-the-art generational garbage collector. Analysis of object-oriented programs has shown that most objects are short-lived, fewer have
medium lifespans, and very few objects are long-lived. Generational garbage collectors move objects through multiple spaces, each time copying live objects from one space to the next and
reclaiming the space used by objects that are no longer alive. By concentrating on short-lived objects—the early spaces—and spending less time recycling space where older objects live, the
garbage collector frees the maximum amount of space for the lowest impact.
[1] One book giving more details on garbage collection is Inside the Java 2 Virtual Machine by Bill Venners (McGraw-Hill). The garbage collection chapter is also available online at http://www.artima.com.
Because the garbage collector is different in different VM versions, the output from the -verbosegc option is also likely to change across versions, making it difficult to compare the effects of the garbage collectors across versions (not to mention between different vendors' VMs). But you
should still attempt this comparison, as the effect of the garbage collector can make a difference to the application. Looking at garbage-collection output can tell you that parts of your application are
causing significantly more work for the garbage collector, suggesting you may want to alter the flow of objects in those parts of the application. Garbage collection is also affected by the number
of threads and whether objects are shared across threads. Expect to see improvements in threaded garbage collection over different VM versions.
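One way to see which parts of an application create the most work for the collector, without depending on the version-specific -verbosegc format, is to read the collector counters from inside the program. The following is a minimal sketch, assuming a VM that provides the java.lang.management API (Java 5 or later). The class name GcProbe and the sample workload are hypothetical and are not part of the original text.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper (an assumption for illustration): reads the VM's
// collector counters before and after a task.
final class GcProbe {

    // Collections so far, summed over every collector the VM reports.
    // getCollectionCount() returns -1 when undefined, so negatives are skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds, summed the same way.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Runs the task and returns { collections, milliseconds } observed while it ran.
    static long[] measure(Runnable task) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();
        task.run();
        return new long[] {
            totalCollections() - countBefore,
            totalCollectionMillis() - millisBefore
        };
    }

    public static void main(String[] args) {
        long[] result = measure(() -> {
            long sink = 0;
            // Sample workload: many short-lived objects.
            for (int i = 0; i < 2_000_000; i++) {
                byte[] garbage = new byte[256];
                garbage[0] = (byte) i;
                sink += garbage[0];
            }
            System.out.println("checksum: " + sink);
        });
        System.out.println("collections: " + result[0] + ", GC ms: " + result[1]);
    }
}
```

Wrapping different parts of the application in measure and comparing the figures gives a rough indication of where the flow of objects is heaviest. The counters cover the whole VM, so activity in other threads is included in the result.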
A JDK bug seems to prevent the garbage collection of threads until the Thread.stop method has been called on the terminated thread (this is true even though the Thread.stop method has been deprecated in Java 2). This affects performance because the resources used by the thread are not released until the thread is garbage-collected. Ultimately, if you use many short-lived threads in your application, the system will run out of resources and will not supply any further threads. See Alan Williamson's article in the Java Developer's Journal, July 1999 and November 1999.
Garbage-collection times may be affected by the size of the VM memory. A larger memory implies there will be more objects in the heap space before the garbage collector needs to kick in. This in
turn means that the process of sweeping dead objects takes longer, as does the process of running through a larger object table. Different VMs have optimal performance at different sizes of the VM,
and the optimal size for any particular application-VM pairing must unfortunately be determined by trial and error.
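Since the best heap size has to be found by trial and error, it helps if each trial reports the heap limits it actually ran with alongside the elapsed time. The following is a minimal sketch of such a trial. The class name HeapTrial and the workload are hypothetical stand-ins for your own application code, and the timings from a single short run are only indicative.

```java
// Hypothetical trial harness (an assumption for illustration): run it several
// times with different heap settings and compare the reported times.
final class HeapTrial {

    // Allocates 'count' small arrays, keeping only a rolling window of them alive.
    // Returns a checksum so the allocations cannot be optimized away.
    static long workload(int count) {
        byte[][] window = new byte[64][];
        long checksum = 0;
        for (int i = 0; i < count; i++) {
            byte[] block = new byte[1024];
            block[0] = (byte) i;
            window[i % window.length] = block;
            checksum += block[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long start = System.nanoTime();
        long checksum = workload(1_000_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("max heap (bytes):     " + rt.maxMemory());
        System.out.println("current heap (bytes): " + rt.totalMemory());
        System.out.println("free in heap (bytes): " + rt.freeMemory());
        System.out.println("elapsed (ms):         " + elapsedMs);
        System.out.println("checksum:             " + checksum);
    }
}
```

For example, running it as java -Xmx64m HeapTrial and again as java -Xmx512m HeapTrial gives two data points for the same workload on the same VM.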
3.2 Replacing JDK Classes
It is possible for you to replace JDK classes directly. Unfortunately, you can't distribute these altered classes with any application or applet unless you have complete control of the target environment. Although you often do have this control with in-house and enterprise-developed applications, most enterprises prefer not to deploy alterations to externally built classes. The alterations then would not be supported by the vendor (Sun in this case) and may violate the license, so contact the vendor if you need to do this. In addition, altering classes in this way can be a significant maintenance problem.
[2] If your application has its classes localized in one place on one machine, for example with servlets, you might consider deploying changes to the core classes.
The upshot is that you can easily alter JDK-supplied classes for development purposes, which can be useful for various reasons including debugging and tuning. But if you need the functionality in
your deployed application, you need to provide classes that are used instead of the JDK classes by redirecting method calls into your own classes.
Replacing JDK classes indirectly in this way is a valid tuning technique. Some JDK classes, such as StreamTokenizer (see Section 5.4), are inefficient and can be replaced quite easily since you normally use them in small, well-defined parts of a program. Other JDK classes, like Date, BigDecimal, and String, are used all over the place, and it can take a large effort to replace references with your own versions of these classes. The best way to replace these classes is to start from the design stage, so that you can consistently use your own versions throughout the application.
In Version 1.3 of the JDK, many of the java.lang.Math methods were changed from native to call the corresponding methods in java.lang.StrictMath. StrictMath provides bitwise consistency across platforms; earlier versions of Math used the platform-specific native functions that were not identical across all platforms. Unfortunately, StrictMath calculations are somewhat slower than the corresponding native functions. My colleague Kirk Pepperdine, who first pointed out the performance problem to me, puts it this way: "I've now got a bitwise-correct but excruciatingly slow program." The potential workarounds to this performance issue are all ugly: using an earlier JDK version, replacing the JDK class with an earlier version, or writing your own class to manage faster alternative floating-point calculations.
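The third workaround, combined with redirecting calls into your own class, might look something like the following sketch. The class AppMath, its table size, and the choice of a lookup table with linear interpolation are all assumptions made for illustration. The sketch trades accuracy for fewer calculations (the error is roughly 1e-6 or less with this table size), and whether it is actually faster on a given VM has to be measured.

```java
// Hypothetical application-level math class (an assumption for illustration).
// Application code calls AppMath.sin() instead of Math.sin(), so the
// implementation can be changed in one place.
final class AppMath {
    private static final int TABLE_SIZE = 4096;
    private static final double TWO_PI = 2.0 * Math.PI;

    // One full period of sine, plus two spare entries so that interpolation
    // at the very end of the period never reads past the array.
    private static final double[] SIN_TABLE = new double[TABLE_SIZE + 2];

    static {
        for (int i = 0; i < SIN_TABLE.length; i++) {
            SIN_TABLE[i] = StrictMath.sin(i * TWO_PI / TABLE_SIZE);
        }
    }

    private AppMath() { }

    // Approximate sine: reduce the angle to one period, then interpolate
    // linearly between the two nearest table entries.
    static double sin(double radians) {
        double turns = radians / TWO_PI;
        turns -= Math.floor(turns);              // fraction of a period, in [0, 1]
        double position = turns * TABLE_SIZE;
        int index = (int) position;
        double fraction = position - index;
        return SIN_TABLE[index]
                + fraction * (SIN_TABLE[index + 1] - SIN_TABLE[index]);
    }
}
```

Because every caller goes through AppMath, switching back to Math.sin or StrictMath.sin later is a one-line change inside this class.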
For optimal performance, I recommend developing with your own versions of classes rather than the JDK versions whenever possible. This gives maximum tuning flexibility. However, this recommendation is clearly impractical in most cases. Given that, perhaps the single most significant class to replace with your own version is the String class. Most other classes can be replaced inside identified bottlenecks when required during tuning, without affecting other parts of the application. But String is used so extensively that replacing String references in one location tends to have widespread consequences, requiring extensive rewriting in many parts of the application. In fact, this observation also applies to other data type classes you use extensively (Integer, Date, etc.). But the String class tends to be the most often used of these classes. See Chapter 5 for details on why the String class can be a performance problem, and why you might need to replace it.
It is often impractical to replace the String class where its internationalization capabilities are required. Because of this, you should logically partition the application's use of Strings to identify those aspects that require internationalization and those aspects that are really character processing, independent of language dependencies. The latter usage of Strings can be replaced more easily than the former. Internationalization-dependent String manipulation is difficult to tune, because you are dependent on internationalization libraries that are difficult to replace.
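As an illustration of the second kind of usage, the following sketch compares protocol-style keywords on plain char arrays with no locale rules involved. The class AsciiChars and its methods are hypothetical, and the sketch assumes the data really is restricted to ASCII letters, which is exactly the property you would need to establish when partitioning your own application.

```java
// Hypothetical helper (an assumption for illustration): character processing
// that is independent of language, working directly on char arrays.
final class AsciiChars {

    private AsciiChars() { }

    // Folds 'a'..'z' to 'A'..'Z' and leaves every other character unchanged.
    static char toUpper(char c) {
        return (c >= 'a' && c <= 'z') ? (char) (c - ('a' - 'A')) : c;
    }

    // Case-insensitive comparison for ASCII letters only.
    static boolean equalsIgnoreCase(char[] a, char[] b) {
        if (a.length != b.length) {
            return false;
        }
        for (int i = 0; i < a.length; i++) {
            if (toUpper(a[i]) != toUpper(b[i])) {
                return false;
            }
        }
        return true;
    }
}
```

Text that is shown to users or sorted for them belongs to the other partition and should stay with the String and internationalization classes.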
Many JDK classes provide generic capabilities (as you would expect from library classes), and so they are frequently more generic than what is required for your particular application. These generic capabilities often come at the expense of performance. For example, Vector is fine for generic Objects, but if you are using a Vector for only one type of object, then a custom version with an array and accessors of that type is faster, as you can avoid all the casts required to convert the generic Object back into your own type. Using Vector for basic data types (e.g., longs) is even worse, requiring the data type to be wrapped by an object to get it into the Vector. For example, building and using a LongVector class improves performance and readability by avoiding casts, Long wrappers, unwrapping, etc.:
public class LongVector {
  long[] internalArray;
  int arraySize;
  ...
  public void addElement(long l) {
  ...
  public long elementAt(int i) {
  ...
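The listing above elides the method bodies. One possible completion is sketched below. The growth policy (doubling), the default capacity, the size method, and the bounds check are assumptions added for illustration and are not taken from the original listing.

```java
import java.util.Arrays;

// A possible completion of LongVector (a sketch, not the original code).
// Declared without 'public' only so that it can be pasted into any source file.
class LongVector {
    long[] internalArray;
    int arraySize;

    LongVector() {
        this(10); // default capacity is an assumption
    }

    LongVector(int initialCapacity) {
        internalArray = new long[Math.max(1, initialCapacity)];
    }

    // Appends a primitive long: no Long wrapper object is created.
    public void addElement(long l) {
        if (arraySize == internalArray.length) {
            // Doubling is an assumed growth policy.
            internalArray = Arrays.copyOf(internalArray, arraySize * 2);
        }
        internalArray[arraySize++] = l;
    }

    // Returns a primitive long: no cast or unwrapping is needed by the caller.
    public long elementAt(int i) {
        if (i < 0 || i >= arraySize) {
            throw new ArrayIndexOutOfBoundsException(i);
        }
        return internalArray[i];
    }

    public int size() {
        return arraySize;
    }
}
```

A caller then writes long total = v.elementAt(0) + v.elementAt(1); directly, where the Vector version would need a cast to Long and a call to longValue for each element.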
If you are using your own classes, you can extend them to have the specific functionality you require, with direct access to the internals of the class. Again using Vector as an example, if you want to iterate over the collection (e.g., to select a particular subset based on some criteria), you need to access the elements through the get method for each element, with the significant overhead that that implies. If you are using your own (possibly derived) class, you can implement the specific action you want in the class, allowing your loop to access the internal array directly with the consequent speedup:
public class QueryVector extends MyVector {
  public Object[] getTheBitsIWant() {
    // Access the internal array directly rather than going through
    // the method accessors. This makes the search much faster
    Object[] results = new Object[10];
    for (int i = arraySize-1; i >= 0; i--)
      if (internalArray[i] ....
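The listing above leaves both MyVector and the selection criterion unspecified. The following self-contained sketch fills them in under stated assumptions: MyVector is taken to be a simple array-backed class with protected fields, and the criterion (selecting String elements) is only a stand-in for whatever test your application needs.

```java
import java.util.Arrays;

// Assumed shape of MyVector (hypothetical): an array-backed collection whose
// internals are visible to subclasses.
class MyVector {
    protected Object[] internalArray = new Object[10];
    protected int arraySize;

    public void addElement(Object o) {
        if (arraySize == internalArray.length) {
            internalArray = Arrays.copyOf(internalArray, arraySize * 2);
        }
        internalArray[arraySize++] = o;
    }

    public Object elementAt(int i) {
        if (i < 0 || i >= arraySize) {
            throw new ArrayIndexOutOfBoundsException(i);
        }
        return internalArray[i];
    }
}

class QueryVector extends MyVector {
    // Scans the internal array directly instead of calling elementAt() for
    // each element. Matches are returned in reverse insertion order because
    // the loop runs from the last element down to the first.
    public Object[] getTheBitsIWant() {
        Object[] results = new Object[arraySize];
        int found = 0;
        for (int i = arraySize - 1; i >= 0; i--) {
            // Stand-in criterion (an assumption): keep String elements only.
            if (internalArray[i] instanceof String) {
                results[found++] = internalArray[i];
            }
        }
        return Arrays.copyOf(results, found);
    }
}
```

The results array is sized from arraySize rather than a fixed length so that the sketch cannot overflow it. A tuned version might size it from a known upper bound on the number of matches instead.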
Finally, there are often many places where objects (especially collection objects) are used initially for convenience (e.g., Vector, because you did not know the size of the array you would need, etc.), and in a final version of the application can be replaced completely with presized arrays. A known-sized array (not a collection object) is the fastest way in Java to store and access elements of a collection.
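The replacement described here can be as simple as the following sketch, which shows a convenience version built on Vector next to a final version that uses a presized array. The class and method names are hypothetical, and the squares computation is only a placeholder for real application data.

```java
import java.util.Vector;

// Hypothetical example (an assumption for illustration): the same data held
// first in a Vector, then in a presized array.
final class PresizedExample {

    private PresizedExample() { }

    // Convenience version: grows as needed and wraps each int in an Integer.
    static Vector<Integer> squaresInVector(int n) {
        Vector<Integer> v = new Vector<>();
        for (int i = 0; i < n; i++) {
            v.addElement(i * i);
        }
        return v;
    }

    // Final version: the size is known up front, so a plain array is allocated
    // once and filled directly, with no wrappers and no resizing.
    static int[] squaresInArray(int n) {
        int[] squares = new int[n];
        for (int i = 0; i < n; i++) {
            squares[i] = i * i;
        }
        return squares;
    }
}
```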
3.3 Faster VMs