
    ... public void addElement(long l) { ...
    ... public long elementAt(int i) { ...

If you are using your own classes, you can extend them to have the specific functionality you require, with direct access to the internals of the class. Again using Vector as an example, if you want to iterate over the collection (e.g., to select a particular subset based on some criteria), you need to access the elements through the get() method for each element, with the significant overhead that implies. If you are using your own (possibly derived) class, you can implement the specific action you want in the class, allowing your loop to access the internal array directly, with the consequent speedup:

    public class QueryVector extends MyVector {
      public Object[] getTheBitsIWant() {
        //Access the internal array directly rather than going
        //through the method accessors. This makes the search much faster
        Object[] results = new Object[10];
        for (int i = arraySize-1; i >= 0; i--)
          if (internalArray[i] ....

Finally, there are often many places where objects (especially collection objects) are used initially for convenience (e.g., Vector, because you did not know the size of the array you would need), and in a final version of the application can be replaced completely with presized arrays. A known-sized array (not a collection object) is the fastest way in Java to store and access elements of a collection.
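The idea above can be sketched as a small, self-contained example. This is not the book's actual MyVector; the class, the even-number query, and the array sizes are illustrative assumptions, but the structure shows the point: the subclass scans the internal array directly instead of calling an accessor per element.

```java
// Hypothetical sketch of a Vector-like class whose subclass queries
// the internal array directly, avoiding per-element accessor calls.
public class QueryExample {
    // Minimal stand-in for the text's MyVector: a growable Object array.
    static class MyVector {
        protected Object[] internalArray = new Object[10];
        protected int arraySize = 0;

        public void addElement(Object o) {
            if (arraySize == internalArray.length) {
                Object[] bigger = new Object[internalArray.length * 2];
                System.arraycopy(internalArray, 0, bigger, 0, arraySize);
                internalArray = bigger;
            }
            internalArray[arraySize++] = o;
        }

        public Object elementAt(int i) { return internalArray[i]; }
    }

    // The subclass implements the specific query it needs, with
    // direct access to the superclass's internal array.
    static class QueryVector extends MyVector {
        public Object[] getTheBitsIWant() {
            // Illustrative criterion: select the even Integers.
            int count = 0;
            Object[] results = new Object[arraySize];
            for (int i = arraySize - 1; i >= 0; i--)
                if (((Integer) internalArray[i]).intValue() % 2 == 0)
                    results[count++] = internalArray[i];
            Object[] trimmed = new Object[count];
            System.arraycopy(results, 0, trimmed, 0, count);
            return trimmed;
        }
    }

    public static void main(String[] args) {
        QueryVector v = new QueryVector();
        for (int i = 0; i < 6; i++)
            v.addElement(Integer.valueOf(i));
        // Selects 4, 2, 0 from the six elements added.
        System.out.println(v.getTheBitsIWant().length);
    }
}
```

The same query written against a plain Vector would pay a method call (and, in generic code, a cast) for every element examined; here that cost is paid nowhere.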

3.3 Faster VMs

VM runtimes and Java compilers vary enormously over time and across vendors. More and more optimizations are finding their way into both VMs and compilers. Many possible compiler optimizations are considered in later sections of this chapter. In this section I focus on VM optimizations.

3.3.1 VM Speed Variations

Different VMs have different running characteristics. Some VMs are intended purely for development and are highly suboptimal in terms of performance. These VMs may have huge inefficiencies, even in such basic operations as casting between different numeric types. One development VM I used had this behavior; it provided the foundation of an excellent development environment (actually my preferred environment), but was all but useless for performance testing, as any data-type manipulation other than with ints or booleans produced highly varying and misleading times. It is important to run any tests involving timing or profiling in the same VM you plan to run the application in. You should test your application in the current standard VMs if your target environment is not fully defined.

There is, of course, nothing much you can do about speeding up any one VM, short of upgrading the CPUs. But you should be aware of the different VMs available, whether or not you control the deployment environment of your application. If you control the target environment, you can choose your VM appropriately. If you do not control the environment on which your application runs, remember that performance is partly user expectation. If you tell your users that VM A gives such and such a performance for your application, but VM B gives this other, much slower performance, then you at least inform your user community of the implications of their choice of VM. This could also put pressure on vendors with slower VMs to improve them.
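A minimal sketch of the kind of cross-VM timing test described above: run the identical class under each candidate VM and compare the reported times. The cast-heavy loop and the iteration count are illustrative assumptions; the point is only that the measured code, not the measuring harness, should vary between runs.

```java
// Hypothetical micro-timing harness: run this same class under each VM
// you are evaluating and compare the printed times. The workload here
// (int-to-double casts) is one of the basic operations that some
// development VMs handle disproportionately slowly.
public class CastTiming {
    // Sums 0..iterations-1, forcing an int-to-double cast per step.
    static double castLoop(int iterations) {
        double sum = 0.0;
        for (int i = 0; i < iterations; i++)
            sum += (double) i;
        return sum;
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        double result = castLoop(1000000);
        long elapsed = System.currentTimeMillis() - start;
        // Print the result as well, so the VM cannot discard the loop.
        System.out.println("sum=" + result + " time=" + elapsed + "ms");
    }
}
```

Because the class is identical in every run, any difference in the reported time reflects the VM, not the test.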

3.3.2 VMs with JIT Compilers

The basic bytecode interpreter VM executes by decoding and executing bytecodes. This is slow and is pure overhead, adding nothing to the functionality of the application. A just-in-time (JIT) compiler in a virtual machine eliminates much of this overhead by doing the bytecode fetch and decode just once. The first time the method is loaded, the decoded instructions are converted into machine code native to the CPU the system is running on. After that, future invocations of a particular method no longer incur the interpreter overhead. However, a JIT must be fast at compiling to avoid slowing the runtime, so extensive optimizations within the compile phase are unlikely. This means that the compiled code is often not as fast as it could be. A JIT also imposes a significantly larger memory footprint on the process.

Without a JIT, you might have to optimize your bytecodes for a particular platform. Optimizing the bytecode for one platform can conceivably make that code run slower on another platform (though a speedup is usually reflected to some extent on all platforms). A JIT compiler can theoretically optimize the same code differently for different underlying CPUs, thus getting the best of all worlds.

In tests by Mark Roulo (http://www.javaworld.com/javaworld/jw-09-1998/jw-09-speed.html), he found that a good JIT speeded up the overhead of method calls from a best of 280 CPU clock cycles in the fastest non-JIT VM to just 2 clock cycles in the JIT VM. In a direct comparison of method call times for this JIT VM against a compiled C++ program, the Java method call time was found to be just one clock cycle slower than the C++: fast enough for almost any application. However, object creation is not speeded up by anywhere near this amount, which means that with a JIT VM, object creation is relatively more expensive (and consequently more important when tuning) than with a non-JIT VM.
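The compile-once behavior described above can be made visible with a simple before/after timing sketch. This is an illustrative assumption about typical JIT behavior, not a guaranteed result: on many JIT VMs the first timed pass includes interpretation and compilation of the hot method, while later passes run the already-compiled native code.

```java
// Hypothetical sketch: time the same method-call loop twice. On a JIT VM
// the first pass typically pays interpretation plus compilation cost;
// the second pass mostly runs compiled code. Counts are illustrative.
public class JitWarmup {
    static int callMe(int x) { return x + 1; }

    // Returns the elapsed time in nanoseconds for n calls to callMe().
    static long timeCalls(int n) {
        long start = System.nanoTime();
        int v = 0;
        for (int i = 0; i < n; i++)
            v = callMe(v);
        long elapsed = System.nanoTime() - start;
        if (v < 0)                       // never true; referencing v keeps
            System.out.println(v);       // the loop from being discarded
        return elapsed;
    }

    public static void main(String[] args) {
        long first = timeCalls(1000000);  // includes warm-up/compile cost
        long later = timeCalls(1000000);  // mostly compiled code
        System.out.println("first pass: " + first + "ns, later pass: " + later + "ns");
    }
}
```

This also illustrates a practical tuning rule: when benchmarking on a JIT VM, discard the first pass (the warm-up) unless startup cost is itself what you are measuring.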

3.3.3 VM Startup Time