
[9] Serious number crunchers spend a large proportion of their time performance-tuning their code, whatever language it is written in. To gain sufficient performance in Java, they of course need to tune the application intensively. But this is also true if the application is written in C or Fortran. The amount of tuning required is now, apparently, similar for these three languages. Further information can be found at http://www.javagrande.org.

The JNI interface itself has its own overhead, which means that if a pure Java implementation comes close to the native call performance, the JNI overhead will probably cancel any performance advantages from the native call. However, on occasion the underlying system can provide an optimized native call that is not available from Java and cannot be implemented to work as fast in pure Java. In this kind of situation, JNI is useful for tuning. Another case in which JNI can be useful is reducing the number of objects created, though this should be less common: you should normally be able to do this directly in Java.

I once encountered a situation where JNI was needed to avoid excessive object creation. This was with an application that originally required the use of a native DLL service. The vendor of that DLL ported the service to Java, which the application developers would have preferred to use, but unfortunately the vendor neglected to tune the ported code. As a result, a native call to a particular set of services produced just a couple dozen objects, but the Java-ported code produced nearly 10,000 objects. Apart from this difference, the speeds of the two implementations were similar. [10] However, the overhead in garbage collection caused a significant degradation in performance, which meant that the native call to the DLL was the preferred option.

[10] This increase in object creation normally results in a much slower implementation. However, in this particular case, the methods required synchronizing to a degree that gave a larger overhead than the object creation. Nevertheless, the much larger number of objects created by the untuned Java implementation needed reclaiming at some point, and this led to greater overhead in the garbage collection.

If you are following the native function call route, there is little to say. You write your routines in C, plug them into your application using the native keyword as specified in the Java development kit, profile the resultant application, and confirm that it provides the required speedup. You can also use a C or C++ profiler (or whatever profiler suits the native language) to profile the native code calls if they are complicated. Other than this, the only recommendation that applies here is that if you are calling the native routines from loops, you should move the loops down into the native routines and pass the loop parameters to the routine as arguments. This usually produces faster implementations.

One other recommendation, which is not specific to performance tuning, is that it is usually good practice to provide a fallback for situations when the native code cannot be loaded. This requires extra maintenance (two sets of code, plus the extra fallback code) but is often worth the effort. You can manage the fallback at the time when the DLL is being loaded by catching the exception when the load fails and providing an alternative path to the fallback code, either by setting boolean switches or by instantiating objects of the appropriate fallback classes as required.
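The fallback approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the library name `hypothetical_fastsum`, the class `SumService`, and the method `nativeSum` are all invented for the example, and the boolean-switch variant is shown rather than the fallback-class variant. Note that the loop bounds are passed to the native method as arguments, so the loop itself would run on the native side.

```java
// Sketch only: "hypothetical_fastsum" and nativeSum are invented names.
final class SumService {
    /** Boolean switch: true only if the hypothetical native library loaded. */
    static final boolean NATIVE_AVAILABLE = tryLoad("hypothetical_fastsum");

    private static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            // The library is missing or may not be loaded: use the Java path.
            return false;
        }
    }

    // Would be implemented in C. The loop parameters (from, to) are passed
    // down so that the whole loop runs inside a single native call.
    private static native long nativeSum(int[] values, int from, int to);

    // Pure Java fallback with the same contract as the native routine.
    private static long javaSum(int[] values, int from, int to) {
        long total = 0;
        for (int i = from; i < to; i++) {
            total += values[i];
        }
        return total;
    }

    /** Sums values[from..to) using the native routine when it is available. */
    static long sum(int[] values, int from, int to) {
        return NATIVE_AVAILABLE ? nativeSum(values, from, to)
                                : javaSum(values, from, to);
    }
}
```

Callers use only `SumService.sum(...)` and never need to know which implementation ran. The price is the one already mentioned: both implementations have to be kept in step.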

3.8 Uncompressed ZIP/JAR Files

It is better to deliver your classes in a ZIP or JAR file than to deliver them one class at a time over the network or load them individually from separate files in the filesystem. This packaged delivery provides some of the benefits of clustering [11] (see Section 14.1.2). The benefits gained from packaging class files come from reducing I/O overheads such as repeated file opening and closing, and possibly improving seek times. [12] Within the ZIP or JAR file, the classes should not be compressed unless network download time is a factor for the application. The best way to deliver local classes for performance reasons is in an uncompressed ZIP or JAR file. Coincidentally, that's how they're delivered with the JDK.

[11] Clustering is an unfortunately overloaded word, and is often used to refer to closely linked groups of server machines. In the context here, I use clustering to mean the close grouping of files.

[12] With operating system-monitoring tools, you can see the system temporarily stalling when the operating system issues a disk-cache flush if lots of files are closed close together in time. If you use a single packed file for all classes and resources, you avoid this potential performance hit.

It is possible to further improve the classloading times by packing the classes into the ZIP/JAR file in the order in which they are loaded by the application. You can determine the loading order by running the application with the -verbose option, but note that this ordering is fragile: slight changes in the application can easily alter the loading order of classes. A further extension to this idea is to include your own classloader that opens the ZIP/JAR file itself and reads in all files sequentially, loading them into memory immediately.
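As a rough sketch of these ideas, the following code writes an archive with uncompressed (STORED) entries in a caller-chosen order, reads every entry back in a single sequential pass, and serves classes from the in-memory copies. The class names `StoredArchive` and `PreloadedClassLoader` are hypothetical, and it assumes the caller already knows the load order and passes an ordered map. It is not a production classloader: it ignores resources, packages, and security concerns.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class StoredArchive {
    /** Writes entries uncompressed, in the iteration order of the map. */
    static byte[] write(Map<String, byte[]> entriesInLoadOrder) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> e : entriesInLoadOrder.entrySet()) {
                byte[] data = e.getValue();
                CRC32 crc = new CRC32();
                crc.update(data);
                ZipEntry entry = new ZipEntry(e.getKey());
                // STORED entries must declare their size and CRC up front.
                entry.setMethod(ZipEntry.STORED);
                entry.setSize(data.length);
                entry.setCompressedSize(data.length);
                entry.setCrc(crc.getValue());
                zip.putNextEntry(entry);
                zip.write(data);
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    /** Reads all entries front to back in one pass and keeps them in memory. */
    static Map<String, byte[]> readAll(InputStream in) throws IOException {
        Map<String, byte[]> loaded = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream data = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    data.write(buffer, 0, n);
                }
                loaded.put(entry.getName(), data.toByteArray());
            }
        }
        return loaded;
    }
}

/** Defines classes from bytes that were preloaded by StoredArchive.readAll. */
final class PreloadedClassLoader extends ClassLoader {
    private final Map<String, byte[]> classBytes;

    PreloadedClassLoader(Map<String, byte[]> classBytes, ClassLoader parent) {
        super(parent);
        this.classBytes = classBytes;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = classBytes.get(name.replace('.', '/') + ".class");
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, bytes, 0, bytes.length);
    }
}
```

In practice the archive would be read from a `FileInputStream` (ideally buffered) rather than from memory. For building the archive itself, the `jar` tool's `0` flag (for example `jar cf0`) stores entries without compression.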
Perhaps the final version of this performance improvement route is to dispense with the ZIP/JAR filesystem: it is quicker to load the files if they are concatenated together in one big file, with a header at the start of the file giving the offsets and names of the contained files. This is similar to the ZIP filesystem, but it is better if you read the header in one block, and read in and load the files directly rather than going through the java.util.zip classes. One further optimization to this classloading tactic is to start the classloader running in a separate low-priority thread immediately after VM startup.
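One possible shape for such a concatenated file is sketched below. The layout is an assumption made purely for illustration, not a standard format: a 4-byte header length, then a header holding the entry count and each entry's name, offset, and length, then the file contents back to back. The names `PackedFile`, `pack`, `unpack`, and `preloadInBackground` are likewise hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

final class PackedFile {
    /** Concatenates the files behind a header of names, offsets, and lengths. */
    static byte[] pack(Map<String, byte[]> files) throws IOException {
        ByteArrayOutputStream headerBytes = new ByteArrayOutputStream();
        DataOutputStream header = new DataOutputStream(headerBytes);
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        header.writeInt(files.size());
        for (Map.Entry<String, byte[]> e : files.entrySet()) {
            header.writeUTF(e.getKey());
            header.writeInt(body.size());           // offset within the body
            header.writeInt(e.getValue().length);   // length of this file
            body.write(e.getValue());
        }
        header.flush();

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(headerBytes.size());
        headerBytes.writeTo(data);
        body.writeTo(data);
        data.flush();
        return out.toByteArray();
    }

    /** Reads the header in one block, then the files sequentially. */
    static Map<String, byte[]> unpack(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        byte[] headerBlock = new byte[data.readInt()];
        data.readFully(headerBlock);                // whole header in one read
        DataInputStream header =
            new DataInputStream(new ByteArrayInputStream(headerBlock));

        int count = header.readInt();
        Map<String, byte[]> files = new LinkedHashMap<>();
        int position = 0;
        for (int i = 0; i < count; i++) {
            String name = header.readUTF();
            int offset = header.readInt();
            int length = header.readInt();
            if (offset != position) {
                throw new IOException("entries are not contiguous at " + name);
            }
            byte[] content = new byte[length];
            data.readFully(content);
            position = offset + length;
            files.put(name, content);
        }
        return files;
    }

    /** Runs the given loading task in a low-priority daemon thread. */
    static Thread preloadInBackground(Runnable loadTask) {
        Thread thread = new Thread(loadTask, "class-preloader");
        thread.setPriority(Thread.MIN_PRIORITY);
        thread.setDaemon(true);
        thread.start();
        return thread;
    }
}
```

At startup, the application could call `preloadInBackground` with a task that opens the packed file, calls `unpack`, and hands the resulting map to a classloader. Any class requested before the preload has finished would then need to wait for it or fall back to the normal loading path, and that coordination is not shown here.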

3.9 Performance Checklist