
No reference to class A is left in the compiled class, and no if statement is left. The consequence of this behavior is that it allows conditional compilation. Other classes can set a DEBUG constant in their own class the same way, or they can use a shared constant value, as class B used A.DEBUG in the earlier definition.

A problem is frequently encountered with this kind of code. The constant value is set when the class holding the constant, say class A, is compiled. Any other class referring to class A's constant takes whatever value is set at the time that other class is compiled, and it does not pick up a new value if A alone is recompiled. So you can have the situation where A is compiled with A.DEBUG set to false, then B is compiled and the compiler inlines A.DEBUG as false, possibly eliminating dead code branches. If A is then recompiled with A.DEBUG set to true, this does not affect class B: the compiled class B still has the value false inlined, and any eliminated code branches stay eliminated until class B itself is recompiled. You should be aware of this possible problem if you compile your classes in more than one compilation pass.

You should use this pattern for debug and trace statements, and for assertion preconditions, postconditions, and invariants. There is more detail on this technique in Section 6.1.4.
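As a concrete illustration of this stale-constant pitfall, here is a minimal sketch; the class names, the DEBUG field, and the foo() method are assumptions chosen for the example rather than the exact code from the earlier pages:

    // A.java -- compiled first, with DEBUG set to false
    class A {
        static final boolean DEBUG = false;   // compile-time constant
    }

    // B.java -- when compiled, the compiler inlines A.DEBUG as false
    // and can eliminate the dead if branch entirely
    class B {
        public int foo() {
            if (A.DEBUG) {
                System.out.println("tracing B.foo()");   // removed at compile time
            }
            return 55;
        }
    }

If A.java is later recompiled with DEBUG set to true, the already compiled B.class keeps the inlined false and never prints the trace line until B.java is recompiled as well.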

3.5.2 Optimizations Performed When Using the -O Option

The only standard compile-time option that can improve performance with the JDK compiler is the -O option. Note that -O (for "optimize") is a common option for compilers, and further optimizing options for other compilers often take the form -O1, -O2, etc. You should always check your compiler's documentation to find out what other options are available and what they do. Some compilers let you choose between optimizing the compiled code for speed and minimizing its size; there is often a tradeoff between these two aspects.

The standard -O option does not currently apply a wide variety of optimizations in the Sun JDK (up to JDK 1.2). In future versions it may do more. Currently, the option makes the compiler eliminate optional tables in the .class files, such as the line number and local variable tables; this gives only a small performance improvement by making class files smaller and therefore quicker to load. You should definitely use this option if your class files are sent across a network.

But the main performance improvement from using the -O option comes from the compiler inlining methods. When the -O option is used, the compiler considers inlining methods defined with any of the following modifiers: private, static, or final. Some methods, such as those defined as synchronized, are never inlined. If a method can be inlined, the compiler decides whether or not to inline it based on its own unpublished considerations. These considerations seem mainly to be the simplicity of the method: in JDK 1.2 the compiler inlines only fairly simple methods. For example, one-line methods with no side effects, such as those accessing or updating a variable, are invariably inlined. Methods that return just a constant are also inlined. Multiline methods are inlined if the compiler determines they are simple enough; e.g., a System.out.println("blah") followed by a return statement would get inlined.

Why There Are Limits on Inlining

The compiler can inline only those methods that can be statically bound at compile time. To see why, consider the following example of class A and its subclass B, with two methods defined, foo1() and foo2(). The foo2() method is overridden in the subclass:

    class A {
        public int foo1() {return foo2();}
        public int foo2() {return 5;}
    }
    public class B extends A {
        public int foo2() {return 10;}
    }

If A.foo2() is inlined into A.foo1(), new B().foo1() incorrectly returns 5 instead of 10, because A is incorrectly compiled as if it read:

    class A {
        public int foo1() {return 5;}
        public int foo2() {return 5;}
    }

Any method that can be overridden at runtime cannot be validly inlined (and it is a potential bug if it is). The Java specification states that final methods can be non-final at runtime, i.e., you can compile a set of classes with one class having a final method, but later recompile that class without the method being final (thus allowing subclasses to override it), and the other classes must still run correctly. For this reason, not all final methods can be identified as statically bound at compile time, so not all final methods can be inlined. Some earlier compiler versions incorrectly inlined some final methods, and I have seen serious bugs caused by this.

Choosing simple methods to inline does have a rationale behind it. The larger the method being inlined, the more the code gets bloated with copies of the same code inserted in many places. This has runtime costs in extra code being loaded and extra space taken by the runtime system. A JIT VM would also have the extra cost of having to compile more code.
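To make the earlier description of inlining candidates more concrete, here is a minimal sketch; the class and method names are illustrative assumptions, and whether any particular method is actually inlined remains subject to the compiler's unpublished heuristics:

    public class Counter {
        private int count;

        // One-line accessor with no side effects: the kind of private method
        // the compiler would consider for inlining when compiled with -O.
        private int getCount() { return count; }

        // Returns just a constant: also a typical inlining candidate.
        static int limit() { return 100; }

        // Declared synchronized, so it is never inlined.
        public synchronized void increment() { count++; }

        public boolean atLimit() {
            // With -O, the calls to getCount() and limit() are candidates to be
            // replaced by their method bodies at this call site.
            return getCount() >= limit();
        }
    }

Compiling this class with javac -O would let the compiler consider getCount() and limit() for inlining at their call sites, while increment() is excluded because it is synchronized.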
At some point, there is a decrease in performance from inlining too much code. In addition, some methods have side effects that can make them quite difficult to inline correctly. The compiler applies its methodology for selecting methods to inline irrespective of whether the target method is in a bottleneck: this is a machine-gun strategy of many little optimizations, in the hope that some inlined calls may improve the bottlenecks. A performance tuner applying inlining works the other way around, first finding the bottlenecks, then selectively inlining methods inside bottlenecks. This latter strategy can result in good speedups, especially in loop bottlenecks, because a loop can be speeded up significantly by removing the overhead of a repeated method call. If the method to be inlined is complex, you can often factor out parts of the method so that those parts are executed outside the loop, gaining even more speedup.

I have not found any public document that specifies the actual decision-making process that determines whether or not a method is inlined. The only reference given is to Section 13.4.21 of The Java Language Specification, which specifies only that binary compatibility with preexisting binaries must be maintained. It does specify that the package must be guaranteed to be kept together for the compiler to allow inlining across classes. The specification also states that the final keyword does not imply that a method can be inlined, as the runtime system may have a differently implemented method.

Prior to JDK 1.2, the -O option used with the Sun compiler did inline methods across classes, even if they were not compiled in the same compilation pass. This behavior led to bugs.[8] From JDK 1.2, the -O option no longer inlines methods across classes, even if they are compiled in the same compilation pass.

[8] Primarily, methods that accessed private or protected variables were incorrectly inlined into other classes, leading to runtime authorization exceptions.

Unfortunately, there is no way to specify directly which methods should be inlined, rather than relying on the compiler's internal workings. I guess that in the future, some compiler vendors will provide a mechanism that supports specifying which methods to inline, along with other preprocessor options. In the meantime, you can implement a preprocessor (or use an existing one) if you require tighter control.

Opportunities for inlining often occur inside bottlenecks, especially in loops, as discussed previously. Selective inlining by hand can give an order-of-magnitude speedup for some bottlenecks, and no speedup at all for others. The speedup obtained purely from inlining is usually only a few percent: 5% is fairly common.

Some optimizing compilers are very aggressive about inlining code. They apply techniques such as analyzing the entire program to alter and eliminate method calls in order to identify methods that can be coerced into being statically bound. These identified methods are then inlined as much as possible according to the compiler's analysis. This technique has been shown to give a 50% speedup to some applications. Another example of aggressive inlining is the HotSpot runtime, which inlines code after a bottleneck has been identified.
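Since selective inlining by hand inside loop bottlenecks is recommended above, here is a minimal sketch of what that might look like; the class, field, and method names are illustrative assumptions rather than code from this chapter:

    public class ManualInlineExample {
        private int[] data = new int[100000];
        private int scale = 3;

        // A simple method that a tuner might inline by hand inside a hot loop.
        private int scaled(int i) {
            return data[i] * scale;
        }

        // Before: one method call per iteration.
        public long sumWithCalls() {
            long total = 0;
            for (int i = 0; i < data.length; i++) {
                total += scaled(i);
            }
            return total;
        }

        // After: the method body is inlined by hand, and the loop-invariant
        // field accesses (data, scale) are factored out into locals.
        public long sumInlined() {
            int[] d = data;          // hoisted outside the loop
            int s = scale;           // hoisted outside the loop
            long total = 0;
            for (int i = 0; i < d.length; i++) {
                total += d[i] * s;   // method call overhead removed
            }
            return total;
        }
    }

The second version removes the per-iteration call overhead and, by factoring the field accesses out of the loop, illustrates the extra speedup available when parts of a complex method can be executed outside the loop.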

3.5.3 Performance Effects From Runtime Options