return 55; }
}
No reference is left to class A, and no if statement is left. The consequence of this feature is to allow conditional compilation. Other classes can set a DEBUG constant in their own class the same way, or they can use a shared constant value, as class B used A.DEBUG in the earlier definition.
A problem is frequently encountered with this kind of code. The constant value is set when the class with the constant, say class A, is compiled. Any other class referring to class A's constant takes the value that is currently set when that class is being compiled, and does not reset the value if A is recompiled. So you can have the situation where A is compiled with A.DEBUG set to false, then B is compiled and the compiler inlines A.DEBUG as false, possibly cutting dead code branches. Then if A is recompiled to set A.DEBUG to true, this does not affect class B; the compiled class B still has the value false inlined, and any dead code branches stay eliminated until class B is recompiled. You should be aware of this possible problem if you compile your classes in more than one compilation pass.
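The pitfall can be sketched like this (the class bodies here only approximate the earlier definitions, which are not repeated in this section):
class A {
  //The value is fixed at the time class A is compiled.
  static final boolean DEBUG = false;
}

class B {
  int foo() {
    //When B is compiled, the compiler inlines A.DEBUG as false and can
    //eliminate this branch as dead code. Recompiling A with DEBUG set to
    //true changes nothing in B until B itself is recompiled.
    if (A.DEBUG)
      System.out.println("In B.foo()");
    return 55;
  }
}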
You should use this pattern for debug and trace statements, and for assertion preconditions, postconditions, and invariants. There is more detail on this technique in Section 6.1.4.
3.5.2 Optimizations Performed When Using the -O Option
The only standard compile-time option that can improve performance with the JDK compiler is the -O option. Note that -O (for "optimize") is a common option for compilers, and further optimizing options for other compilers often take the form -O1, -O2, etc. You should always check your compiler's documentation to find out what other options are available and what they do. Some compilers allow you to choose between optimizing the compiled code for speed and minimizing its size; there is often a tradeoff between these two aspects.
The standard -O option does not currently apply a variety of optimizations in the Sun JDK (up to JDK 1.2); in future versions it may do more. Currently, the option makes the compiler eliminate optional tables in the .class files, such as the line number and local variable tables. This gives only a small performance improvement by making class files smaller and therefore quicker to load. You should definitely use this option if your class files are sent across a network.
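For example, with the Sun JDK compiler (the class name here is purely illustrative):
  javac -O MyClass.java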
But the main performance improvement from using the -O option comes from the compiler inlining methods. When using the -O option, the compiler considers inlining methods defined with any of the following modifiers: private, static, or final. Some methods, such as those defined as synchronized, are never inlined. If a method can be inlined, the compiler decides whether or not to inline it depending on its own unpublished considerations. These considerations seem mainly to be the simplicity of the method: in JDK 1.2 the compiler inlines only fairly simple methods. For example, one-line methods with no side effects, such as accessing or updating a variable, are invariably inlined. Methods that return just a constant are also inlined. Multiline methods are inlined if the compiler determines they are simple enough (e.g., a System.out.println(blah) followed by a return statement would get inlined).
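To make this concrete, here is a sketch (the class and method names are purely illustrative) of the kinds of methods the compiler will and will not consider for inlining under -O:
class Counter {
  private int count;

  //Simple one-liners with private, static, or final modifiers are the
  //typical inlining candidates.
  private int getCount() {return count;}   //private accessor: candidate
  static int maxValue() {return 100;}      //static, returns a constant: candidate
  final void increment() {count++;}        //final, updates a variable: candidate

  //Methods defined as synchronized are never inlined.
  synchronized void reset() {count = 0;}
}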
Why There Are Limits on Inlining
The compiler can inline only those methods that can be statically bound at compile time.
To see why, consider the following example of class A and its subclass B, with two methods defined, foo1 and foo2. The foo2 method is overridden in the subclass:
class A {
  public int foo1() {return foo2();}
  public int foo2() {return 5;}
}
public class B extends A {
  public int foo2() {return 10;}
}
If A.foo2() is inlined into A.foo1(), new B().foo1() incorrectly returns 5 instead of 10, because A is compiled incorrectly as if it read:
class A {
  public int foo1() {return 5;}
  public int foo2() {return 5;}
}
Any method that can be overridden at runtime cannot be validly inlined (it is a potential bug if it is). The Java specification states that final methods can be non-final at runtime; i.e., you can compile a set of classes with one class having a final method, but later recompile that class without the method declared final (thus allowing subclasses to override it), and the other classes must still run correctly. For this reason, not all final methods can be identified as statically bound at compile time, so not all final methods can be inlined. Some earlier compiler versions incorrectly inlined some final methods, and I have seen serious bugs caused by this.
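A sketch of that situation, using hypothetical class names:
//As originally compiled, C.answer() was declared final:
//    class C {final int answer() {return 1;}}
//and D was compiled against that version of C:
class D {
  int call(C c) {return c.answer();}
}
//C has since been recompiled with final removed, and subclassed:
class C {int answer() {return 1;}}
class E extends C {int answer() {return 2;}}
//new D().call(new E()) must return 2, so c.answer() could not validly
//have been inlined into D.call(), even though answer() was final when
//D was compiled.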
Choosing simple methods to inline does have a rationale behind it. The larger the method being inlined, the more the code gets bloated with copies of the same code being inserted in many places.
This has runtime costs in extra code being loaded and extra space taken by the runtime system. A JIT VM would also have the extra cost of having to compile more code. At some point, there is a
decrease in performance from inlining too much code. In addition, some methods have side effects that can make them quite difficult to inline correctly.
The compiler applies its methodology for selecting methods to inline, irrespective of whether the target method is in a bottleneck: this is a machine-gun strategy of many little optimizations in the
hope that some inline calls may improve the bottlenecks. A performance tuner applying inlining works the other way around, first finding the bottlenecks, then selectively inlining methods inside
bottlenecks. This latter strategy can result in good speedups, especially in loop bottlenecks. This is because a loop can be speeded up significantly by removing the overhead of a repeated method call.
If the method to be inlined is complex, you can often factor out parts of the method so that those parts can be executed outside the loop, gaining even more speedup.
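As a rough sketch of inlining by hand inside a loop bottleneck (the class, fields, and sizes here are purely illustrative), the method body is copied into the loop and its loop-invariant parts are hoisted out:
class LoopBottleneck {
  private double[] data = new double[100000];
  private double scale = 1.5;

  private double adjusted(int i) {return data[i] * scale;}

  //Original loop: one method call per iteration.
  double sumWithCall() {
    double total = 0;
    for (int i = 0; i < data.length; i++)
      total += adjusted(i);
    return total;
  }

  //Hand-inlined version: the method body is copied into the loop, and
  //the loop-invariant field accesses are factored out of the loop.
  double sumInlined() {
    double[] d = data;
    double s = scale;
    double total = 0;
    for (int i = 0; i < d.length; i++)
      total += d[i] * s;
    return total;
  }
}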
I have not found any public document that specifies the actual decision-making process that determines whether or not a method is inlined. The only reference given is to Section 13.4.21 of The Java Language Specification, which specifies only that binary compatibility with preexisting binaries must be maintained. It does specify that the package must be guaranteed to be kept together for the compiler to allow inlining across classes. The specification also states that the final keyword does not imply that a method can be inlined, as the runtime system may have a differently implemented method.
Prior to JDK 1.2, the -O option used with the Sun compiler did inline methods across classes, even if they were not compiled in the same compilation pass. This behavior led to bugs.[8] From JDK 1.2, the -O option no longer inlines methods across classes, even if they are compiled in the same compilation pass.
[8] Primarily, methods that accessed private or protected variables were incorrectly inlined into other classes, leading to runtime authorization exceptions.
Unfortunately, there is no way to specify directly which methods should be inlined, rather than relying on the compiler's internal workings. I guess that in the future, some compiler vendors will provide a mechanism that supports specifying which methods to inline, along with other preprocessor options. In the meantime, you can implement a preprocessor (or use an existing one) if you require tighter control. Opportunities for inlining often occur inside bottlenecks, especially in loops, as discussed previously. Selective inlining by hand can give an order-of-magnitude speedup for some bottlenecks, and no speedup at all in others.
The speedup obtained purely from inlining is usually only a few percent: 5% is fairly common. Some optimizing compilers are very aggressive about inlining code. They apply techniques such as analyzing the entire program to alter and eliminate method calls in order to identify methods that can be coerced into being statically bound. Then these identified methods are inlined as much as possible according to the compiler's analysis. This technique has been shown to give a 50% speedup to some applications. Another inlining technique is that used by the HotSpot runtime, which aggressively inlines code after a bottleneck has been identified.
3.5.3 Performance Effects From Runtime Options