[Table fragment; the header row is incomplete in the source. VM columns recovered: 1.2 no JIT, 1.3, HotSpot 1.0, 1.1.6. "Increase in time" values recovered, in order: None, 10, 20, ~5, None.]

However, the cost of an object type cast is not constant: it depends on the depth of the hierarchy and on whether the casting type is an interface or a class. Interfaces are generally more expensive to use in casting, and the further back the casting type sits in the hierarchy (and in the ordering of interfaces in the class definition), the longer the cast takes to execute. Remember, though: never change the design of the application for minor performance gains.

It is best to avoid casts whenever possible, for example by creating and using type-specific collection classes instead of generic collection classes. Rather than using a standard List to store a list of Strings, you gain better performance by creating and using a StringList class. You should always try to type the variable as precisely as possible. In Chapter 9, you can see that by rewriting a sort implementation to eliminate casts, the sorting time can be halved.

If a variable needs casting several times, cast once and save the object into a temporary variable of the cast type. Use that temporary instead of repeatedly casting; avoid the following kind of code:

    if (obj instanceof Something)
        return ((Something) obj).x + ((Something) obj).y + ((Something) obj).z;
    ...

Instead, use a temporary: [4]

    if (obj instanceof Something)
    {
        Something something = (Something) obj;
        return something.x + something.y + something.z;
    }
    ...

[4] This is a special case of common subexpression elimination. See Section 3.4.2.14.

The revised code is also more readable. In tight loops, you may need to evaluate the cost of repeatedly assigning values to a temporary variable (see Chapter 7).

6.3 Variables
Local (temporary) variables and method-argument variables are the fastest variables to access and update. Local variables remain on the stack, so they can be manipulated directly; the manipulation of local variables depends on both the VM and the underlying machine implementation. Heap variables (static and instance variables) are manipulated in heap memory through the Java VM-assigned bytecodes that apply to these variables.

There are special bytecodes for accessing the first four local variables and parameters on a method stack. Arguments are counted first; then, if there are fewer than four passed arguments, local variables are counted. For nonstatic methods, this always takes the first slot. longs and doubles each take two slots. Theoretically, this means that methods with no more than three parameters and local variables combined (four for static methods) should be slightly faster than equivalent methods with a larger number of parameters and local variables. It also means that any variables allocated the special bytecodes should be slightly faster to manipulate. In practice, I have found that any effect is small or negligible, and it is not worth the effort involved to limit the number of arguments and variables.

Instance and static variables can be up to an order of magnitude slower to operate on when compared to method arguments and local variables.
You can see this clearly with a simple test comparing local and static loop counters:

    package tuning.exception;

    public class VariableTest2
    {
      static int cntr;

      public static void main(String[] args)
      {
        int REPEAT = 500000000;

        // Loop using a local variable as the counter
        int tot = 0;
        long time = System.currentTimeMillis();
        for (int i = -REPEAT; i < 0; i++)
          tot += i;
        time = System.currentTimeMillis() - time;
        System.out.println("Loop local took " + time);

        // Same loop using a static variable as the counter
        tot = 0;
        time = System.currentTimeMillis();
        for (cntr = -REPEAT; cntr < 0; cntr++)
          tot += cntr;
        time = System.currentTimeMillis() - time;
        System.out.println("Loop static took " + time);
      }
    }

Running this test results in the second loop taking several times longer than the first loop (see Table 6-4).

Table 6-4, The Cost of Nonlocal Loop Variables Relative to Local Variables
Times Relative to Loop Local Variables, by VM: 1.2, 1.2 no JIT, 1.3
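One way to act on this difference is to do the repeated work on a local variable and write the result back to the heap variable once, after the loop. The following is a minimal sketch of that idea, not code from the original text: the class name LocalCopyDemo, the field total, and both method names are hypothetical, chosen only for illustration. Whether the change pays off depends on the VM, so it is worth measuring before adopting it.

```java
class LocalCopyDemo {
    // Hypothetical static field, standing in for any heap variable
    // (static or instance) that is updated inside a loop.
    static int total;

    // Reads and writes the static field on every iteration.
    static void sumDirect(int n) {
        total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
    }

    // Accumulates in a local variable, then writes the static field once.
    static void sumViaLocal(int n) {
        int localTotal = 0;
        for (int i = 0; i < n; i++) {
            localTotal += i;
        }
        total = localTotal;
    }

    public static void main(String[] args) {
        sumDirect(1000);
        int direct = total;
        sumViaLocal(1000);
        // Both approaches compute the same value; only the number of
        // heap accesses inside the loop differs.
        System.out.println(direct == total); // prints "true"
    }
}
```

Note that the two methods are not interchangeable in every situation: with the local copy, other threads (or an exception thrown partway through the loop) never see the intermediate values of total.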