Use simulations and benchmarks Consider the total work done and the design overhead
13.4.5.2 Consider the relative costs of different types of accesses and updates
Accesses and updates to system memory are always going to be significantly faster than accesses and updates to other memory media. For example, a read from a local disk can be a thousand times slower than a memory access, and disk writes are typically half as fast as disk reads. Random access to a disk is significantly slower than sequential access. Recognizing these variations may steer your design toward alternatives you might otherwise not have considered. For example, one application server that supports a shared persistent cache (the GemStone application server, http://www.gemstone.com) redesigned its persistent cache update mechanism to take account of these different update times. The original architecture performed transactional updates to objects by writing the changes to the objects on the disk, which required random disk access and updates. The modified architecture wrote all changes to shared memory as well as to a sequential journaling log file for crash recovery. A separate asynchronous process handled flushing the changes from shared memory to the objects stored on disk. Because the time spent on disk navigation to the various objects was significant, this change in architecture improved performance by completely removing that bottleneck from the transaction.
13.4.5.3 Use simulations and benchmarks
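As a small example of such a benchmark, the following sketch times the two disk-update patterns contrasted in 13.4.5.2: scattered in-place writes, as in the original architecture, and sequential appends, as in a journaling log. The class name, record size, and record count are illustrative assumptions and are not taken from the GemStone product. On a small file the operating system's cache can absorb much of the difference, so treat the numbers as a starting point and scale the sizes toward your real data.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

public class DiskAccessBenchmark {
    static final int RECORD_SIZE = 512; // assumed size of one stored object, in bytes
    static final int RECORDS = 2_000;   // assumed number of updates to perform

    /** Writes each record at a scattered offset, like in-place object updates. */
    static long timeRandomWrites(Path file, byte[] record, long seed) throws IOException {
        Random random = new Random(seed);
        long start = System.nanoTime();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength((long) RECORDS * RECORD_SIZE);
            for (int i = 0; i < RECORDS; i++) {
                raf.seek((long) random.nextInt(RECORDS) * RECORD_SIZE);
                raf.write(record);
            }
            raf.getFD().sync(); // ask the OS to push the data to the device
        }
        return System.nanoTime() - start;
    }

    /** Appends each record one after another, like a journaling log. */
    static long timeSequentialWrites(Path file, byte[] record) throws IOException {
        long start = System.nanoTime();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            for (int i = 0; i < RECORDS; i++) {
                raf.write(record);
            }
            raf.getFD().sync();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        byte[] record = new byte[RECORD_SIZE];
        Path randomFile = Files.createTempFile("random", ".dat");
        Path journalFile = Files.createTempFile("journal", ".log");
        try {
            long randomNanos = timeRandomWrites(randomFile, record, 42L);
            long sequentialNanos = timeSequentialWrites(journalFile, record);
            System.out.println("random writes:     " + randomNanos / 1_000_000 + " ms");
            System.out.println("sequential writes: " + sequentialNanos / 1_000_000 + " ms");
        } finally {
            Files.deleteIfExists(randomFile);
            Files.deleteIfExists(journalFile);
        }
    }
}
```

Run the measurement several times and on the hardware you expect to deploy on; a single run on a development machine says little about a loaded production disk.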
Ideally, you have a detailed simulation of your application that allows you to predict the performance under any set of conditions. More usually, you have a vague simulation that has some characteristics similar to your intended application. It is important to keep striving for the full detailed simulation to be able to predict the performance of the application. But since your resources are limited, you need to project measurements to come as close as possible to your target application. You should try to include loads and delays in your simulation that come close to the expected load of the application. Try to acquire the resources your finished application will use, even if those resources are not used in the simulation. For example, spawn as many threads as you expect the application to use, even if the threads do little more than sleep restlessly. [14] Graphing the results from increasing various application-specific parameters allows you to predict the performance of the application under a variety of conditions. It is worth checking vendor or standard benchmarks if you need some really basic statistics, but bear in mind that those benchmarks seldom have much relevance to a particular application.
[14] Sleeping restlessly is calling Thread.sleep in a loop, with the sleep time set to some value that requires many loop iterations before the loop terminates. Other activities can be run intermittently in the loop to simulate work.
13.4.5.4 Consider the total work done and the design overhead
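The restless sleeping described in footnote [14] of 13.4.5.3 can be sketched as follows. This is only a minimal illustration: the class and method names, the thread counts, and the sleep durations are assumptions chosen for the example, and the counter increment stands in for whatever intermittent work your simulation needs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class RestlessSleepSimulation {
    /** Counts units of simulated work done across all threads. */
    static final AtomicLong workUnits = new AtomicLong();

    /**
     * Sleeps for roughly totalMillis in short slices of sliceMillis, doing a
     * little simulated work after each slice. Returns the iteration count.
     */
    static int sleepRestlessly(long totalMillis, long sliceMillis)
            throws InterruptedException {
        int iterations = 0;
        long deadline = System.currentTimeMillis() + totalMillis;
        while (System.currentTimeMillis() < deadline) {
            Thread.sleep(sliceMillis);
            workUnits.incrementAndGet(); // placeholder for intermittent work
            iterations++;
        }
        return iterations;
    }

    /** Starts threadCount restless sleepers and returns the elapsed milliseconds. */
    static long runSimulation(int threadCount, long totalMillis, long sliceMillis)
            throws InterruptedException {
        List<Thread> threads = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> {
                try {
                    sleepRestlessly(totalMillis, sliceMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                }
            }, "sim-" + i);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        // Increase one parameter (here the thread count) and record the result,
        // so that the measurements can be graphed afterward.
        for (int threads = 10; threads <= 40; threads *= 2) {
            long elapsed = runSimulation(threads, 200, 10);
            System.out.println(threads + " threads: " + elapsed + " ms elapsed");
        }
    }
}
```

Printing one line per parameter value keeps the output easy to feed into a spreadsheet or plotting tool when you graph the results.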
Try stripping your design to the bare essentials or going back to the specification. Consider how to create a special-purpose implementation that handles the specification for a specific set of inputs. This can give you an estimate of the actual work your application will do. Now consider your design and look at the overheads added by the design for each piece of functionality. This provides a good way to focus on the overheads and determine if they are excessive.
13.4.5.5 Focus on shared resources