
standard benchmarks if you need some really basic statistics, but bear in mind that those benchmarks seldom have much relevance to a particular application.

13.4.5.4 Consider the total work done and the design overhead

Try stripping your design to the bare essentials or going back to the specification. Consider how to create a special-purpose implementation that handles the specification for a specific set of inputs. This can give you an estimate of the actual work your application will do. Now consider your design and look at the overheads added by the design for each piece of functionality. This provides a good way to focus on the overheads and determine if they are excessive.

13.4.5.5 Focus on shared resources

Shared resources almost always cause performance problems if they have not been designed to optimize performance. Ensure that any simulation correctly simulates the sharing of resources, and use prediction analyses such as those in Section 13.4.1.3 earlier in this chapter to predict the behavior of multiple objects using shared resources.
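As a concrete illustration of why shared resources need explicit attention, the following sketch (all class and method names are my own, not from the text) times the same total amount of work done through a single synchronized counter, first by one thread and then by several contending threads:

```java
// Minimal sketch: measure how contention on one shared resource
// (a single synchronized counter) behaves as the thread count grows.
public class SharedResourceProbe {
    private long counter;                          // the shared resource

    synchronized void increment() { counter++; }   // every caller serializes here

    static long timeThreads(int nThreads, int perThread) throws InterruptedException {
        SharedResourceProbe shared = new SharedResourceProbe();
        Thread[] threads = new Thread[nThreads];
        long start = System.nanoTime();
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) shared.increment();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        // Same total work, different degrees of sharing:
        long oneThread   = timeThreads(1, 400_000);
        long fourThreads = timeThreads(4, 100_000);
        System.out.println("1 thread:  " + oneThread + " ns");
        System.out.println("4 threads: " + fourThreads + " ns");
    }
}
```

A simulation of your design should reproduce exactly this kind of serialization point, so that predictions account for the queueing it causes.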

13.4.5.6 Predict the effects of parallelism

Consider what happens when your design is spread over multiple threads, processes, CPUs, machines, etc. This analysis can be quite difficult without a simulation and test bed, but it can help to identify whether the design limits the use of parallelism.
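One inexpensive way to start such an analysis, short of a full test bed, is to time a representative unit of work run serially and then spread over a small thread pool. The sketch below is illustrative (the work method is a stand-in, not from the text):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: run the same set of independent tasks serially and in a
// fixed-size thread pool, to see whether the design permits a speedup.
public class ParallelismCheck {
    static double work(int n) {                 // stand-in for one unit of application work
        double s = 0;
        for (int i = 1; i <= n; i++) s += Math.sqrt(i);
        return s;
    }

    public static void main(String[] args) throws Exception {
        int tasks = 8, size = 500_000;

        long t0 = System.nanoTime();
        for (int i = 0; i < tasks; i++) work(size);
        long serial = System.nanoTime() - t0;

        ExecutorService pool = Executors.newFixedThreadPool(4);
        long t1 = System.nanoTime();
        List<Future<Double>> futures = new ArrayList<>();
        for (int i = 0; i < tasks; i++) futures.add(pool.submit(() -> work(size)));
        for (Future<Double> f : futures) f.get();
        long parallel = System.nanoTime() - t1;
        pool.shutdown();

        System.out.println("serial:   " + serial + " ns");
        System.out.println("parallel: " + parallel + " ns");
    }
}
```

If the parallel time is no better than the serial time, the design (or the task granularity chosen) is limiting the use of parallelism.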

13.4.5.7 Assess the costs of data conversions

Many applications convert data between different types, e.g., between strings and numbers. From your design, you should be able to determine the frequency and types of data conversions, and it is fairly simple to create small tests that determine the costs of the particular conversions you are using. Don't forget to include any concurrency or use of shared resources in the tests. Remember that external transfer of objects or data normally includes some data conversions. The cost of data conversion may be significant enough to direct you to alter your design.
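A small test of this kind can be just a timed loop over the specific conversion your design uses. Here is a sketch for one case, String-to-int via Integer.parseInt (the class and method names are my own):

```java
// Small timing test for one specific conversion: String -> int.
public class ConversionCost {
    static long timeParses(String[] data, int passes) {
        long sum = 0;
        long start = System.nanoTime();
        for (int p = 0; p < passes; p++)
            for (String s : data) sum += Integer.parseInt(s);
        long elapsed = System.nanoTime() - start;
        if (sum < 0) System.out.println(sum);   // keep sum live so the work isn't optimized away
        return elapsed;
    }

    public static void main(String[] args) {
        String[] data = new String[10_000];
        for (int i = 0; i < data.length; i++) data[i] = Integer.toString(i);
        System.out.println("1,000,000 parses took " + timeParses(data, 100) + " ns");
    }
}
```

Multiply the measured per-conversion cost by the conversion frequency your design implies to estimate the total overhead.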

13.4.5.8 Determine whether batch processing is faster

Some repeated tasks can be processed as a batch instead of one at a time. Batch processing can take advantage of a number of efficiencies, such as accessing and creating some objects just once, eliminating some tests for shared resources, processing tasks in optimal order, avoiding repeated searches, etc. If any particular set of tasks could be processed in batch mode, consider the effect this would have on your application and how much faster the processing could be. The simplest conceptual example is that of adding characters one by one to a StringBuffer, as opposed to using a char array to add all the characters together. Adding the characters using a char array is much faster for any significant number of characters.
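The StringBuffer example can be sketched directly (the helper method names here are mine): per-character appends pay a method-call and bounds-check cost on every character, while StringBuffer.append(char[]) copies the whole array in one bulk operation.

```java
// The chapter's conceptual example: appending characters one at a
// time to a StringBuffer versus appending a whole char array at once.
public class BatchAppend {
    static long timePerCharAppend(char[] chars) {
        long t0 = System.nanoTime();
        StringBuffer buf = new StringBuffer();
        for (char c : chars) buf.append(c);    // one method call per character
        return System.nanoTime() - t0;
    }

    static long timeBulkAppend(char[] chars) {
        long t0 = System.nanoTime();
        StringBuffer buf = new StringBuffer();
        buf.append(chars);                     // single bulk array copy
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        char[] chars = new char[100_000];
        java.util.Arrays.fill(chars, 'x');
        System.out.println("per-char: " + timePerCharAppend(chars) + " ns");
        System.out.println("bulk:     " + timeBulkAppend(chars) + " ns");
    }
}
```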

13.5 Tuning After Deployment

Tuning does not necessarily end at the development stage. For many applications, such as agent applications, services, servlets and servers, multiuser applications, and enterprise systems, application performance needs constant monitoring after deployment to ensure that no degradation takes place. In this section, I discuss tuning the deployed application. This is mainly relevant to enterprise systems that are being administered. Shrink-wrapped or similar software is normally tuned the same way as before deployment, using standard profiling tools.

Monitoring the application is the primary tuning activity after deployment. The application should be built with hooks that enable tools to connect to it and gather statistics and response times. The application should be constantly monitored, and all performance logs retained. Monitoring should record as many parameters as possible throughout the system, though clearly you want to avoid monitoring so much that the performance of the running application is significantly compromised. Of course, almost any act of measuring a system affects its performance. But the advantage of having performance logs normally pays off enormously, and a few percent decrease in performance should be acceptable.

Individual records in the performance logs should include at least the following six categories:

• Time, including offset time from a reference server
• User identifier
• Transaction identifier
• Application name, type, class, or group
• Software component or subsystem
• Hardware resource

A standard set of performance logs should be used to give a background system measurement and kept as a reference set. Other logs can then be compared against that standard. Periodically, the standard should be regenerated, as most enterprise applications change their performance characteristics over time.
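As a sketch of what one such log record might look like, the class below carries the six categories just listed; the field names, the pipe-delimited line format, and the class itself are my own assumptions, not a prescribed API.

```java
import java.time.Instant;

// Hypothetical shape for one performance-log record holding the six
// categories above. All names here are illustrative assumptions.
public class PerfLogRecord {
    final Instant time;
    final long clockOffsetMs;      // offset from a reference server
    final String userId;
    final String transactionId;
    final String applicationGroup; // application name, type, class, or group
    final String component;        // software component or subsystem
    final String hardwareResource;

    PerfLogRecord(Instant time, long clockOffsetMs, String userId,
                  String transactionId, String applicationGroup,
                  String component, String hardwareResource) {
        this.time = time;
        this.clockOffsetMs = clockOffsetMs;
        this.userId = userId;
        this.transactionId = transactionId;
        this.applicationGroup = applicationGroup;
        this.component = component;
        this.hardwareResource = hardwareResource;
    }

    // One pipe-delimited line per record: easy to retain and to compare later.
    String toLogLine() {
        return String.join("|", time.toString(), Long.toString(clockOffsetMs),
                userId, transactionId, applicationGroup, component, hardwareResource);
    }

    public static void main(String[] args) {
        PerfLogRecord r = new PerfLogRecord(Instant.now(), 3, "user42",
                "txn-001", "billing", "order-service", "cpu0");
        System.out.println(r.toLogLine());
    }
}
```

A flat, delimited format like this keeps the retained logs cheap to store and simple to diff against the reference set.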
Ideally, the standard logs can be automatically compared against the current logs, so that any significant change in behavior is automatically identified and causes an alert to be sent to the administrators. Trends away from the standard should also trigger a notification; sometimes performance degrades slowly but consistently because of a gradually depleting resource.

Administrators should note every single change to the system: every patch, every upgrade, every configuration change. These changes are the source of most performance problems in production. Patches are cheaper short-term fixes than upgrades, but they usually add to the complexity of the application and increase maintenance costs. Upgrades and rereleases are more expensive in the short term, but cheaper overall.

Administrators should also listen to the application users. Users are the most sensitive barometer of application performance. However, you should double-check users' assertions. A user may be wrong, or may have hit a known system problem or a temporary administrative shutdown. Measure the performance yourself. Repeat the measurements several times, and take averages and variations. Ensure that caching effects do not skew measurements of a reported problem.

When looking for reasons why performance may have changed, consider any recent changes: an increase in the number of users, other applications added to the system, code changes on the client or server, hardware changes, and so on. In addition to user response-time measurements, look at where the distributed code is executing, what volumes of data are being used, and where the code is spending most of its time. Many factors can easily give misleading or temporarily different measurements: distributed garbage collection may have cut in, system clocks may become unsynchronized, background processes may be triggered, and relative processor power may change, causing obscure effects.
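The automatic comparison described above can be as simple as flagging any metric whose current value drifts beyond a tolerance from its baseline value. The sketch below illustrates the idea; the metric names and the 25% threshold are assumptions for the example, not recommendations from the text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: compare current metrics against the standard (baseline) logs
// and report any metric that drifted beyond the given tolerance.
public class BaselineCompare {
    static List<String> alerts(Map<String, Double> baseline,
                               Map<String, Double> current,
                               double tolerance) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Double> e : baseline.entrySet()) {
            Double now = current.get(e.getKey());
            if (now == null) continue;                      // metric absent from current logs
            double change = (now - e.getValue()) / e.getValue();
            if (Math.abs(change) > tolerance)
                out.add(e.getKey() + " changed " + Math.round(change * 100) + "%");
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> base = new HashMap<>();
        base.put("loginResponseMs", 120.0);
        base.put("queryResponseMs", 45.0);
        Map<String, Double> now = new HashMap<>();
        now.put("loginResponseMs", 180.0);   // 50% slower: should alert
        now.put("queryResponseMs", 46.0);    // within tolerance
        System.out.println(alerts(base, now, 0.25));
    }
}
```

Detecting slow trends, as opposed to sudden drift, would additionally require comparing a series of recent values against the baseline rather than a single snapshot.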
Consider whether anyone else is using the processors and, if so, what they are doing and why. You need to differentiate between:

• Occasional sudden slowness, e.g., from background processes starting up
• General slowness, perhaps reflecting that the application was not tuned for the current load, or that the systems or networks are saturated
• A sudden slowdown that continues, often the result of a change to the system

Each of these characteristic changes in performance indicates a different set of problems.

13.6 More Factors That Affect Performance