
This situation is often the case for distributed applications. A well-known example is again found in web browsers that display the initial screenful of a page as soon as it is available, without waiting for the whole page to be downloaded. The general case is when you have a long activity that can provide results in a stream, so that the results can be accessed a few at a time. For distributed applications, sending all the data is often what takes a long time; in this case, you can build streaming into the application by sending one screenful of data at a time. Also, bear in mind that when there is a really large amount of data to display, the user often views only some of it and then aborts, so be sure to build in the ability to stop the stream and release its resources at any time.
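The idea of delivering results one screenful at a time, with the ability to stop early, can be sketched as follows. This is only an illustration under stated assumptions: the class name ScreenfulStream, the use of an in-memory Iterator to stand in for a slow remote source, and the page size are all hypothetical and do not come from any particular library.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch: delivers a large result set one "screenful" at a time. */
class ScreenfulStream implements Iterator<List<String>>, AutoCloseable {
    private final Iterator<String> source; // stands in for a slow remote source (assumption)
    private final int pageSize;            // number of items in one screenful
    private boolean closed = false;

    ScreenfulStream(Iterator<String> source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.source = source;
        this.pageSize = pageSize;
    }

    /** True while the stream is open and the source still has data. */
    @Override
    public boolean hasNext() {
        return !closed && source.hasNext();
    }

    /** Returns the next screenful; the last one may be shorter than pageSize. */
    @Override
    public List<String> next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        List<String> page = new ArrayList<>(pageSize);
        while (page.size() < pageSize && source.hasNext()) {
            page.add(source.next());
        }
        return page;
    }

    /** Stops the stream; a real implementation would also release sockets, cursors, etc. */
    @Override
    public void close() {
        closed = true;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            rows.add("row " + i);
        }
        try (ScreenfulStream stream = new ScreenfulStream(rows.iterator(), 4)) {
            // The first screenful can be shown without waiting for the rest.
            System.out.println(stream.next());
            // The user aborts here; try-with-resources closes the stream.
        }
    }
}
```

Implementing AutoCloseable is one possible design choice: it lets the caller abandon the stream at any point and rely on try-with-resources to run the cleanup code.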

1.5.3 Caching to Appear Quicker

This section briefly covers the general principles of caching. Caching is an optimization technique I return to in several different sections of this book, when it is appropriate to the problem under discussion. For example, in the area of disk access, there are several caches that apply: from the lowest-level hardware cache up through the operating-system disk read and write caches, cached filesystems, and file reading and writing classes that provide buffered IO. Some caches cannot be tuned at all; others are tuneable at the operating-system level or in Java. Where it is possible for a developer to take advantage of or tune a particular cache, I provide suggestions and approaches that cover the caching technique appropriate to that area of the application.

In some cases where caches are not directly tuneable, it is still worth knowing the effect of using the cache in different ways and how this can affect performance. For example, disk hardware caches almost always apply a read-ahead algorithm: the cache is filled with the next block of data after the one just read. This means that reading backward through a file in chunks is not as fast as reading forward through the file.

Caches are effective because it is expensive to move data from one place to another or to calculate results. If you need to do this more than once to the same piece of data, it is best to hang on to it the first time and refer to the local copy in the future. This applies, for example, to remote access of files such as browser downloads. The browser caches the downloaded file locally on disk, so that a subsequent access does not have to reach across the network to reread the file, making it much quicker to access a second time.

It also applies, in a different way, to reading bytes from the disk. Here, the cost to the operating system of reading one byte is the same as reading a page (usually 4 or 8 KB), as data is read into memory a page at a time by the operating system. If you are going to read more than one byte from a particular disk area, it is better to read in a whole page (or all the data if it fits on one page) and access bytes through your local copy of the data.

General aspects of caching are covered in more detail in Section 11.4. Caching is an important performance-tuning technique that trades space for time, and it should be used whenever extra memory space is available to the application.
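The principle of hanging on to a result the first time and referring to the local copy afterward can be sketched with a small bounded cache. This is a minimal illustration, not a recommended implementation: the class name ResultCache, the loader function, and the entry limit are hypothetical. It relies on the standard LinkedHashMap access-order mode and its removeEldestEntry hook to discard the least recently used entry, which is one simple way to cap the space traded for time. The sketch is not thread-safe and treats a null result as "not cached".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: a bounded cache that keeps the most recently used results. */
class ResultCache<K, V> {
    private final Function<K, V> loader; // the expensive fetch or calculation (assumption)
    private final Map<K, V> entries;
    private int loads = 0;               // how many times the expensive path actually ran

    ResultCache(Function<K, V> loader, final int maxEntries) {
        this.loader = loader;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries; // evict once the space budget is exceeded
            }
        };
    }

    /** Returns the cached value, running the loader only on a miss. */
    V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            entries.put(key, value);
        }
        return value;
    }

    int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        ResultCache<Integer, Long> squares = new ResultCache<>(n -> (long) n * n, 2);
        squares.get(12);
        squares.get(12); // served from the local copy; the loader is not called again
        System.out.println("loader calls: " + squares.loadCount());
    }
}
```

The loadCount method exists only to make the effect visible: repeated requests for the same key should not increase it until the entry has been evicted.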

1.6 Starting to Tune

Before diving into the actual tuning, there are a number of considerations that will make your tuning phase run more smoothly and result in clearly achieved objectives.

1.6.1 User Agreements

Any application must meet the needs and expectations of its users, and a large part of those needs and expectations is performance. Before you start tuning, it is crucial to identify the target response times for as much of the system as possible. At the outset, you should agree with your users (directly if you have access to them, or otherwise through representative user profiles, market information, etc.) what the performance of the application is expected to be.

The performance should be specified for as many aspects of the system as possible, including:

• Multiuser response times depending on the number of users (if applicable)
• Systemwide throughput (e.g., number of transactions per minute for the system as a whole, or response times on a saturated network, again if applicable)
• The maximum number of users, data, files, file sizes, objects, etc., the application supports
• Any acceptable and expected degradation in performance between minimal, average, and extreme values of supported resources

Agree on target values and acceptable variances with the customer or potential users of the application (or whoever is responsible for performance) before starting to tune. Otherwise, you will not know where to target your effort, how far you need to go, whether particular performance targets are achievable at all, and how much tuning effort those targets may require. But most importantly, without agreed targets, whatever you achieve tends to become the starting point.

The following scenario is not unusual: a manager sees horrendous performance, perhaps a function that was expected to be quick, but takes 100 seconds. His immediate response is, "Good grief, I expected this to take no more than 10 seconds." Then, after a quick round of tuning that identifies and removes a huge bottleneck, function time is down to 10 seconds. The manager's response is now, "Ah, that's more reasonable, but of course I actually meant to specify 3 seconds. I just never believed you could get down so far after seeing it take 100 seconds. Now you can start tuning."

You do not want your initial achievement to go unrecognized (especially if money depends on it), and it is better to know at the outset what you need to reach. Agreeing on targets before tuning makes everything clear to everyone.

1.6.2 Setting Benchmarks