Object Design Design and Architecture

• Operating-system loads

Tuning any component other than the current bottleneck gives no improvement. Peak performance of each component is rarely achieved. You need to assume average rates of performance from the underlying resource and expect performance based on those average rates.

Distributed applications tend to exaggerate any performance characteristics. So when performance is bad, the application tends to slow significantly more than in nondistributed applications. The distributed design aspects should emphasize asynchronous and concurrent operations. Typical items to include in the design are:

• Queues
• Asynchronous communications and activities
• Parallelizable activities
• Minimized serialization points
• Balanced workloads across multiple servers
• Redundant servers and automatic switching capabilities
• Activities that can be configured at runtime to run in different locations
• Short transactions

The key to good performance in a distributed application is to minimize the amount of communication necessary. Performance problems tend to be caused by too many messages flying back and forth between distributed components. Bell's rule of networking applies: "Money can buy bandwidth, but latency is forever." [13]

[13] Thomas E. Bell, "Performance of distributed systems," a paper presented at the ICCM Capacity Management Forum 7, San Francisco, October 1993.

Unfortunately, communication overhead can be incurred by many different parts of a distributed application. There are some general high-level guidelines:

• Allow the application to be partitioned according to the data and processing power. Any particular task should be able to run in several locations, and the location that provides the best performance should be chosen at runtime. Usually the best location for the task is where the data required for the task is stored, as transferring data tends to be a significant overhead.
• Avoid generating distributed garbage. Distributed garbage collection can be a severe overhead on any distributed application.
• Reduce the costs of keeping data synchronized by minimizing the duplication of data.
• Reduce data-transfer costs by duplicating data. This conflicts directly with the previous point, so the two techniques must be balanced to find the optimal data duplication points.
• Cache distributed data wherever possible.
• Use compression to reduce the time taken to transfer large amounts of data.
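As one illustration of the caching guideline above, the following is a minimal sketch of a wrapper that remembers the results of an expensive lookup so that repeated requests do not cross the network again. All names here (RemoteCacheSketch, CachingLookup, remoteCalls) are hypothetical, and the "remote" call is simulated by a local function. The sketch ignores invalidation, so it suits only data that rarely changes or where slightly stale values are acceptable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch only: every name in this class is hypothetical.
class RemoteCacheSketch {

    /** Wraps an expensive (for example, remote) lookup and remembers its results. */
    static final class CachingLookup<K, V> {
        private final Function<K, V> remoteLookup; // stands in for a network call
        private final Map<K, V> cache = new HashMap<>();
        private int remoteCalls = 0;

        CachingLookup(Function<K, V> remoteLookup) {
            this.remoteLookup = remoteLookup;
        }

        /** Returns the cached value, going to the remote source only on a miss. */
        synchronized V get(K key) {
            V value = cache.get(key);
            if (value == null) {           // miss (null results are not cached)
                remoteCalls++;
                value = remoteLookup.apply(key);
                cache.put(key, value);
            }
            return value;
        }

        /** Number of times the remote source was actually consulted. */
        synchronized int remoteCalls() {
            return remoteCalls;
        }
    }

    public static void main(String[] args) {
        // Pretend that computing the length of the key requires a remote call.
        CachingLookup<String, Integer> lookup = new CachingLookup<>(String::length);
        lookup.get("widget");
        lookup.get("widget"); // served from the cache
        lookup.get("gadget");
        System.out.println("remote calls: " + lookup.remoteCalls()); // remote calls: 2
    }
}
```

Three requests result in only two simulated remote calls. In a real system the saving has to be weighed against the cost of keeping the cached copy synchronized, as the guidelines above note.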

13.4.4 Object Design

My advice for object design is to use interfaces and interface-like patterns throughout the code. Although there are slightly higher runtime costs from using interfaces, that cost is well outweighed by the benefits of being able to replace one object implementation with another easily. Using interfaces means you can design with the option to replace any class or component with a faster one. Consider also where the design requires comparison by identity or by equality and where these choices can be made at implementation time.

The JDK classes are not all designed with interfaces. Those JDK classes and other third-party classes that do not have interface definitions should be wrapped by your own classes so that their use can be made more generic. Applications that need to minimize download time, such as applets, may need to avoid the extra overhead that wrapping causes.

Object creation is one significant place where interfaces fall down, since interfaces do not support constructor declarations, and constructors cannot return an object of a different class. To handle object creation in a way similar to interfaces, you should use the factory pattern. The factory design pattern recommends that object creation be centralized in a particular factory method. So rather than calling new Something() when you want to create an instance of the Something class, you call a method such as SomethingFactory.getNewSomething(), which creates and returns a new instance of the Something class. Again, this pattern has performance costs, as there is the overhead of an extra method call for every object creation, but the pattern provides more flexibility when it comes to tuning.

Design for reusable objects: do not unnecessarily throw away objects. The factory design pattern can help, as it supports the recycling of objects. Canonicalize objects where possible (see Section 4.2.4).
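The factory approach described above might be sketched as follows. This is a minimal illustration under stated assumptions, not code from the text: only the names Something and SomethingFactory.getNewSomething come from the discussion, while the interface methods, the SimpleSomething implementation, and the recycle method are hypothetical. Callers see only the interface, so the implementation class can be swapped for a faster one, and the factory can hand back recycled instances instead of always creating new ones.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only: apart from Something and getNewSomething,
// the names below are hypothetical.
class FactorySketch {

    /** Callers depend on this interface, never on a concrete class. */
    interface Something {
        void setValue(int value);
        int getValue();
    }

    /** One possible implementation; it can be replaced without touching callers. */
    private static final class SimpleSomething implements Something {
        private int value;
        public void setValue(int value) { this.value = value; }
        public int getValue() { return value; }
    }

    /** Centralizes creation and supports recycling of discarded instances. */
    static final class SomethingFactory {
        private static final Deque<Something> recycled = new ArrayDeque<>();

        private SomethingFactory() {}

        /** Returns a recycled instance if one is available, otherwise a new one. */
        static synchronized Something getNewSomething() {
            Something s = recycled.poll(); // null when the pool is empty
            return (s != null) ? s : new SimpleSomething();
        }

        /** Hands an instance back to the factory; its state is reset first. */
        static synchronized void recycle(Something s) {
            s.setValue(0);
            recycled.push(s);
        }
    }

    public static void main(String[] args) {
        Something first = SomethingFactory.getNewSomething();
        first.setValue(42);
        SomethingFactory.recycle(first);

        Something second = SomethingFactory.getNewSomething();
        System.out.println(second == first);   // true: the instance was reused
        System.out.println(second.getValue()); // 0: state was reset on recycle
    }
}
```

The extra method call and the synchronization are the costs mentioned above; whether recycling pays off depends on how expensive the objects are to create, so it should be measured rather than assumed.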
Keep in mind that stateless objects can usually be safely shared, so try to move to stateless objects where appropriate. Using stateless objects is a good way to support changing algorithms easily, by implementing different algorithms in particular types of objects. For example, see Section 9.2, where different sorting algorithms are implemented in various sorting classes. The resulting objects can be interchanged whenever the sorting algorithm needs to be varied.

Consider whether to optimize objects for update or access. For example, a statistics-calculating object might update its average and standard deviation each time a value is added to it, thus slowing down updates but making access of those statistics lightning-fast. Or, the object could simply store added values and calculate the average and standard deviation each time those statistics are accessed, making the update as fast as possible, but increasing the time for statistics access.
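The update-versus-access trade-off can be sketched with two implementations of one interface. The names Stats, EagerStats, and LazyStats are hypothetical, and the sketch assumes the population standard deviation is wanted. EagerStats does its arithmetic on every add, so reads are simple field accesses; LazyStats only stores values on add and does all the work when a statistic is requested.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: all names here are hypothetical.
class StatsSketch {

    interface Stats {
        void add(double value);
        double mean();
        double standardDeviation(); // population standard deviation
    }

    /** Optimized for access: statistics are recomputed on every update. */
    static final class EagerStats implements Stats {
        private long count;
        private double sum;
        private double sumOfSquares;
        private double mean;
        private double standardDeviation;

        public void add(double value) {
            count++;
            sum += value;
            sumOfSquares += value * value;
            mean = sum / count;
            // Running-sums formula; it can lose precision for large values,
            // so tiny negative results are clamped to zero.
            double variance = sumOfSquares / count - mean * mean;
            standardDeviation = Math.sqrt(Math.max(0.0, variance));
        }

        public double mean() { return mean; }
        public double standardDeviation() { return standardDeviation; }
    }

    /** Optimized for update: values are stored and statistics computed on demand. */
    static final class LazyStats implements Stats {
        private final List<Double> values = new ArrayList<>();

        public void add(double value) { values.add(value); }

        public double mean() {
            if (values.isEmpty()) return 0.0;
            double sum = 0.0;
            for (double v : values) sum += v;
            return sum / values.size();
        }

        public double standardDeviation() {
            if (values.isEmpty()) return 0.0;
            double m = mean();
            double squaredDiffs = 0.0;
            for (double v : values) squaredDiffs += (v - m) * (v - m);
            return Math.sqrt(squaredDiffs / values.size());
        }
    }

    public static void main(String[] args) {
        double[] samples = {2, 4, 4, 4, 5, 5, 7, 9};
        Stats eager = new EagerStats();
        Stats lazy = new LazyStats();
        for (double s : samples) {
            eager.add(s);
            lazy.add(s);
        }
        System.out.println(eager.mean() + " " + eager.standardDeviation()); // 5.0 2.0
        System.out.println(lazy.mean() + " " + lazy.standardDeviation());   // 5.0 2.0
    }
}
```

Because both classes implement the same interface, the choice between them can be deferred until profiling shows whether updates or reads dominate.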

13.4.5 Techniques for Predicting Performance