A Little Bit of Bias

be flowing across the network. This means that (1) there is less bandwidth available to any given application, and (2) using lots of bandwidth impacts the performance of all the distributed programs on the network, not just the one using lots of bandwidth.

To demonstrate this, try the following experiment:

1. Clear your web browser's cache and go to a static web site with a lot of images.
2. Click your browser's Reload button.

The difference in speed between the first and second viewings of the web page is mostly due to the difference between having the images cached on your local hard drive and downloading them across the network. You may still have to download the text of the page, but most of the images should be cached by your web browser. In other words, the difference is mostly due to network latency.

Clearly, if communicating between two programs across a network is expensive, then a well-designed application needs to somehow account for this, minimizing the number of calls made across the network, the amount of data sent across the network, and the time the user has to wait because of network latency.

Minimizing the number of calls, the amount of data sent, and the time the user must wait because of network latency are actually three different, and sometimes conflicting, goals. For example, using compression may reduce the amount of data sent over the network but may result in the user waiting longer because of the time it takes to uncompress the data.
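
The compression trade-off can be made concrete with a rough sketch. The code below is illustrative only: the class name CompressionTradeoff, the sample payload, and the choice of GZIP are assumptions made for this example, not part of the bank application. It compresses a repetitive payload with the standard java.util.zip classes and reports how many bytes are saved and how long compressing and uncompressing take on the local machine.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative sketch: all names and the payload are hypothetical.
final class CompressionTradeoff {

    // Builds a repetitive, made-up payload that compresses well.
    static byte[] samplePayload() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5_000; i++) {
            sb.append("account=").append(i % 10).append(";balance=100.00\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Fewer bytes to send, at the price of CPU time on the sending side.
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // The receiver pays CPU time again before the user sees anything.
    static byte[] decompress(byte[] packed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = samplePayload();

        long start = System.nanoTime();
        byte[] packed = compress(original);
        long compressNanos = System.nanoTime() - start;

        start = System.nanoTime();
        byte[] restored = decompress(packed);
        long decompressNanos = System.nanoTime() - start;

        System.out.printf("original bytes:   %d%n", original.length);
        System.out.printf("compressed bytes: %d%n", packed.length);
        System.out.printf("compress time:    %d microseconds%n", compressNanos / 1_000);
        System.out.printf("decompress time:  %d microseconds%n", decompressNanos / 1_000);
        System.out.printf("round trip intact: %b%n", Arrays.equals(original, restored));
    }
}
```

Whether the bytes saved are worth the extra processing time depends on the data, the machines, and the network involved, so numbers from a sketch like this are only a starting point for measuring a real application.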

Chapter 6. Deciding on the Remote Server

In Chapter 5, we briefly discussed the architecture of the bank example. In addition, we discussed the fundamental problems that arise when building distributed applications. In this chapter, I build on that discussion by introducing a set of basic evaluation criteria that will help you refine designs and choose between various design options.

6.1 A Little Bit of Bias

Good code invariably has small methods and small objects...no one thing I do to systems provides as much help as breaking it into more pieces
-- Kent Beck, Smalltalk Best Practice Patterns

The experienced distributed systems programmer will notice a certain bias in this chapter [1] towards what I call small-scale, semi-independent servers. The small-scale part of this is easy to explain. By and large, I build servers with very limited functionality: as little as is reasonable, given the restrictions imposed by the fact that we're building a distributed system. Then, I tend to give them large interfaces, exposing the same functionality in multiple ways.

As far as I know, there's no knockdown argument in favor of this style of designing and building programs. Many programmers who have built object-oriented systems tend to agree with Kent Beck. [2] In my experience, his quote almost holds for distributed systems as well: building small servers leads to flexible designs that evolve gracefully over time. However, there is a slight difference, due to network latency, for distributed systems. In a single-process system, it costs almost nothing to make five method calls to an object. If you need to get five related pieces of information, it's perfectly fine to make five method calls; in fact, it's better for code simplicity and maintenance not to have redundant methods. In distributed systems, you need to consider how often those five method calls are made and the impact of network latency on application performance. We'll return to this discussion in Chapter 7 when we talk about interface design.

Semi-independent is a harder idea to explain. The point is this: if your distributed design requires several instances of a specific server class running in parallel, the instances should be able to run on separate machines, or at least in separate JVMs, without significantly impacting performance. In other words, these instances should be able to run independently. If instances of a server class need to frequently communicate with each other or share state in some significant way, then they're not really separate objects, and the design might be flawed. Of course, complete independence is very hard to achieve. For one thing, if the servers all use the same database server, there will always be the possibility that they could interfere with each other. That's why it's called semi-independent.

[1] To be honest, the bias permeates the rest of the book, too. If I didn't have opinions, I wouldn't be an author.

[2] And his implied style of programming.
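
To make the earlier point about five method calls concrete, here is a minimal sketch. Everything in it is hypothetical: the Account interface, its method names, and the 50-millisecond latency figure are invented for illustration and are not the bank example's actual interfaces. It counts calls rather than crossing a real network, and a real RMI interface would also extend java.rmi.Remote and declare RemoteException on each method, which is omitted here for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: interface, names, and latency figure are assumptions.
final class RoundTripSketch {

    /** Hypothetical account interface with a "large" interface in the
     *  chapter's sense: five fine-grained getters plus one redundant
     *  method that returns the same data in a single call. */
    interface Account {
        String owner();
        String number();
        long balanceInCents();
        String currency();
        String branch();
        Map<String, Object> summary();
    }

    /** Stand-in for a remote stub: each method call counts as one round trip. */
    static final class CountingAccount implements Account {
        private int calls;

        public String owner()        { calls++; return "Alice"; }
        public String number()       { calls++; return "12345"; }
        public long balanceInCents() { calls++; return 10_000L; }
        public String currency()     { calls++; return "USD"; }
        public String branch()       { calls++; return "Main Street"; }

        public Map<String, Object> summary() {
            calls++; // one round trip, however many fields come back
            Map<String, Object> all = new LinkedHashMap<>();
            all.put("owner", "Alice");
            all.put("number", "12345");
            all.put("balanceInCents", 10_000L);
            all.put("currency", "USD");
            all.put("branch", "Main Street");
            return all;
        }

        int calls() { return calls; }
    }

    /** Client style that is fine in a single process: one call per field. */
    static Map<String, Object> fetchFieldByField(Account account) {
        Map<String, Object> all = new LinkedHashMap<>();
        all.put("owner", account.owner());
        all.put("number", account.number());
        all.put("balanceInCents", account.balanceInCents());
        all.put("currency", account.currency());
        all.put("branch", account.branch());
        return all;
    }

    /** Client style that uses the redundant coarse-grained method. */
    static Map<String, Object> fetchInOneCall(Account account) {
        return account.summary();
    }

    /** Crude model: total wait is the number of calls times a fixed latency. */
    static long estimatedWaitMillis(int calls, long latencyPerCallMillis) {
        return calls * latencyPerCallMillis;
    }

    public static void main(String[] args) {
        long assumedLatencyMillis = 50; // made-up figure, not a measurement

        CountingAccount fine = new CountingAccount();
        fetchFieldByField(fine);
        CountingAccount coarse = new CountingAccount();
        fetchInOneCall(coarse);

        System.out.printf("field by field: %d calls, about %d ms of latency%n",
                fine.calls(), estimatedWaitMillis(fine.calls(), assumedLatencyMillis));
        System.out.printf("one call:       %d calls, about %d ms of latency%n",
                coarse.calls(), estimatedWaitMillis(coarse.calls(), assumedLatencyMillis));
    }
}
```

Both client styles retrieve the same five values; the difference is how many times the client waits on the network. This is one reason a small server might still expose the same functionality in more than one way.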

6.2 Important Questions When Thinking About Servers