
Chapter 17. Factories and the Activation Framework

In Chapter 14 and Chapter 15, we discussed how to build a better naming service, one that has a great deal more flexibility than the RMI registry and enables easier lookup of specific servers. However, applications that could potentially have millions of servers still require more infrastructure to help them deal with resource management. In this chapter, we'll discuss the most common way of achieving this, the factory pattern, and how it is supported in RMI. To do this, we'll implement a basic factory directly, and then implement similar functionality using RMI's activation framework.

17.1 Resource Management

Our bank example has so far been a small-scale application. While Account is a fairly flexible interface, and you may think you can support millions of accounts using our new naming service and one of the implementations of Account that we've discussed, the fact of the matter is that more infrastructure is required. To see why, consider the Bank of America advertisement quoted in Chapter 5:

"When traveling, take advantage of more than 13,000 Bank of America ATMs coast to coast. We're in 30 states and the District of Columbia. As a Bank of America Check Card or ATM cardholder, there's no ATM fee when you use an ATM displaying a Bank of America sign..."
(Bank of America advertisement)

That's 13,000 dedicated client machines. Plus, there are the client applications running inside each branch of the bank, the central reporting and analysis applications each division of the bank runs, and all the new Internet services that our hypothetical bank wants to roll out over the next few years. In short, we have the following situation:

• A potentially unbounded number of client applications running over a period of time. Practically speaking, there won't be many more clients running than there are accounts. So a good upper bound on the number of clients is the number of open accounts. For a large bank, this can be over 10 million.
• Most servers will be active occasionally. Most people look at their account balances and information at least once a month. In addition, automatic bill-paying programs and other advanced services will probably require access to account information.
• Most servers will be inactive most of the time. Most people don't look at their account balances and information more than once a day. Since such usage, along with monthly and weekly reporting functionality, is the vast majority of anticipated use, it follows that most accounts will be inactive most of the time.
• Most clients want to access a small number of accounts for a short period of time.
We're assuming our previous model of client interaction is probably correct for most applications.

We also know that each JVM has a limited number of available sockets and, as a practical matter, will run into problems supporting large numbers of clients. How many clients can a JVM support? It depends on the application, of course. The absolute limit, based on the number of sockets a process can open, is around 1,000 for most operating systems. But reports from the trenches occasionally suggest that a more reasonable limit for an RMI server is between 150 and 200 simultaneous client connections from distinct client computers. Once past that, the RMI runtime apparently bogs down.

Suppose we take the current implementation and try to make it scale, using the following seat-of-the-pants assumptions:

• There are 10 million accounts.
• We launch 200 accounts per JVM.
• We run 25 JVMs per machine on our server farm.

Simple arithmetic leads us to conclude that we need 2,000 server machines on our server farm (10 million accounts at 200 per JVM is 50,000 JVMs, and 50,000 JVMs at 25 per machine is 2,000 machines), which is, quite honestly, a ridiculous number. Consuming vast amounts of resources to keep mostly idle servers available on a 24/7 basis is a bad idea. Moreover, 2,000 server computers, running 50,000 JVMs, would completely overwhelm any distributed garbage collection scheme based on a centralized naming service.

The fact of the matter is, we can't take our existing architecture, move servers to different JVMs, and then scale to millions of servers.

Looking at the numbers reveals a surprising assumption: our calculations assume that a server computer is capable of handling 5,000 clients (25 JVMs times 200 clients per JVM). This is quite a bit on the high side. However, this is good for a seat-of-the-pants estimate. Replacing our numbers with more reasonable ones only leads to an increase in the number of computers required.

In Part I, we used multiple instances of Account for our servers.
The main reason for this decision was that, by and large, smaller servers are easier to write, maintain, and verify. However, the previous discussion may cause you to revisit that decision. After all, if we went with the bank option, servers wouldn't be account-specific. Instead, account identifiers would get passed in as arguments with each method call. This partially solves the resource problem we've been discussing. We can assume:

• There are 10 million accounts, but this is irrelevant.
• We launch 1 server per JVM.
• We run 25 JVMs per machine on our server farm.

Hence, the number of server machines we need is simply a function of our expected number of clients. Each computer in the server farm is capable of handling 5,000 simultaneous clients.[1] If we expect to handle 20,000 simultaneous client requests, then we need only 10 or so server machines, building in a margin of error for peak activity.

[1] Again, not really. See the note about unrealistic assumptions.

This is a dilemma. As developers, we'd like to go with the account servers. But, from a deployment point of view, the bank option looks compelling.
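
The two seat-of-the-pants estimates in this section can be checked with a few lines of Java. What follows is only a sketch of the arithmetic: the class and method names (CapacityEstimate, jvmsForAccountServers, and so on) are made up for illustration and are not part of the bank example, and the inputs are the chapter's admittedly optimistic assumptions (200 accounts or clients per JVM, 25 JVMs per machine).

```java
/** Back-of-the-envelope sizing using the chapter's assumed numbers (illustrative names). */
final class CapacityEstimate {

    /** Integer division rounded up, for positive values. */
    static long ceilDiv(long numerator, long denominator) {
        return (numerator + denominator - 1) / denominator;
    }

    /** Account-server design: every account is a server, so JVM count tracks account count. */
    static long jvmsForAccountServers(long accounts, long accountsPerJvm) {
        return ceilDiv(accounts, accountsPerJvm);
    }

    /** Machines needed to host a given number of JVMs. */
    static long machinesForJvms(long jvms, long jvmsPerMachine) {
        return ceilDiv(jvms, jvmsPerMachine);
    }

    /** Bank-option design: sizing depends only on simultaneous clients, not on account count. */
    static long machinesForBankServers(long simultaneousClients,
                                       long clientsPerJvm,
                                       long jvmsPerMachine) {
        long clientsPerMachine = clientsPerJvm * jvmsPerMachine;
        return ceilDiv(simultaneousClients, clientsPerMachine);
    }

    public static void main(String[] args) {
        long accounts = 10_000_000L;   // assumed number of open accounts
        long perJvm = 200L;            // assumed accounts (or clients) per JVM
        long jvmsPerMachine = 25L;     // assumed JVMs per machine

        long jvms = jvmsForAccountServers(accounts, perJvm);
        long accountMachines = machinesForJvms(jvms, jvmsPerMachine);
        System.out.println("Account servers: " + jvms + " JVMs on "
                + accountMachines + " machines");

        // Bare minimum for 20,000 simultaneous clients, before any margin for peak activity.
        long bankMachines = machinesForBankServers(20_000L, perJvm, jvmsPerMachine);
        System.out.println("Bank option: at least " + bankMachines + " machines");
    }
}
```

With these inputs the account-server design needs 50,000 JVMs on 2,000 machines, while the bank option needs a bare minimum of 4 machines for 20,000 simultaneous clients. The "10 or so" figure in the text is that minimum plus headroom for peak activity. Lowering the per-JVM or per-machine assumptions to more realistic values only raises both results.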

17.2 Factories