Leasing Distributed Garbage Collection

and rely on the system to free _balance when _balance is no longer reachable.

16.2.2 Defining Network Garbage

Garbage collection works well inside a single JVM. However, a problem with stubs arises in a distributed system. If a client has a stub that references a server, then that server should be reachable. Since all the stub really has is an instance of ObjID, this means that the RMI runtime is really keeping references to all the active servers. In order to do garbage collection, the RMI runtime on a client machine must somehow let the RMI runtime on a server machine know when a stub is no longer being used.

The obvious way to do this is to use distributed reference counting. That is, force the stub on a client machine to send two additional messages to the server. One message is sent when the stub is instantiated, to let the server know that there is an active client. The other message is sent when the stub is freed, to let the server know that the client is finished using the server. The RMI runtime on the server, meanwhile, simply keeps track of the number of active clients. When a stub sends the first message, the RMI runtime on the server increments the number of active clients. When a stub sends the second message, the RMI runtime on the server decrements the number of active clients. When there are no active clients, it removes the server from its indexing scheme and thus makes it possible for local garbage collection to reclaim the server.

This scheme, while an obvious first step, is very fragile. Each of the following problems makes distributed reference counting difficult:

• The client's garbage collection algorithm isn't guaranteed to reclaim the stub right away. If the client isn't running low on memory, and is busy performing a high-priority task, garbage collection may not happen for a while. During that time period, the client is implicitly forcing the server to hold on to unnecessary resources.

• The client can crash. This is a much more serious problem. If a client crashes, then the second message is never sent. The server's reference count never gets to 0, and the RMI runtime keeps the server object active forever.

• Network problems may arise. Even if the client is well-behaved and sends its message, the network may be down. In that case, the message never arrives, the RMI runtime never decrements the server's reference count to 0, and the server object is kept active forever.
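The counting scheme, and the way a crashed client breaks it, can be sketched in a few lines. This is only an illustration under stated assumptions: NaiveReferenceCounter and its method names are hypothetical and are not part of the RMI runtime.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of naive distributed reference counting.
// These names are assumptions; they are not classes from the RMI runtime.
class NaiveReferenceCounter {
    private final AtomicInteger activeClients = new AtomicInteger();

    // First message: a stub was instantiated on some client.
    void stubCreated() {
        activeClients.incrementAndGet();
    }

    // Second message: a stub was freed on some client.
    void stubFreed() {
        activeClients.decrementAndGet();
    }

    // The server object can be dropped from the index only when no client is counted.
    boolean isCollectable() {
        return activeClients.get() == 0;
    }

    public static void main(String[] args) {
        NaiveReferenceCounter counter = new NaiveReferenceCounter();
        counter.stubCreated();   // well-behaved client
        counter.stubCreated();   // client that later crashes
        counter.stubFreed();     // only the well-behaved client reports back
        System.out.println(counter.isCollectable());   // prints false
    }
}
```

Because the crashed client never sends its second message, the count stays at 1 and the server object is never eligible for collection.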

16.2.3 Leasing

Of these problems, the first is impossible to overcome. The Java language specification clearly states that local garbage collection is undependable. Garbage collection will happen, but you have no way to force it to happen within a certain time frame. Since you have no way of knowing that a stub is unreferenced until garbage collection runs, any distributed reference-counting architecture will simply have to live with the first problem.

The second and third problems, however, can be eliminated by making all distributed references temporary references. This idea is known as leasing. The basic algorithm is this:

1. A client calls the server and requests a lease for a period of time.

2. The server responds, granting a lease for a period of time (not necessarily the amount of time originally requested).

3. During this period of time, the distributed reference count includes the client.

4. When the lease expires, if the client hasn't requested an extension, the distributed reference count is automatically decremented.

Clients automatically try to renew leases as long as a stub hasn't been garbage collected. Given this, if we look at the problems with distributed reference counting, we'll see that the second and third problems have been neatly solved. If the client crashes, then the client application is no longer running and therefore certainly isn't attempting to renew leases. Consequently, the lease expires and the server is eventually garbage collected. Similarly, if network problems prevent the client application from connecting to the server, the client application can't renew the lease. In this case, too, the lease expires and the server is eventually garbage collected.

The default value of a lease is 10 minutes. This is configurable using the java.rmi.dgc.leaseValue property. We'll discuss in more detail how to configure the default value, and why you would do so, later in this chapter.
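The four steps of the leasing algorithm can be sketched as simple bookkeeping on the server side. This is a minimal illustration, not the RMI runtime's implementation: LeaseTable, its methods, and the client identifiers are all hypothetical, and the current time is passed in explicitly so that the example is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of lease bookkeeping.
// These names are assumptions; they are not classes from the RMI runtime.
class LeaseTable {
    private final long maxLeaseMillis;   // longest lease this server will grant
    private final Map<String, Long> expiryByClient = new HashMap<>();

    LeaseTable(long maxLeaseMillis) {
        this.maxLeaseMillis = maxLeaseMillis;
    }

    // Steps 1 and 2: the client asks for a duration and the server grants
    // at most its own maximum. Calling this again renews the lease.
    long requestLease(String clientId, long requestedMillis, long nowMillis) {
        long granted = Math.min(requestedMillis, maxLeaseMillis);
        expiryByClient.put(clientId, nowMillis + granted);
        return granted;
    }

    // Step 4: drop every lease that ran out without being renewed.
    void expireLeases(long nowMillis) {
        expiryByClient.values().removeIf(expiry -> expiry <= nowMillis);
    }

    // Step 3: the distributed reference count is the number of live leases.
    int referenceCount() {
        return expiryByClient.size();
    }

    public static void main(String[] args) {
        LeaseTable table = new LeaseTable(600_000);        // 10-minute maximum
        table.requestLease("healthy", 900_000, 0);         // granted 600_000, not 900_000
        table.requestLease("crashed", 600_000, 0);
        table.requestLease("healthy", 600_000, 300_000);   // renewal at the 5-minute mark
        table.expireLeases(600_000);                       // "crashed" never renewed
        System.out.println(table.referenceCount());        // prints 1
    }
}
```

The client that stops renewing, whether because it crashed or because the network is down, simply falls out of the count when its lease runs out. In a real deployment the lease duration is not hard-coded as it is here. It is supplied in milliseconds through the system property, for example java -Djava.rmi.dgc.leaseValue=300000 MyServer for a five-minute lease, where MyServer is a placeholder class name.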

16.2.4 The Actual Distributed Garbage Collector