
Moreover, secondary applications also frequently rely on the same naming structure. Just because one application creates the servers and is their main user doesn't mean it's the only application that uses them. For example, a mortgage application analysis at our bank may fetch an account object in order to look at the account's transaction history. The mortgage application isn't the primary user of the servers, and the code may have been implemented by another group entirely. However, it still uses the hierarchical structure.

You need to decide whether you're designing for humans or for machines. There is often a convenient logical structure for servers, and then there is a hierarchical structure that humans can navigate. Humans typically prefer descriptive names and longer paths. On the other hand, flatter hierarchies generally require less code and are easier for programmers and systems administrators to maintain. The ultimate example of this is RMI's flat namespace. It quickly becomes unreadable, but the code that interacts with the RMI registry is very simple.

The longer the path, the more opportunities you have for federating the naming service. We'll talk more about federation later. For now, it's enough to say that federation is a way of moving subcontexts to another machine to enable naming services to scale.

15.1.2 Query by Attribute

Generally, a naming service is not quite a static data structure, but it is a sticky one. Once servers are bound under a particular logical name, they tend to stay bound under that name for a while. Binding and unbinding are relatively rare operations. Query operations, on the other hand, occur far more frequently. Every time a client needs to find a server, it needs to issue a query to the naming service. This implies three basic design requirements:

• The naming service has to respond quickly to queries.
• Queries from distinct clients should not block each other. It's very important to minimize the use of synchronization in query methods.
• The query functionality should be expressive enough to pick out a single server from the ones bound into the naming service. If the naming service often returns a list of servers that the client then narrows down, presumably by querying each server, then using the naming service will be incredibly inefficient.
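
The first two requirements suggest a data structure whose reads never wait on a lock. What follows is only a minimal sketch of that idea, not the design developed in this chapter: the class name StickyBindings and its methods are hypothetical, and it relies on the documented behavior of java.util.concurrent.ConcurrentHashMap, whose retrieval operations generally do not block.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory store for name-to-server bindings (illustration only).
final class StickyBindings<S> {
    // ConcurrentHashMap retrievals generally do not block, so queries from
    // distinct clients do not wait on one another.
    private final Map<String, S> bindings = new ConcurrentHashMap<>();

    // Rare operation: bind a server under a name that is not yet in use.
    void bind(String name, S server) {
        if (bindings.putIfAbsent(name, server) != null) {
            throw new IllegalStateException("Already bound: " + name);
        }
    }

    // Rare operation: remove a binding; returns true if one existed.
    boolean unbind(String name) {
        return bindings.remove(name) != null;
    }

    // Frequent operation: no synchronized block on the query path.
    S lookup(String name) {
        return bindings.get(name); // null when nothing is bound
    }
}

final class StickyBindingsDemo {
    public static void main(String[] args) {
        StickyBindings<String> registry = new StickyBindings<>();
        registry.bind("printers/postscript", "stub-for-printer-1");
        System.out.println(registry.lookup("printers/postscript"));
        System.out.println(registry.unbind("printers/postscript"));
        System.out.println(registry.lookup("printers/postscript"));
    }
}
```

This sketch addresses speed and non-blocking reads only. It says nothing about the third requirement, expressive queries, which is the subject of the next section.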

15.1.2.1 Attributes are string-valued, name-value pairs

One fairly traditional way to implement a query capability is to allow server entries to be annotated with a set of attributes that describe the server. These attributes are metadata that help describe the server and enable clients to choose a server quickly and easily.

Consider our printer application again. We have a document we need to print. It's a color PostScript file for an 18 x 24 poster. To find the correct printer in the RMI registry, we must:

1. Get all the servers by using list.
2. Find out which servers are printers. This involves retrieving the stubs using a lookup method for each name and then using the instanceof keyword to discard the stubs that aren't associated with printers.
3. Query each printer we find to discover what sorts of jobs it can handle. Note that even if we subclassed the Printer interface by defining, for example, ColorPrinter or PostscriptPrinter, we'd probably still wind up asking each printer about the paper sizes it can handle, and whether it is in a nearby location.

Surprisingly enough, hierarchies help, but they don't solve this problem entirely. For example, we could define the following printer hierarchy:

    printers/postscript
    printers/pdf
    printers/pcl

Assuming we know the hierarchy (e.g., the first classification is based on the printer formatting language and not on whether the printer can handle color), we can easily find a potential match. But hierarchy has its limits. Consider what happens when we add location, paper size, and resolution to the preceding tree. We may wind up with hierarchical paths such as:

    building47/printers/postscript/A12Paper/1200DPI/Color

This type of hierarchy is awful, for a number of reasons. Here are three of the most important:

• Many servers are entered multiple times. A server that can print either black-and-white or color documents and can handle either ordinary or legal paper is entered four times in four different contexts in the hierarchy.
When the printer gets moved, it needs to be removed from those four contexts and put in four different contexts. This is unmanageable.
• The client needs to hardwire in a great many assumptions about the hierarchical structure being used. In this example, the top-level context is named printers. The second-level context is used to describe a file format. The third-level context is used to describe whether color documents can be printed, and so on. Hardwiring in all these assumptions about the structure of the naming service leads to brittle code.
• All information that can ever be used in a query must be known and expressed in the hierarchical structure. Moreover, the queries must have a value for every possibility. There is no way, in the preceding hierarchy, to express, "I need a color printer in building A." Instead, making this query involves multiple calls to the naming service.

Reasons such as these lead to the idea of using attributes instead of a hierarchical structure. Attributes are simply name-value pairs, in which both name and value are strings. So for example, we may choose to have a two-tier context structure:

    printers/building-name

And then within the building-name context, we may choose to annotate printers with the following three attributes:

• Document-type values: PCL, PostScript, PDF
• Color-resolution values: color, black-and-white
• Dots-resolution values: high, medium, low

The idea is that a client specifies only the attributes it cares about. It isn't required to specify the rest, and the order in which it specifies the attributes doesn't matter. Thus, by passing in a value for the document-type and color-resolution attributes, the client can say, "I want a printer that handles color PDF files, but any resolution value is fine." This is a much more natural way to think about servers in a lot of situations.
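
The partial-match idea described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: AnnotatedEntry, AttributeQueryDemo, and the printer names are hypothetical and are not the classes this chapter goes on to build. The attribute names and values are the ones listed above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical entry type: a bound name plus string-valued attributes.
final class AnnotatedEntry {
    final String name;
    final Map<String, String> attributes;

    AnnotatedEntry(String name, Map<String, String> attributes) {
        this.name = name;
        this.attributes = new HashMap<>(attributes); // defensive copy
    }

    // A query matches when every attribute it names has the same value here.
    // Attributes the query omits are ignored, and order never matters.
    boolean matches(Map<String, String> query) {
        for (Map.Entry<String, String> wanted : query.entrySet()) {
            if (!wanted.getValue().equals(attributes.get(wanted.getKey()))) {
                return false;
            }
        }
        return true;
    }
}

final class AttributeQueryDemo {
    // Builds an attribute map from alternating name, value arguments.
    static Map<String, String> attrs(String... nameValuePairs) {
        Map<String, String> map = new HashMap<>();
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            map.put(nameValuePairs[i], nameValuePairs[i + 1]);
        }
        return map;
    }

    // Returns the names of all entries that satisfy the query.
    static List<String> find(List<AnnotatedEntry> entries, Map<String, String> query) {
        List<String> names = new ArrayList<>();
        for (AnnotatedEntry entry : entries) {
            if (entry.matches(query)) {
                names.add(entry.name);
            }
        }
        return names;
    }

    // Made-up printers, annotated with the three attributes from the text.
    static List<AnnotatedEntry> sampleEntries() {
        List<AnnotatedEntry> entries = new ArrayList<>();
        entries.add(new AnnotatedEntry("laser-1", attrs("Document-type", "PostScript",
                "Color-resolution", "black-and-white", "Dots-resolution", "high")));
        entries.add(new AnnotatedEntry("laser-2", attrs("Document-type", "PDF",
                "Color-resolution", "color", "Dots-resolution", "medium")));
        entries.add(new AnnotatedEntry("inkjet-3", attrs("Document-type", "PDF",
                "Color-resolution", "black-and-white", "Dots-resolution", "low")));
        return entries;
    }

    public static void main(String[] args) {
        // "A printer that handles color PDF files; any resolution is fine."
        Map<String, String> query = attrs("Color-resolution", "color", "Document-type", "PDF");
        System.out.println(find(sampleEntries(), query)); // prints [laser-2]
    }
}
```

Because the query is a map rather than a path, the client never states a value for Dots-resolution and never needs to know where in a hierarchy that property would have lived.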

When to Use an Attribute

Given that I've already said there aren't very many design principles for hierarchies, it's reasonable to wonder if there are any design principles for when to encode structure as an attribute and when to use a hierarchy. The following loose guidelines might be helpful. There are three fairly good design principles for when something ought to be encoded in a hierarchy:

• Mutually exclusive possibilities tend to be hierarchical.
• Attributes that must be specified, or which are specified in all the use cases, tend to be encoded in a hierarchy.
• If the subcontexts can be thought of as a way to cache answers to queries that clients ask often, then having subcontexts and using the stub for them directly from the client side is often a good idea.

Thus, for example, location tends to be encoded in hierarchical structures. It meets all three of these criteria: a printer can be in only one location, most users really care about the location of the printer, and a given user will likely revisit the location subcontext repeatedly. There are other reasons to make locations a subcontext as well. If we encode location in the hierarchy, we can federate on the basis of location, and that's a potentially huge win in the fight against network latency.

On the other hand, the negations of the first two design principles are also fairly good indicators for when a piece of metadata ought to be encoded as an attribute:

• If the metadata can have multiple, meaningful values, it might be better expressed as an attribute.
• If the metadata can be easily ignored, or not specified in a query, it might be better specified as an attribute.

Thus, for example, the resolution of a printer is probably best left as an attribute. Of course, even in the world of printers, there are borderline cases; for example, color. Generally speaking, color printers can print black-and-white documents. But most organizations have fewer color printers than black-and-white ones.
And color printers are far more expensive in terms of cost-per-page than black-and-white printers. This means that, while color/black-and-white meets the criteria for an attribute (both values are possible; if I'm printing a black-and-white document, I may very well ignore it since color printers can handle my request), an organization might decide to encode color within the hierarchy anyway.

Moreover, using attributes greatly simplifies versioning problems. If all the important server properties were reflected in the hierarchical structure, there would be two problems:

• All client applications would need to know about all the server properties. Because the properties were encoded in the path, there would be no way to hide the properties.
• New server properties (for example, a new paper size) would involve a change to the hierarchy and may break existing code.

Because attributes work on partial matches, on the other hand, adding a new attribute is invisible to applications that don't know about the attribute.

There is a trade-off here. A naming service has to be very fast and highly reliable. This means that, while we want to implement some sort of querying functionality, it's going to be fairly limited. After all, a naming service is not a database. We'll add enough functionality and flexibility to handle printers nicely, since they're a pretty typical case, but we won't go much further.
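
The versioning argument above can be made concrete with a small sketch. Everything here is hypothetical: the class VersioningDemo, the example paths, and the Paper-size attribute name are assumptions chosen to illustrate the point, not part of the naming service this chapter builds.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class VersioningDemo {
    // Partial match: every name-value pair in the query must also be in the entry.
    static boolean matches(Map<String, String> entry, Map<String, String> query) {
        return entry.entrySet().containsAll(query.entrySet());
    }

    // An older client hardwired this path before paper size existed.
    static final String OLD_CLIENT_PATH = "printers/postscript/color";

    // Paths bound in the original hierarchy.
    static Set<String> pathsBefore() {
        Set<String> paths = new HashSet<>();
        paths.add("printers/postscript/color");
        return paths;
    }

    // The same printer after a paper-size level is added to the hierarchy.
    static Set<String> pathsAfter() {
        Set<String> paths = new HashSet<>();
        paths.add("printers/postscript/legal/color");
        return paths;
    }

    // The older client's attribute query, written before paper size existed.
    static Map<String, String> oldClientQuery() {
        Map<String, String> query = new HashMap<>();
        query.put("Document-type", "PostScript");
        query.put("Color-resolution", "color");
        return query;
    }

    static Map<String, String> entryBefore() {
        return new HashMap<>(oldClientQuery()); // same two attributes
    }

    // The same entry after a new (assumed) Paper-size attribute is added.
    static Map<String, String> entryAfter() {
        Map<String, String> entry = entryBefore();
        entry.put("Paper-size", "legal");
        return entry;
    }

    public static void main(String[] args) {
        // Hierarchy: the hardwired path stops resolving once a level is added.
        System.out.println(pathsBefore().contains(OLD_CLIENT_PATH)); // true
        System.out.println(pathsAfter().contains(OLD_CLIENT_PATH));  // false

        // Attributes: the old query still matches the extended entry.
        System.out.println(matches(entryBefore(), oldClientQuery())); // true
        System.out.println(matches(entryAfter(), oldClientQuery()));  // true
    }
}
```

The old client keeps working in the attribute case because it never mentions Paper-size, which is exactly what a partial match permits.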

15.2 Requirements for Our Naming Service

The first requirement is to be backwards-compatible with already existing naming services. The second is that future versions of the naming service can be backwards-compatible with our naming service.

Unfortunately, there isn't really a reliable way to be backwards-compatible with the RMI registry. First, any calls to static methods on classes in the Javasoft packages are out of our control. Therefore, client code that makes calls to Naming will either communicate with an instance of the RMI registry or not work at all. Second, we will directly support concepts that just aren't present in RMI (namely, hierarchical structures and attributes). This means that the best we can do is use the same method names (bind, unbind, etc.) to make it easier for a programmer to translate the client code.

As for the second compatibility requirement, there are a few things we can do to make it more likely. The easiest is to simply use objects for arguments. When you use objects, you make it easier to update and alter an interface. Even though a path is, for the most part, a sequence of