Moreover, secondary applications also frequently rely on the same naming structure. Just because one application creates the servers and
is the main user of the servers doesn't mean it's the only application that uses them.
For example, a mortgage application analysis at our bank may fetch an account object in order to look at the account's transaction history. The
mortgage application isn't the primary user of the servers, and the code may have been implemented by another group entirely. However, it still
uses the hierarchical structure.
You need to decide whether you're designing for humans or for machines.
There is often a convenient logical structure for servers, and then there is a hierarchical structure that humans can navigate. Humans typically
prefer descriptive names and longer paths. On the other hand, flatter hierarchies generally require less code and are easier for programmers
and systems administrators to maintain. The ultimate example of this is RMI's flat namespace. It quickly becomes unreadable, but the code that
interacts with the RMI registry is very simple.
The longer the path, the more opportunities you have for federating the naming service.
We'll talk more about federation later. For now, it's enough to say that federation is a way of moving subcontexts to another machine to enable
naming services to scale.
15.1.2 Query by Attribute
Generally, a naming service is not quite a static data structure, but it is a sticky one. Once servers are bound under a particular logical name, they tend to be bound under that name for a while.
Binding and unbinding are relatively rare operations.
Query operations, on the other hand, occur far more frequently than binding or unbinding. Every time a client needs to find a server, it needs to issue a query to the naming service.
This implies three basic design requirements:
• The naming service has to respond quickly to queries.
• Queries from distinct clients should not block each other. It's very important to minimize the use of synchronization in query methods.
• The query functionality should be expressive enough to pick out a single server from the ones bound into the naming service. If the naming service often returns a list of servers that the client then narrows down, presumably by querying the server, then using the naming service will be incredibly inefficient.
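One way to picture the first two requirements is a table whose lookups never take a lock. The following is only a minimal sketch, not the implementation developed in this chapter; the class name QueryTable and its methods are hypothetical, and it ignores hierarchy and attributes entirely.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a flat name-to-server table.
// Binding is rare and queries are frequent, so the query path avoids locking.
final class QueryTable<S> {
    private final Map<String, S> bindings = new ConcurrentHashMap<>();

    // Rare operation: refuses to overwrite an existing binding.
    void bind(String name, S server) {
        if (bindings.putIfAbsent(name, server) != null) {
            throw new IllegalStateException("Already bound: " + name);
        }
    }

    // Rare operation: removing a name that is not bound is a no-op.
    void unbind(String name) {
        bindings.remove(name);
    }

    // Frequent operation: ConcurrentHashMap retrievals do not entail locking,
    // so queries from distinct clients do not block each other.
    S lookup(String name) {
        return bindings.get(name);
    }
}
```

The design choice here is to let the rare operations (bind and unbind) carry whatever coordination cost exists, so that the common operation stays cheap.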
15.1.2.1 Attributes are string-valued, name-value pairs
One fairly traditional way to implement a query capability is to allow server entries to be annotated with a set of attributes that describe the server. These attributes are metadata that help
describe the server and enable the clients to choose a server quickly and easily.
Consider our printer application again. We have a document we need to print. It's a color PostScript file for an 18 x 24 poster. To find the correct printer in the RMI registry, we must:
1. Get the names of all the servers by using list.
2. Find out which servers are printers. This involves retrieving the stubs using a lookup method for each name and then using the instanceof keyword to discard the stubs that aren't associated with printers.
3. Query each printer we find to discover what sorts of jobs it can handle. Note that even if we subclassed the Printer interface by defining, for example, ColorPrinter or PostscriptPrinter, we'd probably still wind up asking it about the paper sizes the printer can handle, and whether the printer is in a nearby location.
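The three steps above can be sketched against the standard java.rmi.registry.Registry interface. This is an illustration only: the Printer interface shown here is a stand-in, and its two capability methods (handlesColor and handlesFormat) are assumptions made for the example, not methods defined elsewhere in this book.

```java
import java.rmi.NotBoundException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.Registry;
import java.util.ArrayList;
import java.util.List;

// Stand-in remote interface; both methods are hypothetical.
interface Printer extends Remote {
    boolean handlesColor() throws RemoteException;
    boolean handlesFormat(String format) throws RemoteException;
}

final class RegistryPrinterSearch {
    // Walks the whole flat namespace to find color printers for one format.
    static List<Printer> findColorPrinters(Registry registry, String format)
            throws RemoteException {
        List<Printer> matches = new ArrayList<>();
        for (String name : registry.list()) {          // Step 1: every bound name
            Remote stub;
            try {
                stub = registry.lookup(name);          // Step 2: one lookup per name
            } catch (NotBoundException e) {
                continue;                              // unbound since list() ran
            }
            if (!(stub instanceof Printer)) {
                continue;                              // discard non-printers
            }
            Printer printer = (Printer) stub;
            // Step 3: each capability question is a further remote call.
            if (printer.handlesColor() && printer.handlesFormat(format)) {
                matches.add(printer);
            }
        }
        return matches;
    }
}
```

Note how the number of remote calls grows with the total number of bound servers, not with the number of suitable printers.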
Surprisingly enough, hierarchies help, but they don't solve this problem entirely. For example, we could define the following printer hierarchy:
printers/postscript
printers/pdf
printers/pcl
Assuming we know the hierarchy (e.g., the first classification is based on the printer formatting
language and not on whether the printer can handle color), we can easily find a potential match. But hierarchy has its limits. Consider what happens when we add location, paper size, and
resolution to the preceding tree. We may wind up with hierarchical paths such as:
building47/printers/postscript/A12Paper/1200DPI/Color
This type of hierarchy is awful, for a number of reasons. Here are three of the most important:
• Many servers are entered multiple times. A server that can print either black-and-white or color documents and can handle either ordinary or legal paper is entered four times, in four different contexts in the hierarchy. When the printer gets moved, it needs to be removed from those four contexts and put in four different contexts. This is unmanageable.
• The client needs to hardwire a great many assumptions about the hierarchical structure being used. In this example, the top-level context is named printers. The second-level context is used to describe a file format. The third-level context is used to describe whether color documents can be printed, and so on. Hardwiring all these assumptions about the structure of the naming service leads to brittle code.
• All information that can ever be used in a query must be known and expressed in the hierarchical structure. Moreover, the queries must have a value for every possibility. There is no way, in the preceding hierarchy, to express, "I need a color printer in building A." Instead, making this query involves multiple calls to the naming service.
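To make the first of these problems concrete, the following sketch enumerates every context a single server has to be bound under when each capability becomes a level in the path. The class name HierarchyPaths, the use of "/" as a separator, and the level names in the usage note are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class HierarchyPaths {
    // Returns every context path a server must be bound under when each
    // capability it supports is encoded as one level of the hierarchy.
    static List<String> contextsFor(String root, List<List<String>> levels) {
        List<String> paths = new ArrayList<>();
        paths.add(root);
        for (List<String> options : levels) {
            List<String> next = new ArrayList<>();
            for (String prefix : paths) {
                for (String option : options) {
                    // "/" is an assumed path separator.
                    next.add(prefix + "/" + option);
                }
            }
            paths = next;   // each level multiplies the number of contexts
        }
        return paths;
    }
}
```

Calling contextsFor with the levels (color, black-and-white) and (ordinary, legal) yields four paths, which matches the four entries described above. Each further two-valued property doubles the count.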
Reasons such as these lead to the idea of using attributes instead of a hierarchical structure. Attributes are simply name-value pairs, in which both name and value are strings. So, for
example, we may choose to have a two-tier context structure:
printers/building-name
And then within the building-name context, we may choose to annotate printers with the following three attributes:
• Document-type values: PCL, PostScript, PDF
• Color-resolution values: color, black-and-white
• Dots-resolution values: high, medium, low
The idea is that a client must specify the attributes it cares about and isn't required to specify the rest; nor does the order in which the client specifies the attributes matter. Thus, by passing in a
value for the document-type and color-resolution attributes, the client can say, "I want a printer that handles color PDF files, but any resolution value is fine." This is a much more natural way to
think about servers in a lot of situations.
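A partial match of this kind can be sketched in a few lines. This is an illustration under simple assumptions, not the matching code of any particular naming service: attributes are held in a Map of strings, and the class name AttributeMatcher is hypothetical.

```java
import java.util.Map;

final class AttributeMatcher {
    // True when every attribute named in the query has the same value on the
    // server. Attributes the query omits are ignored, and because both sides
    // are maps, the order in which attributes were specified is irrelevant.
    static boolean matches(Map<String, String> serverAttributes,
                           Map<String, String> query) {
        for (Map.Entry<String, String> wanted : query.entrySet()) {
            String actual = serverAttributes.get(wanted.getKey());
            if (!wanted.getValue().equals(actual)) {
                return false;   // attribute missing or has a different value
            }
        }
        return true;
    }
}
```

A query containing only document-type=PDF and color-resolution=color matches a printer annotated with any dots-resolution value, which is exactly the "any resolution value is fine" request.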
When to Use an Attribute
Given that I've already said there aren't very many design principles for hierarchies, it's reasonable to wonder if there are any design principles
for when to encode structure as an attribute and when to use a hierarchy. The following loose guidelines might be helpful.
There are three fairly good design principles for when something ought to be encoded in a hierarchy:
• Mutually exclusive possibilities tend to be hierarchical.
• Attributes that must be specified, or which are specified in all the use cases, tend to be encoded in a hierarchy.
• If the subcontexts can be thought of as a way to cache answers to queries that clients ask often, then having subcontexts and using
the stub for them directly from the client side is often a good idea.
Thus, for example, location tends to be encoded in hierarchical structures. It meets all three of these criteria: a printer can be in only one location,
most users really care about the location of the printer, and a given user will likely revisit the location subcontext repeatedly. There are other
reasons to make locations a subcontext as well: if we encode location in the hierarchy, we can federate on the basis of location. And that's a
potentially huge win in the fight against network latency.
On the other hand, the negations of the first two design principles are also fairly good indicators for when a piece of metadata ought to be
encoded as an attribute:
• If the metadata can have multiple meaningful values, it might be better expressed as an attribute.
• If the metadata can be easily ignored, or not specified in a query, it might be better specified as an attribute.
Thus, for example, the resolution of a printer is probably best left as an
attribute. Of course, even in the world of printers, there are borderline cases; for
example, color. Generally speaking, color printers can print black-and-white documents. But most organizations have fewer color printers than
black-and-white ones. And color printers are far more expensive in terms of cost-per-page than black-and-white printers.
This means that, while color/black-and-white meets the criteria for an attribute (both values are possible; if I'm printing a black-and-white
document, I may very well ignore it since color printers can handle my request), an organization might decide to encode color within the
hierarchy anyway.
Moreover, using attributes greatly simplifies versioning problems. If all the important server properties were reflected in the hierarchical structure, there would be two problems:
• All client applications would need to know about all the server properties. Because the properties were encoded in the path, there would be no way to hide the properties.
• New server properties (for example, a new paper size) would involve a change to the hierarchy and may break existing code.
Because attributes work on partial matches, on the other hand, adding a new attribute is invisible to applications that don't know about the attribute.
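The versioning point can be seen in a small sketch that filters a table of annotated entries. Everything named here is hypothetical: the class AttributeQuery, the entry name laser1, and the paper-size attribute are introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class AttributeQuery {
    // Returns the names of all entries whose attributes include every
    // name-value pair in the query. Extra attributes on an entry are ignored.
    static List<String> namesMatching(Map<String, Map<String, String>> entries,
                                      Map<String, String> query) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : entries.entrySet()) {
            // entrySet().containsAll checks each queried key and value pair.
            if (entry.getValue().entrySet().containsAll(query.entrySet())) {
                names.add(entry.getKey());
            }
        }
        return names;
    }
}
```

Suppose laser1 is annotated with document-type=PDF and an older client queries on that attribute alone. If an administrator later adds paper-size=legal to laser1, the older client's query returns the same result as before, while a newer client can start including paper-size in its queries.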
There is a trade-off here. A naming service has to be very fast and highly reliable. This means that, while we want to
implement some sort of querying functionality, it's going to be fairly limited. After all, a naming service is not a database.
We'll add enough functionality and flexibility to handle printers nicely (since they're a pretty typical case), but we won't go
much further.
15.2 Requirements for Our Naming Service
The first requirement is to be backwards-compatible with already existing naming services. The second is that future versions of the naming service can be backwards-compatible with our
naming service.
Unfortunately, there isn't really a reliable way to be backwards-compatible with the RMI registry. First, any calls to static methods on classes in the Javasoft packages are out of our control.
Therefore, client code that makes calls to Naming will either communicate with an instance of the RMI registry or not work at all. Second, we will directly support concepts that just aren't present in
RMI (namely, hierarchical structures and attributes). This means that the best we can do is use the same method names (bind, unbind, etc.) to make it easier for a programmer to
translate the client code.
As for the second compatibility requirement, there are a few things we can do to make it more
likely. The easiest is to simply use objects for arguments. When you use objects, you make it easier to update and alter an interface. Even though a path is, for the most part, a sequence of