Solution 4: Use an extension function

          <xsl:text>Glossary Listing: </xsl:text>
          <xsl:value-of select="glentry[1]/term"/>
          <xsl:text> - </xsl:text>
          <xsl:value-of select="glentry[last()]/term"/>
        </title>
      </head>
      <body>
        <h1>
          <xsl:text>Glossary Listing: </xsl:text>
          <xsl:value-of select="glentry[1]/term"/>
          <xsl:text> - </xsl:text>
          <xsl:value-of select="glentry[last()]/term"/>
        </h1>
        <xsl:apply-templates select="glentry"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="glentry">
    <p>
      <b>
        <a name="{term/@id}"/>
        <xsl:value-of select="term"/>
        <xsl:text>: </xsl:text>
      </b>
      <xsl:apply-templates select="defn"/>
    </p>
  </xsl:template>

  <xsl:template match="defn">
    <xsl:apply-templates
      select="*|comment()|processing-instruction()|text()"/>
  </xsl:template>

  <xsl:template match="xref">
    <a href="#{@refid}">
      <xsl:choose>
        <xsl:when test="key('term-ids', @refid)[1]/@xreftext">
          <xsl:value-of select="key('term-ids', @refid)[1]/@xreftext"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="key('term-ids', @refid)[1]"/>
        </xsl:otherwise>
      </xsl:choose>
    </a>
  </xsl:template>

  <xsl:template match="seealso">
    <b>
      <xsl:text>See also: </xsl:text>
    </b>
    <xsl:for-each
      select="java:org.apache.xalan.lib.Extensions.tokenize(@refids)">
      <a href="#{key('term-ids', .)/@id}">
        <xsl:choose>
          <xsl:when test="key('term-ids', .)/@xreftext">
            <xsl:value-of select="key('term-ids', .)/@xreftext"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="key('term-ids', .)"/>
          </xsl:otherwise>
        </xsl:choose>
      </a>
      <xsl:if test="not(position()=last())">
        <xsl:text>, </xsl:text>
      </xsl:if>
    </xsl:for-each>
    <xsl:text>.</xsl:text>
  </xsl:template>

</xsl:stylesheet>

In this case, the tokenize() function defined in the Java class org.apache.xalan.lib.Extensions takes a string as input, then converts the string into a node-set in which each token in the original string becomes a node.

Be aware that using extension functions limits the portability of your stylesheets. The extension function here does what we want, but we couldn't use this extension function with Saxon, XT, or the XSLT tools from Oracle or Microsoft. They may or may not supply similar functions, and if they do, you'll have to modify your stylesheet slightly to use them.
If it's important to you that you be able to switch XSLT processors at some point in the future, using extensions will limit your ability to do that.

Hopefully at this point you're convinced of at least one of the following two things:

• If you have an attribute with a datatype of IDREFS, you should use the id() function to resolve cross-references.
• The IDREFS datatype is pretty limited, so you should avoid using it.
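As a point of comparison with the extension-function approach, here is a minimal sketch of the first option, resolving an IDREFS attribute with the standard id() function. It assumes that the DTD declares the refids attribute of seealso as IDREFS and the id attribute of term as ID; if the processor never reads those declarations, id() returns an empty node-set.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the DTD declares seealso/@refids as IDREFS
       and term/@id as ID. -->
  <xsl:template match="seealso">
    <b>
      <xsl:text>See also: </xsl:text>
    </b>
    <!-- id() splits the whitespace-separated list itself, so no
         tokenizing extension is needed. Each node selected here
         is the term element that carries a matching ID. -->
    <xsl:for-each select="id(@refids)">
      <a href="#{@id}">
        <xsl:value-of select="."/>
      </a>
      <xsl:if test="not(position()=last())">
        <xsl:text>, </xsl:text>
      </xsl:if>
    </xsl:for-each>
    <xsl:text>.</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

One difference from the tokenizing version is worth noting: id() returns a node-set, so xsl:for-each visits the referenced terms in document order, not in the order in which the IDs are listed in the attribute.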

5.2.4 Advantages of the key() Function

Now that we've taken the key() function through its paces, you can see that it has several advantages:

• The key() function is defined in a stylesheet. That means I can define any number of relationships between parts of an XML document at any time. If I need to define a new relationship tomorrow, I don't have to change my XML documents.

• Any number of key functions can be defined for a given element. In our glossary example, we could define key functions for the values of the language, topic, and acronym attributes. We could also create key functions based on the text of various elements or their children. If we used IDs instead of the key() function, we would be limited to a single index based on the value of the single attribute of the ID datatype. To sum up, an element can have more than one key defined against it, and a key doesn't have to be based on an attribute. The key can be based on the element's text, the text of child elements, or other constructs.

• Any number of elements can match a given value. Taking another look at our glossary example, when we use the key() function to find all defn elements that are written in a particular language, the function returns a node-set that can have any number of nodes. If we use an ID instead, legally there can be only one element that matches a given ID value.

• The value we use to look up elements in the key() function isn't constrained to be an XML name. If we use the ID datatype, its value can't contain spaces, among other constraints.
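To make these points concrete, here is a small sketch that defines several keys side by side. The key names (language-index, topic-index, term-text-index) and the language value 'en' are made up for illustration, and the sketch assumes the language and topic attributes appear on defn elements.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Two independent indexes on the same element -->
  <xsl:key name="language-index" match="defn" use="@language"/>
  <xsl:key name="topic-index" match="defn" use="@topic"/>

  <!-- An index based on the text of a child element rather
       than on an attribute -->
  <xsl:key name="term-text-index" match="glentry" use="term"/>

  <xsl:template match="/">
    <!-- A single lookup value can match any number of nodes -->
    <xsl:text>Definitions written in English: </xsl:text>
    <xsl:value-of select="count(key('language-index', 'en'))"/>
    <xsl:text>&#10;</xsl:text>

    <!-- The lookup value contains a space, so it could never
         be a legal ID value -->
    <xsl:text>demilitarized zone: </xsl:text>
    <xsl:value-of
      select="key('term-text-index', 'demilitarized zone')/defn"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

None of these keys requires a change to the XML document or its DTD; adding another index later is a one-line change to the stylesheet.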

5.3 Generating Links in Unstructured Documents

Before we leave the topic of linking, we'll discuss one more useful technique. So far, all of this chapter's examples have been structured nicely. When there was a relationship between two pieces of information, we had an id and refid pair to match them. What happens if the XML document you're transforming isn't written that way? Fortunately, we can use the key() function and a new function, generate-id(), to create structure where there isn't any.
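In outline, the technique might look like the following sketch. It assumes that each glossary entry holds its term in a term element and that references to other terms are marked up as refterm elements whose text exactly matches the text of the term they point to; the key name terms is made up for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Index every term by its own text -->
  <xsl:key name="terms" match="term" use="."/>

  <xsl:template match="glentry">
    <p>
      <b>
        <!-- generate-id() yields an identifier that is unique
             to this term node within a single transformation -->
        <a name="{generate-id(term)}"/>
        <xsl:value-of select="term"/>
        <xsl:text>: </xsl:text>
      </b>
      <xsl:apply-templates select="defn"/>
    </p>
  </xsl:template>

  <xsl:template match="refterm">
    <!-- Look up the term whose text matches this reference,
         then ask for the identifier generated for that node -->
    <a href="#{generate-id(key('terms', .))}">
      <xsl:value-of select="."/>
    </a>
  </xsl:template>

</xsl:stylesheet>
```

Because generate-id() returns the same value every time it is called on a given node during a transformation, the anchor name and the link target agree even though no identifier appears anywhere in the source document. If a refterm matches no term, key() returns an empty node-set, generate-id() returns an empty string, and the link points nowhere.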

5.3.1 An Unstructured XML Document in Need of Links

For our example here, we'll take out all of the id and refid attributes that have served us well so far. This may be a contrived example, but it demonstrates how we can use the key() and generate-id() functions to generate links between parts of our document.

In our new sample document, we've stripped out the references that neatly tied things together before:

    <?xml version="1.0" ?>
    <!DOCTYPE glossary SYSTEM "unstructuredglossary.dtd">
    <glossary>
      <glentry>
        <term>applet</term>
        <defn>An application program, written in the Java
          programming language, that can be retrieved from a web
          server and executed by a web browser. A reference to an
          applet appears in the markup for a web page, in the same
          way that a reference to a graphics file appears; a browser
          retrieves an applet in the same way that it retrieves a
          graphics file. For security reasons, an applet's access
          rights are limited in two ways: the applet cannot access
          the file system of the client upon which it is executing,
          and the applet's communication across the network is
          limited to the server from which it was downloaded.
          Contrast with <refterm>servlet</refterm>.
        </defn>
      </glentry>
      <glentry>
        <term>demilitarized zone</term>
        <defn>
          In network security, a network that is isolated from, and
          serves as a neutral zone between, a trusted network (for
          example, a private intranet) and an untrusted network (for
          example, the Internet). One or more secure gateways
          usually control access to the DMZ from the trusted or the
          untrusted network.
        </defn>
      </glentry>
      <glentry>
        <term>DMZ</term>
        <defn>
          See <refterm>demilitarized zone</refterm>.
        </defn>
      </glentry>
      <glentry>
        <term>pattern-matching character</term>
        <defn>
          A special character such as an asterisk (*) or a question
          mark (?) that can be used to represent zero or more
          characters. Any character or set of characters can replace
          a pattern-matching character.
        </defn>
      </glentry>
      <glentry>
        <term>servlet</term>
        <defn>
          An application program, written in the Java programming
          language, that is executed on a web server. A reference to
          a servlet appears in the markup for a web page, in the
          same way that a reference to a graphics file appears. The
          web server executes the servlet and sends the results of
          the execution (if there are any) to the web browser.
          Contrast with <refterm>applet</refterm>.