Implementing Lookup Tables More Sophisticated Techniques

/*/
Indicates that what follows must be a top-level element of the stylesheet. This syntax starts at the root of the document, then has a single element. The element's name can be anything. For our current stylesheet, we could have written the XPath expression like this:

  select="document('')/xsl:stylesheet/states:name[@abbrev=current()]"

Because the root element of a stylesheet can be either <xsl:stylesheet> or <xsl:transform>, it's better to use the asterisk.

states:name
Indicates a name element combined with a namespace prefix that maps to http://new.usps.com/cgi-bin/uspsbv/scripts/content.jsp?D=10090. If we were referencing elements in another document, the prefix wouldn't have to be states; it could be anything, as long as it mapped to the same string.

[@abbrev=current()]
Means that the abbrev attribute of the <states:name> element being tested has the same value as the current node. We have to use the XSLT current() function here because we want the current node, not the context node. Inside the predicate expression, the current node is the <state> element we are processing, while the context node is the <states:name> element that contains the abbrev attribute we evaluate.

Figure 7-3 shows the output from the stylesheet with a lookup table.

Figure 7-3. Document generated with a lookup table

Notice that the purchase orders have now been sorted by the actual name of the state referenced in the address, not by the state's abbreviation.

Lookup tables are an extremely useful side effect of the way the document() function works. You could place a lookup table in another file, and you could use the document() function for other purposes, but the technique we've covered here is the most common way to implement lookup tables.
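
To make the pieces of that expression concrete, here is a minimal sketch of a stylesheet that carries its own lookup table. It is not the stylesheet from this chapter: the urn:example:states namespace URI, the three table entries, and the assumption that the input contains <address> elements with a <state> child are all illustrative.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:states="urn:example:states"
  exclude-result-prefixes="states">

  <xsl:output method="text"/>

  <!-- Lookup table: top-level elements in a non-XSLT namespace.
       The namespace URI and the entries are assumptions for this sketch. -->
  <states:name abbrev="MA">Massachusetts</states:name>
  <states:name abbrev="ME">Maine</states:name>
  <states:name abbrev="WI">Wisconsin</states:name>

  <xsl:template match="/">
    <!-- Assumed input shape: any number of address elements with a state child -->
    <xsl:for-each select="//address/state">
      <!-- Sort by the full name found in the table, not by the abbreviation.
           current() is the state element being sorted; the context node
           inside the predicate is each states:name element in turn. -->
      <xsl:sort select="document('')/*/states:name[@abbrev=current()]"/>
      <xsl:value-of select="document('')/*/states:name[@abbrev=current()]"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Given input whose <state> elements contain WI, MA, and ME, this sketch should list the full names ordered by name rather than by abbreviation. Because document('') re-reads the stylesheet itself, the stylesheet generally needs to be loaded from a file or URL that the processor can resolve.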

7.4.3 Grouping Across Multiple Documents

Our final task will be to group our collection of purchase orders. We'll create a new listing that groups all the purchase orders by the state to which they were shipped. We'll start by attempting the grouping technique we used earlier. The most efficient grouping technique we used before combined the XSLT key() function with the XSLT generate-id() function. We create a key for the nodes we want to index (in this case, the <state> elements), then compare each address we find to the first value returned by the key() function. Here's how we define the key:

  <xsl:key name="states"
    match="document(/report/po/@filename)/purchase-order/customer/address"
    use="state"/>

Unfortunately, the match attribute of the <xsl:key> element can't begin with a call to the document() function. Maybe we could try creating a variable that contains all the nodes we want to use, then use that node-set to create the key:

  <xsl:variable name="addresses"
    select="document(/report/po/@filename)/purchase-order/customer/address"/>
  <xsl:key name="states" match="$addresses" use="state"/>

This doesn't work either; you can't use a variable in the match attribute. Our hopes for a quick solution to this problem are fading quickly.

Complicating the problem is the fact that axes won't help, either. Trying to use the preceding:: axis to see if a previous purchase order came from the current state also doesn't work. Consider this example:

  <xsl:if test="not(preceding::address[state=$state])">

When we were working with a single document, the preceding:: axis gave us useful information. Because all of the nodes we're working with are now in separate documents, the various axes defined in XPath won't help. When I ask for any nodes in the preceding:: axis, I only get nodes from the current document. We're going to have to roll up our sleeves and do this the hard way.

Now that we're resigned to grouping nodes with brute force, we'll try to make the process as efficient as possible.
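
For comparison, the single-document version of the key technique described above can be sketched as follows. This is a minimal illustration, not the stylesheet from the earlier section: it assumes one input document containing <address> elements with a <state> child, and the text output format is invented for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Index every address by the value of its state child.
       The match pattern names nodes in the document being processed;
       it cannot call document() or reference a variable. -->
  <xsl:key name="states" match="address" use="state"/>

  <xsl:template match="/">
    <!-- Keep an address only if it is the first node the key returns
         for its state; that yields one address per distinct state. -->
    <xsl:for-each select="//address[generate-id(.) =
                          generate-id(key('states', state)[1])]">
      <xsl:sort select="state"/>
      <xsl:value-of select="state"/>
      <xsl:text>: </xsl:text>
      <!-- All addresses that share this state -->
      <xsl:value-of select="count(key('states', state))"/>
      <xsl:text> address(es)&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The key() function only returns nodes from the document that contains the context node, which is another reason this approach does not carry over directly when the addresses live in separate files.
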
For performance reasons, we want to avoid calling the document() function any more than we have to. This won't be pretty, but here's our approach:

• Use the document() function to retrieve the values of all of the <state> elements. To keep things simple, we'll write these values out to a string, separating them with spaces. We'll also use the <xsl:sort> element to sort the <state> elements; that will save us some time later.
• Take our string of sorted, space-separated state names (to be precise, they're the values of all the <state> elements) and remove the duplicates. Because the values are sorted, we only have to compare adjacent values. We'll use recursion to handle this.
• For each item in our string of sorted, space-separated, unique state names, use the document() function to see which purchase orders match the current state.

This certainly isn't efficient; for each unique state, we'll have to call the document() function once for every filename attribute. In other words, if we had 500 purchase orders from 50 unique states, we would have to open each of those 500 documents 51 times (once to collect the state values, then once for each of the 50 unique states), invoking the document() function 25,500 times. It's not pretty, but it works.

Retrieving the values of all <state> elements is relatively straightforward. We'll use the technique of creating a variable whose value contains output from an <xsl:for-each> element:

  <xsl:variable name="list-of-states">
    <xsl:for-each
      select="document(/report/po/@filename)/purchase-order/customer/address/state">
      <xsl:sort select="document('')/*/states:name[@abbrev=current()]"/>
      <xsl:value-of select="."/><xsl:text> </xsl:text>
    </xsl:for-each>
  </xsl:variable>

This code produces the string ME MA MA WI for our current set of purchase orders. Our next step will remove any duplicate values from the list. We'll do this with recursion, using the following algorithm:

• Call our recursive template with two arguments: the list of states and the name of the last state we found. The first time we invoke this template, the name of the last state will be blank.
• Break the list of states into two parts: the first state in the list, followed by the remaining states in the list.
• If the list of states is empty, exit. If the first state in the list is different from the last state we found, output the first state and invoke the template on the remaining states in the list. If the first state in the list is the same as the last state we found, simply invoke the template on the remaining states in the list.

Again, we use our technique of calling this template inside an <xsl:variable> element to save the list of unique states for later. Here is the <xsl:variable> element, along with the recursive template that removes duplicate state names from the string:

  <xsl:variable name="list-of-unique-states">
    <xsl:call-template name="remove-duplicates">
      <xsl:with-param name="list-of-states" select="$list-of-states"/>
      <xsl:with-param name="last-state" select="''"/>
    </xsl:call-template>
  </xsl:variable>

  <xsl:template name="remove-duplicates">
    <xsl:param name="list-of-states"/>
    <xsl:param name="last-state" select="''"/>
    <xsl:variable name="next-state">
      <xsl:value-of select="substring-before($list-of-states, ' ')"/>
    </xsl:variable>
    <xsl:variable name="remaining-states">
      <xsl:value-of select="substring-after($list-of-states, ' ')"/>
    </xsl:variable>
    <xsl:choose>
      <xsl:when test="not(string-length(normalize-space($list-of-states)))">
        <!-- If the list of states is empty, do nothing -->
      </xsl:when>
      <xsl:when test="not($last-state=$next-state)">
        <xsl:value-of select="$next-state"/>
        <xsl:text> </xsl:text>
        <xsl:call-template name="remove-duplicates">