• The XML 1.0 specification defines two attributes (xml:lang and xml:space) that work like default namespaces. In other words, if the auth:author element in our sample document contains the attribute xml:lang="en_us", that attribute applies to all elements contained inside auth:author. Even though that attribute might apply to the last-name element, last-name won't have an attribute node named xml:lang. Similarly, the xml:space attribute defines whether whitespace in an element should be preserved; valid values for this attribute are preserve and default. Whether these attributes are in effect for a given element or not, the only attribute nodes an element node contains are those tagged in the document and those defined with a default value in the DTD.

For more information on language codes and whitespace handling, see the discussions of the XPath lang() function and the XSLT xsl:preserve-space and xsl:strip-space elements.
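
To make the point about inherited attributes concrete, here is a minimal sketch in Python. It uses the standard-library ElementTree parser rather than an XSLT processor, and the markup is an assumed fragment of the sample document, so treat it as an illustration only. ElementTree's attribute dictionary is not the XPath data model, but it behaves the same way here: the attribute appears only on the element where it was written.

```python
import xml.etree.ElementTree as ET

# Assumed fragment of the sample document; the element content is illustrative.
doc = """
<auth:author xmlns:auth="http://www.authors.net" xml:lang="en_us">
  <last-name>Shakespeare</last-name>
</auth:author>
"""

# The xml: prefix is always bound to this namespace URI.
XML_NS = "http://www.w3.org/XML/1998/namespace"

author = ET.fromstring(doc)
last_name = author.find("last-name")

# The attribute exists on the element where it was written...
author_lang = author.get(f"{{{XML_NS}}}lang")
# ...but the child element carries no attributes of its own.
last_name_attrs = dict(last_name.attrib)

print(author_lang)
print(last_name_attrs)
```

Any language-sensitive processing of last-name has to look up the tree for the nearest xml:lang; the attribute is not copied down.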
3.1.4 Text Nodes
Text nodes are refreshingly simple; they contain text from an element. If the original text in the XML document contained entity or character references, they are resolved before the
XPath text node is created. The text node is text, pure and simple. A text node is required to contain as much text as possible; the next or previous node can't be a text node.
You might have noticed that there are no CDATA nodes in this list. If your XML document contains text in a CDATA section, you can access the contents of the CDATA section as a
text node. You have no way of knowing if a given text node was originally a CDATA section. Similarly, all entity references are resolved before anything in your stylesheet is
evaluated, so you have no way of knowing if a given piece of text originally contained entity references.
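
The merging described above can be seen with any XML parser. The following sketch uses Python's standard-library ElementTree as a stand-in for an XSLT processor, with a made-up line element that mixes an entity reference, a CDATA section, and a character reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical element mixing an entity reference, a CDATA section,
# and a character reference.
doc = "<line>AT&amp;T <![CDATA[<b>bold</b>]]> &#169; 2001</line>"

line = ET.fromstring(doc)

# All three forms are resolved and merged into a single run of text;
# nothing records which characters came from the CDATA section.
text = line.text
print(text)
```

The markup characters inside the CDATA section come through as ordinary text, and the resulting string gives no hint of how any character was originally written.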
3.1.5 Comment Nodes
A comment node is also very simple: it contains some text. Every comment in the source document (except for comments in the DTD) becomes a comment node. The string value of a comment node (the node itself is selected by the comment() node test, not text()) contains everything inside the comment, except the opening <!-- and the closing -->.
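
As a small illustration, the sketch below parses a made-up document with Python's standard-library minidom and reads the comment's content. The DOM is a different model from XPath's, but the comment text it reports is the same: everything between the delimiters, including the surrounding spaces.

```python
from xml.dom import minidom

# Hypothetical document containing a single comment.
doc = minidom.parseString("<sonnet><!-- Is this a sonnet? --></sonnet>")

comment = doc.documentElement.firstChild
is_comment = comment.nodeType == comment.COMMENT_NODE

# The delimiters are dropped; the whitespace inside them is kept.
comment_text = comment.data
print(repr(comment_text))
```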
3.1.6 Processing Instruction Nodes
A processing instruction node has two parts: a name (returned by the name() function) and a string value. The string value is everything after the name and the whitespace that follows it, up to but not including the ?> that closes the processing instruction.
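
The two parts are easy to see in a parser. This sketch uses Python's standard-library minidom and an assumed xml-stylesheet processing instruction (the sonnet.xsl filename is hypothetical). In the DOM, the name is called the target and the string value is called the data; the whitespace separating the two is part of neither, while whitespace inside the data is kept.

```python
from xml.dom import minidom

# Hypothetical processing instruction ahead of the document element.
doc = minidom.parseString(
    '<?xml-stylesheet href="sonnet.xsl" type="text/xsl"?><sonnet/>'
)

# The processing instruction is the first child of the document node.
pi = doc.firstChild
is_pi = pi.nodeType == pi.PROCESSING_INSTRUCTION_NODE

pi_name = pi.target   # corresponds to the XPath name
pi_value = pi.data    # corresponds to the XPath string value
print(pi_name)
print(pi_value)
```

Note that the string value is a single string. The href and type "attributes" inside it are only a convention; extracting them is up to the stylesheet or application.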
3.1.7 Namespace Nodes
Namespace nodes are almost never used in XSLT stylesheets; they exist primarily for the XSLT processor's benefit. Remember that the declaration of a namespace (such as xmlns:auth="http://www.authors.net"), even though it is technically an attribute in the XML source, becomes a namespace node, not an attribute node.
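
Namespace-aware parsers generally make the same distinction. As a rough analogy (not the XPath data model itself), the sketch below uses Python's standard-library ElementTree on an assumed auth:author element with a hypothetical id attribute: the declaration is consumed to qualify the element name and does not show up among the attributes.

```python
import xml.etree.ElementTree as ET

# Assumed element; the id attribute is added purely for contrast.
doc = '<auth:author xmlns:auth="http://www.authors.net" id="a1"/>'

author = ET.fromstring(doc)

# Only the ordinary attribute is reported as an attribute.
attrs = dict(author.attrib)
# The namespace declaration is folded into the element's qualified name.
tag = author.tag

print(attrs)
print(tag)
```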
3.2 Location Paths
One of the most common uses of XPath is to create location paths. A location path describes the location of something in an XML document. In our examples in the previous chapter, we used location paths in the match and select attributes of various XSLT elements. Those location paths described the parts of the XML document we wanted to work with. Most of the XPath expressions you'll use are location paths, and most of them are pretty simple. Before we dive into the wonders of XPath, we need to discuss the context.
3.2.1 The Context
One of the most important concepts in XPath is the context. Everything we do in XPath is interpreted with respect to the context. You can think of an XML document as a hierarchy of directories in a filesystem. In our sonnet example, we could imagine that sonnet is a directory at the root level of the filesystem. The sonnet directory would, in turn, contain directories named auth:author, title, and lines. In this example, the context would be the current directory. If I go to a command line and execute a particular command (such as dir *.js), the results I get vary depending on the current directory. Similarly, the results of evaluating an XPath expression will probably vary based on the context.

Most of the time, we can think of the context as the node in the tree from which any expression is evaluated. To be completely accurate, the context consists of five things:
• The context node (the "current directory"). The XPath expression is evaluated from this node.
• Two integers, the context position and the context size. These integers are important when we're processing a group of nodes. For example, we could write an XPath expression that selects all of the li elements in a given document. The context size refers to the number of li items selected by that expression, and the context position refers to the position of the li we're currently processing.
• A set of variables. This set includes names and values of all variables that are currently in scope.
• A set of all the functions available to XPath expressions. Some of these functions are defined by the XPath and XSLT standards themselves; others might be extension functions defined by whoever created the stylesheet. You'll read more about extension functions in Chapter 8.
• A set of all the namespace declarations currently in scope.

Having said all that, most of the time you can ignore everything but the context node. To use our command line analogy one more time, if you're at a command line, you have a current directory; you also have (depending on your operating system) a number of environment variables defined. For most commands, you can focus on the current directory and ignore the environment variables.
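
The context node, context position, and context size can be sketched with a few lines of Python. This uses the standard-library ElementTree, which implements only a small subset of XPath, and a made-up document with two lists; it is meant as an analogy to what an XSLT processor does, not a description of one.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two lists of different lengths.
doc = ET.fromstring("""
<doc>
  <ul id="first"><li>a</li><li>b</li><li>c</li></ul>
  <ul id="second"><li>x</li></ul>
</doc>
""")
first, second = doc.findall("ul")

# Same relative path, different context node, different results
# (like running the same command in two different directories).
from_first = [li.text for li in first.findall("li")]
from_second = [li.text for li in second.findall("li")]

# A numeric predicate tests the context position (counted from 1);
# last() refers to the context size.
second_item = first.find("li[2]").text
final_item = first.find("li[last()]").text

print(from_first, from_second, second_item, final_item)
```

The expression li never changes; only the node it is evaluated from does. That is the sense in which the context node plays the role of the current directory.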
3.2.2 Simple Location Paths
Now that we've talked about what a context is and why it matters, we'll look at some location paths. We'll start with a variety of simple location paths; as we go along, we'll look at more complex location paths that use all the various features of XPath. We already looked at one of the simplest XPath expressions:

    <xsl:template match="/">

This template selects the root node of the document. We saw another simple XPath expression in the xsl:value-of element:

    <xsl:value-of select="."/>