3.1.2 Element Nodes
Every element in the original XML document is represented by an XPath element node. In the previous document, an element node exists for the <sonnet> element, the <auth:author> element, the <last-name> element, etc. An element node's children include the text nodes, element nodes, comment nodes, and processing instruction nodes that occur within that element in the original document. An element node's string value (returned by <xsl:value-of select="sonnet"/>, for example) is the concatenation of all the text nodes beneath it (in its children, their children, and so on), in document order (the order in which they appear in the original document). All entity references (such as &lt;) and character references (such as &#052;) in the text are resolved automatically; you can't access the entity or character references from XPath.
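As a minimal sketch of the string value, consider the stylesheet below. The input shown in its opening comment is a hypothetical fragment made up for illustration, not the sample document from the previous section.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input (an assumption for illustration):
     <sonnet><title>Sonnet 130</title><line>a &lt; b</line></sonnet>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- String value of the element node: every text node beneath
         <sonnet>, concatenated in document order. -->
    <xsl:value-of select="sonnet"/>
  </xsl:template>
</xsl:stylesheet>
```

For the assumed input, the output should be Sonnet 130a < b. The two pieces of text run together because the string value adds no separators, and the entity reference has already been resolved to a plain < character by the time XPath sees it.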
The name of an element node (returned by the XPath name() function) is the element's qualified name, including any namespace prefix. In the previous example, the name() of the <sonnet> element is sonnet, and the name() of the <auth:author> element is auth:author. The name() of the <last-name> element is last-name: the auth prefix declared on <auth:author> is in scope for the elements it contains, but an unprefixed element belongs to a namespace only when a default namespace declaration (xmlns="...") is in effect. Other XPath functions, such as local-name() and namespace-uri(), return other information about the name of the element node.
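The three name-related functions can be compared side by side with a sketch like the following. It makes no assumption about the input beyond its being well-formed XML, and the " | " separator is just a formatting choice for this example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- For every element, print: qualified name | local name | namespace URI -->
  <xsl:template match="*">
    <xsl:value-of select="name()"/>
    <xsl:text> | </xsl:text>
    <xsl:value-of select="local-name()"/>
    <xsl:text> | </xsl:text>
    <xsl:value-of select="namespace-uri()"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Recurse into child elements only; text is not copied to the output. -->
    <xsl:apply-templates select="*"/>
  </xsl:template>
</xsl:stylesheet>
```

For an element written as <auth:author xmlns:auth="http://www.authors.net">, this should print auth:author | author | http://www.authors.net. For an element in no namespace, the third field is empty.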
3.1.3 Attribute Nodes
At a minimum, an element node is the parent of one attribute node for each attribute in the XML source document. In our sample document, the element node corresponding to the <sonnet> element is the parent of an attribute node with a name of type and a value of Shakespearean. A few complications for attribute nodes exist, however:
• Although an element node is the parent of its attribute nodes, those attribute nodes are not children of their parent. The children of an element are the text, element, comment, and processing instruction nodes contained in the original element. If you want an element's attributes, you must ask for them specifically. That relationship seems odd at first, but you'll find that treating an element's attributes separately is usually what you want to do.
• If a DTD or schema defines default values for certain attributes, those attributes don't have to appear in the XML document. For example, we could have declared that Shakespearean is the default type of sonnet, so that the tag <sonnet type="Shakespearean"> is functionally equivalent to <sonnet>. Under normal circumstances, XPath creates an attribute node for every attribute with a default value, whether it actually appears in the document or not. If the type attribute is declared with a default value of Shakespearean, both of the <sonnet> elements we just mentioned will have an attribute node with a name of type and a value of Shakespearean. (An attribute declared as #IMPLIED has no default value, so it gets an attribute node only when it is coded in the document.) Of course, if the document codes a value other than the default (<sonnet type="Petrarchan">, for example), the attribute node's value will be whatever was coded in the document.
  To make this situation even worse, an XML parser isn't required to read an external DTD. If it doesn't, then any attribute nodes that represent default values not coded in the document won't exist. Fortunately, XSLT has some branching elements (<xsl:if> and <xsl:choose>) that can help you deal with these ambiguities; we'll discuss those in Chapter 4.
• The XML 1.0 specification defines two attributes (xml:lang and xml:space) that work like default namespaces. In other words, if the <auth:author> element in our sample document contains the attribute xml:lang="en-US", that attribute applies to all elements contained inside <auth:author>. Even though that attribute might apply to the <last-name> element, <last-name> won't have an attribute node named xml:lang. Similarly, the xml:space attribute defines whether whitespace in an element should be preserved; valid values for this attribute are preserve and default. Whether these attributes are in effect for a given element or not, the only attribute nodes an element node contains are those tagged in the document and those defined with a default value in the DTD.
  For more information on language codes and whitespace handling, see the discussions of the XPath lang() function and the XSLT <xsl:preserve-space> and <xsl:strip-space> elements.
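The first two complications in the list above can be sketched as follows. This example assumes the document element is <sonnet> and that Shakespearean is the default the DTD would have supplied; both are assumptions carried over from the discussion, and the fallback text is hardcoded only for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="sonnet">
    <!-- Attributes live on the attribute axis (@), not the child axis,
         so the two counts below never overlap. -->
    <xsl:text>attribute nodes: </xsl:text>
    <xsl:value-of select="count(@*)"/>
    <xsl:text>&#10;child nodes: </xsl:text>
    <xsl:value-of select="count(node())"/>
    <xsl:text>&#10;type: </xsl:text>
    <xsl:choose>
      <!-- The attribute node exists: either it was coded in the document
           or the parser read a DTD that supplies a default. -->
      <xsl:when test="@type">
        <xsl:value-of select="@type"/>
      </xsl:when>
      <!-- No attribute node: fall back to the default we assume
           the DTD would have declared. -->
      <xsl:otherwise>Shakespearean</xsl:otherwise>
    </xsl:choose>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Handling the missing attribute in the stylesheet means the result is the same whether or not the parser read the external DTD, at the cost of repeating the default value in two places.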
3.1.4 Text Nodes
Text nodes are refreshingly simple; they contain text from an element. If the original text in the XML document contained entity or character references, they are resolved before the XPath text node is created. The text node is text, pure and simple. A text node is required to contain as much text as possible; the next or previous node can't be a text node.
You might have noticed that there are no CDATA nodes in this list. If your XML document contains text in a CDATA section, you can access the contents of the CDATA section as a
text node. You have no way of knowing if a given text node was originally a CDATA section. Similarly, all entity references are resolved before anything in your stylesheet is
evaluated, so you have no way of knowing if a given piece of text originally contained entity references.
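A small sketch makes the merging visible. The input in the opening comment is a hypothetical element invented for this example; it mixes ordinary text, an entity reference, and a CDATA section.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input (an assumption for illustration):
     <line>a &lt; b <![CDATA[& c < d]]></line>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="line">
    <!-- Ordinary text and CDATA content are merged into one text node. -->
    <xsl:text>text nodes: </xsl:text>
    <xsl:value-of select="count(text())"/>
    <xsl:text>&#10;value: </xsl:text>
    <xsl:value-of select="text()"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For the assumed input, this should report a single text node whose value is a < b & c < d. Nothing in that value records which characters came from the entity reference or from the CDATA section.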
3.1.5 Comment Nodes
A comment node is also very simple: it contains some text. Every comment in the source document (except for comments in the DTD) becomes a comment node. The text of the comment node (selected with the comment() node test) contains everything inside the comment, except the opening <!-- and the closing -->.
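As a rough illustration, the stylesheet below prints the text of every comment in its input, wrapped in square brackets so that leading and trailing whitespace is visible. The brackets are only a formatting choice for this sketch.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- The built-in rule for comments does nothing, so match them explicitly. -->
  <xsl:template match="comment()">
    <xsl:text>[</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>]&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress ordinary text so that only comments are reported. -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

A source comment containing the word draft with one space on each side would be printed as [ draft ]: the delimiters are gone, but the whitespace inside them is kept.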
3.1.6 Processing Instruction Nodes
A processing instruction node has two parts: a name (returned by the name() function) and a string value. The string value is everything that follows the name and the whitespace immediately after it, including any later whitespace, but not including the ?> that closes the processing instruction.
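The two parts of a processing instruction node can be inspected with a sketch like this one, which lists every processing instruction in the input. The "name: [value]" layout is an arbitrary choice made for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- The built-in rule for processing instructions does nothing,
       so match them explicitly. -->
  <xsl:template match="processing-instruction()">
    <xsl:value-of select="name()"/>
    <xsl:text>: [</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>]&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress ordinary text so that only processing instructions appear. -->
  <xsl:template match="text()"/>
</xsl:stylesheet>
```

Note that the XML declaration at the top of a document is not a processing instruction in the XPath data model, so it never shows up in this listing.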
3.1.7 Namespace Nodes
Namespace nodes are almost never used in XSLT stylesheets; they exist primarily for the XSLT processor's benefit. Remember that the declaration of a namespace (such as xmlns:auth="http://www.authors.net"), even though it is technically an attribute in the XML source, becomes a namespace node, not an attribute node.
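Should you ever need to see the difference, a sketch along these lines counts attribute nodes and namespace nodes separately for each element. It relies on the XPath 1.0 namespace axis, and the layout of the report is only an illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="*">
    <xsl:value-of select="name()"/>
    <xsl:text>: </xsl:text>
    <xsl:value-of select="count(@*)"/>
    <xsl:text> attribute node(s), </xsl:text>
    <xsl:value-of select="count(namespace::*)"/>
    <xsl:text> namespace node(s)&#10;</xsl:text>
    <!-- For a namespace node, name() is the prefix
         and the string value is the namespace URI. -->
    <xsl:for-each select="namespace::*">
      <xsl:text>  </xsl:text>
      <xsl:value-of select="name()"/>
      <xsl:text> = </xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
    <xsl:apply-templates select="*"/>
  </xsl:template>
</xsl:stylesheet>
```

For an element written as <auth:author xmlns:auth="http://www.authors.net"/>, a conforming XSLT 1.0 processor should report no attribute nodes and two namespace nodes: one for auth and one for the xml prefix, which is implicitly declared on every element.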
3.2 Location Paths
One of the most common uses of XPath is to create location paths. A location path describes the location of something in an XML document. In our examples in the previous chapter, we used location paths in the match and select attributes of various XSLT elements. Those