

Chapter 5. Creating Links and Cross-References

If you're creating a web site, publishing a book, or creating an XML transaction, chances are many pieces of information will refer to other things. This chapter discusses several ways to link XML elements. It reviews three techniques:

• Using the id function
• Doing more advanced linking with the key function
• Generating links in unstructured documents

5.1 Generating Links with the id Function

Our first attempt at linking will be with the XPath id function.

5.1.1 The ID, IDREF, and IDREFS Datatypes

Three of the basic datatypes supported by XML Document Type Definitions (DTDs) are ID, IDREF, and IDREFS. Here's a simple DTD that illustrates these datatypes:

    <!--glossary.dtd-->
    <!--The containing tag for the entire glossary-->
    <!ELEMENT glossary (glentry+) >

    <!--A glossary entry-->
    <!ELEMENT glentry (term,defn+) >

    <!--The word being defined-->
    <!ELEMENT term (#PCDATA) >

    <!--The id is used for cross-referencing, and the
        xreftext is the text used by cross-references.-->
    <!ATTLIST term
                  id ID #REQUIRED
                  xreftext CDATA #IMPLIED >

    <!--The definition of the term-->
    <!ELEMENT defn (#PCDATA | xref | seealso)* >

    <!--A cross-reference to another term-->
    <!ELEMENT xref EMPTY >

    <!--refid is the ID of the referenced term-->
    <!ATTLIST xref
                  refid IDREF #REQUIRED >

    <!--seealso refers to one or more other definitions-->
    <!ELEMENT seealso EMPTY >
    <!ATTLIST seealso
                  refids IDREFS #REQUIRED >

In this DTD, each term element is required to have an id attribute, and each xref element must have a refid attribute. The ID and IDREF datatypes work according to two rules:

• Each value of the id attribute must be unique within the document.
• Each value of the refid attribute must match a value of an id attribute elsewhere in the document.

To round out our example, the seealso element contains an attribute of type IDREFS. This datatype contains one or more values, each of which must match a value of an ID elsewhere in the document. Multiple values, if present, are separated by whitespace.

There are some complications of ID and related datatypes, but we'll discuss them later. For now, we'll focus on how the id function works.
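To make the ID, IDREF, and IDREFS rules concrete, here is a small sketch of a document that breaks each of them. It is deliberately invalid and is not part of the glossary used in this chapter; the terms and the ids cache, proxy, and gateway are made up for illustration. A validating parser should reject it against glossary.dtd, although the wording of the error messages depends on the parser.

```xml
<?xml version="1.0" ?>
<!DOCTYPE glossary SYSTEM "glossary.dtd">
<!-- Hypothetical, deliberately invalid document: the terms and
     ids below are invented for illustration only. -->
<glossary>
  <glentry>
    <term id="cache">cache</term>
    <!-- Invalid IDREF: no element in this document has the id "proxy" -->
    <defn>Compare with <xref refid="proxy"/>.</defn>
  </glentry>
  <glentry>
    <!-- Invalid ID: the value "cache" is already used by the first term -->
    <term id="cache">cache memory</term>
    <defn>
      See the first entry.
      <!-- Invalid IDREFS: "cache" matches an id, but "gateway" does not;
           every whitespace-separated value must match -->
      <seealso refids="cache gateway"/>
    </defn>
  </glentry>
</glossary>
```

These are validity errors rather than well-formedness errors, so a non-validating parser would typically accept the document without complaint.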

5.1.2 An XML Document in Need of Links

To illustrate the value of linking, we'll use a small glossary written in XML. The glossary contains several glentry elements, each of which contains a single term and one or more defn elements. In addition, a definition is allowed to contain a cross-reference (xref) to another term. Here's a short sample document:

    <?xml version="1.0" ?>
    <!DOCTYPE glossary SYSTEM "glossary.dtd">
    <glossary>
      <glentry>
        <term id="applet">applet</term>
        <defn>
          An application program, written in the Java programming
          language, that can be retrieved from a web server and
          executed by a web browser. A reference to an applet appears
          in the markup for a web page, in the same way that a
          reference to a graphics file appears; a browser retrieves
          an applet in the same way that it retrieves a graphics
          file. For security reasons, an applet's access rights are
          limited in two ways: the applet cannot access the file
          system of the client upon which it is executing, and the
          applet's communication across the network is limited to the
          server from which it was downloaded.
          Contrast with <xref refid="servlet"/>.
          <seealso refids="wildcard-char DMZlong pattern-matching"/>
        </defn>
      </glentry>

      <glentry>
        <term id="DMZlong"
          xreftext="demilitarized zone">demilitarized zone (DMZ)</term>
        <defn>
          In network security, a network that is isolated from, and
          serves as a neutral zone between, a trusted network (for
          example, a private intranet) and an untrusted network (for
          example, the Internet). One or more secure gateways usually
          control access to the DMZ from the trusted or the untrusted
          network.
        </defn>
      </glentry>

      <glentry>
        <term id="DMZ">DMZ</term>
        <defn>
          See <xref refid="DMZlong"/>.
        </defn>
      </glentry>

      <glentry>
        <term id="pattern-matching">pattern-matching character</term>
        <defn>
          A special character such as an asterisk (*) or a question
          mark (?) that can be used to represent zero or more
          characters. Any character or set of characters can replace
          a pattern-matching character.
        </defn>
      </glentry>