Chapter 6 • XML and Data Representation
DTD Example
The following fragment of DTD code defines the production rules for constructing book documents.
Listing 6-2: Example DTD
 1  <?xml version="1.0" encoding="UTF-8"?>
 2  <!DOCTYPE mydoc [
 3    <!ENTITY % first SYSTEM "first.dtd">
 4    <!ENTITY % second SYSTEM "second.dtd">
 5    <!ENTITY % third SYSTEM "third.dtd">
      %first; %second; %third;
 6  ]>
 7  <!ELEMENT book (title, author, chapter+)>
 8  <!ATTLIST book isbn CDATA #IMPLIED>
 9  <!ELEMENT model (datamodel, data_options)>
10  <!ELEMENT datamodel EMPTY>
    . . .
In the above DTD document fragment, Lines 3–5 list three extra DTD documents that will be imported into this one at the time the current document is parsed. Line 7 shows an element definition: the element book consists of one title element, one author element, and at least one chapter element, in that order. Line 8 says that book has an optional attribute, isbn, of the type character data.
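To make the meaning of Lines 7 and 8 concrete, the following minimal Python sketch checks the book content model by hand. It is not a DTD validator (the standard-library ElementTree parser does not validate against DTDs); the function name, the sample documents, and the ISBN value are hypothetical and chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Content model from Line 7 of the listing: title, author, chapter+
BOOK_MODEL = re.compile(r"title,author,(chapter,)+")

def conforms_to_book_rule(xml_text):
    """Check the <book> content model and its optional isbn attribute by hand."""
    root = ET.fromstring(xml_text)
    if root.tag != "book":
        return False
    # Only isbn is declared for book (Line 8), and it is optional (#IMPLIED).
    if any(name != "isbn" for name in root.attrib):
        return False
    # Encode the child sequence as "title,author,chapter," and match it whole.
    children = "".join(child.tag + "," for child in root)
    return BOOK_MODEL.fullmatch(children) is not None

# Hypothetical sample documents.
valid = "<book isbn='0-000'><title>T</title><author>A</author><chapter/><chapter/></book>"
missing_chapter = "<book><title>T</title><author>A</author></book>"

print(conforms_to_book_rule(valid))
print(conforms_to_book_rule(missing_chapter))
```

The first document satisfies the rule; the second is rejected because chapter+ requires at least one chapter element. A validating parser would perform this kind of check for every declaration in the DTD.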
Limitations of DTDs
DTDs provided the first schema language for XML documents. Their limitations include:
• Language inconsistency, since DTD uses a non-XML syntax
• Failure to support namespace integration
• Lack of modular vocabulary design
• Rigid content models that cannot derive new type definitions based on the old ones
• Lack of integration with data-oriented applications
In contrast, XML Schema allows much more expressive and precise specification of the content of XML documents. This flexibility also carries the price of complexity. W3C is making efforts to phase DTDs out. XML Schema is described in Section 6.2 below.
6.1.3 Namespaces
Inventing new languages is an arduous task, so it is beneficial if we can reuse parts of an existing XML language defined by a schema. Also, there are many occasions when an XML document needs to use markup defined in multiple schemas, which may have been developed independently. As a result, some tag names may not be unique.
For example, the word “title” is used to signify the name of a book or work of art, a form of nomenclature indicating a person’s status, the right to ownership of property, etc. People easily figure out context, but computers are very poor at absorbing contextual information. To simplify the computer’s task and give a specific meaning to what might otherwise be an ambiguous term, we qualify the term with an additional identifier: a namespace identifier.
Ivan Marsic • Rutgers University
An XML namespace is a collection of names, used as element names or attribute names; see the examples in Figure 6-2. Namespaces in the C++ programming language and package names in Java serve an equivalent purpose. Using namespaces, you can qualify your elements as members of a particular context, thus eliminating the ambiguity and enabling namespace-aware applications to process your document correctly. In other words:
    Qualified name (QName) = Namespace identifier + Local name
A namespace is declared as an attribute of an element. The general form is as follows:
    <bk:tagName xmlns:bk="http://any.website.net/book" />
Here xmlns is the mandatory string, bk is the prefix, and "http://any.website.net/book" is the namespace name.
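The pairing of a namespace identifier with a local name can be observed in a namespace-aware parser. The sketch below uses Python's standard-library ElementTree, which reports each element tag in the form {namespace}local. The record element, the pr prefix, the sample text, and the helper split_qname are hypothetical and introduced only for illustration; the two namespace names follow Figure 6-2.

```python
import xml.etree.ElementTree as ET

# Two vocabularies that both define "title"; the URIs follow Figure 6-2.
BOOK_NS = "http://any.website.net/book"
PERSON_NS = "http://any.website.net/person"

# Hypothetical document mixing both vocabularies.
doc = f"""<record xmlns:bk="{BOOK_NS}" xmlns:pr="{PERSON_NS}">
  <bk:title>A Book About Namespaces</bk:title>
  <pr:title>Professor</pr:title>
</record>"""

def split_qname(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local name)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag  # the element is in no namespace

root = ET.fromstring(doc)
for child in root:
    # Both children have the local name "title" but different namespaces.
    print(split_qname(child.tag), child.text)
```

Both child elements share the local name title, yet the parser keeps them apart because their namespace identifiers differ.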
There are two forms of namespace declaration because the prefix is optional. The first form binds a prefix to a given namespace name. The prefix can be any string starting with a letter or underscore, followed by any combination of letters, digits, hyphens, periods, and underscores. The colon “:” is excluded because it separates the mandatory string xmlns from the prefix; the string xmlns indicates that we are referring to an XML namespace. The namespace name, which is the attribute value, must be a valid, unique URI. However, since all that is required from the name is its uniqueness, a URL such as http://any.website.net/schema also serves the purpose. Note that this does not have to point to anything in particular; it is merely a way to uniquely label a set of names.
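The role of the namespace name as a mere label can be illustrated with a short Python sketch, again using the standard-library ElementTree. The prefixes a and b, the element name item, and the second URL ending in /other are hypothetical. The parser compares namespace names as strings and never attempts to retrieve them.

```python
import xml.etree.ElementTree as ET

NS = "http://any.website.net/schema"  # only a label; the parser never fetches it

# The same namespace name bound to two different (hypothetical) prefixes.
first = ET.fromstring(f'<a:item xmlns:a="{NS}"/>')
second = ET.fromstring(f'<b:item xmlns:b="{NS}"/>')

# The same prefix bound to a different (hypothetical) namespace name.
third = ET.fromstring('<a:item xmlns:a="http://any.website.net/other"/>')

# Element identity depends on the namespace name, not on the prefix.
print(first.tag == second.tag)
print(first.tag == third.tag)
```

The first comparison yields True and the second False: the prefix is only a local shorthand, while the namespace name is what distinguishes one set of names from another.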
The namespace is in effect within the scope of the element for which it is defined as an attribute. This means that the namespace is effective for all the nested elements, as well. The scoping
properties of XML namespaces are analogous to variable scoping properties in programming languages, such as C++ or Java. The prefix is used to qualify the tag names, as in the following
example:
    Namespace http://any.website.net/book: title, author, chapter, paragraph, figure, caption
    Namespace http://any.website.net/person: title, name, address, email, phone, gender
Figure 6-2: Example XML namespaces providing context to individual names.

Listing 6-3: Example of using namespaces in an XML document.
1  <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
2  <book>
3    <bk:cover xmlns:bk="http://any.website.net/book">
4      <bk:title>A Book About Namespaces</bk:title>
5      <bk:author>Anonymous</bk:author>