Document Type Definition (DTD)

Chapter 6 • XML and Data Representation

DTD Example

The following fragment of DTD code defines the production rules for constructing book documents.

Listing 6-2: Example DTD

 1  <?xml version="1.0" encoding="UTF-8"?>
 2  <!DOCTYPE mydoc [
 3    <!ENTITY % first SYSTEM "first.dtd">
 4    <!ENTITY % second SYSTEM "second.dtd">
 5    <!ENTITY % third SYSTEM "third.dtd"> %first; %second; %third;
 6  ]>
 7  <!ELEMENT book (title, author, chapter+)>
 8  <!ATTLIST book isbn CDATA #IMPLIED>
 9  <!ELEMENT model (datamodel, data_options)>
10  <!ELEMENT datamodel EMPTY>
    . . .

In the above DTD document fragment, Lines 3 to 5 list three extra DTD documents that are imported into this one when the current document is parsed. Line 7 shows an element definition: the element book consists of a title element, an author element, and at least one chapter element, in that order. Line 8 says that book has an optional attribute, isbn, of the type character data.

Limitations of DTDs

DTDs provided the first schema language for XML documents. Their limitations include:
• Language inconsistency, since DTD uses a non-XML syntax
• Failure to support namespace integration
• Lack of modular vocabulary design
• Rigid content models that cannot derive new type definitions from existing ones
• Lack of integration with data-oriented applications

Conversely, XML Schema allows much more expressive and precise specification of the content of XML documents. This flexibility comes at the price of complexity. W3C is making efforts to phase DTDs out. XML Schema is described in Section 6.2 below.
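The book declarations from Lines 7 and 8 of Listing 6-2 can also be inspected programmatically. The following is a minimal sketch in Python, using the expat parser from the standard library. It makes several assumptions that are not part of the original listing: the declarations are placed in the internal subset of a sample document, the three external entities are left out because their files are not available, and title, author, and chapter are declared as plain text. The sample content of the book element is made up for illustration.

```python
import xml.parsers.expat as expat

# Internal-subset version of the book rules from Listing 6-2.
# Assumptions: external entities (first.dtd, ...) are omitted, and
# title, author, and chapter are declared as plain text (#PCDATA).
DOC = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE book [
  <!ELEMENT book (title, author, chapter+)>
  <!ATTLIST book isbn CDATA #IMPLIED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT chapter (#PCDATA)>
]>
<book>
  <title>A Book About Namespaces</title>
  <author>Anonymous</author>
  <chapter>Chapter one text</chapter>
</book>
"""

element_models = {}  # element name -> content model tuple
attributes = []      # (element, attribute, type, default, required)


def on_element_decl(name, model):
    """Record the content model of each <!ELEMENT ...> declaration."""
    element_models[name] = model


def on_attlist_decl(elname, attname, type_, default, required):
    """Record each attribute declared by an <!ATTLIST ...> declaration."""
    attributes.append((elname, attname, type_, default, bool(required)))


parser = expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.AttlistDeclHandler = on_attlist_decl
parser.Parse(DOC, True)

# A content model is a tuple: (type, quantifier, name, children).
_, _, _, children = element_models["book"]
child_names = [child[2] for child in children]
# The "+" after chapter is reported as the PLUS quantifier.
chapter_repeats = children[2][1] == expat.model.XML_CQUANT_PLUS

print(child_names)
print(chapter_repeats)
print(attributes)
```

Note that expat is a non-validating parser: it reports the declarations, but it does not reject a document that violates them. Checking a document against the DTD would require a validating parser.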

6.1.3 Namespaces

Inventing new languages is an arduous task, so it is beneficial to reuse parts of an existing XML language defined by a schema. Also, there are many occasions when an XML document needs to use markup defined in multiple schemas, which may have been developed independently. As a result, some tag names may not be unique. For example, the word “title” is used to signify the name of a book or work of art, a form of nomenclature indicating a person’s status, the right to ownership of property, etc. People easily figure out context, but computers are very poor at absorbing contextual information. To simplify the computer’s task and give a specific meaning to what might otherwise be an ambiguous term, we qualify the term with an additional identifier: a namespace identifier.

Ivan Marsic • Rutgers University

An XML namespace is a collection of names, used as element names or attribute names; see the examples in Figure 6-2. The C++ programming language also defines namespaces, and Java package names play an equivalent role. Using namespaces, you can qualify your elements as members of a particular context, thus eliminating the ambiguity and enabling namespace-aware applications to process your document correctly. In other words:

    Qualified name (QName) = Namespace identifier + Local name

A namespace is declared as an attribute of an element. The general form is as follows:

    <bk:tagName xmlns:bk="http://any.website.net/book">

Here xmlns is the mandatory string, bk is the prefix, and http://any.website.net/book is the namespace name.

There are two forms of namespace declarations because the prefix is optional. The first form binds a prefix to a given namespace name. The prefix can be any string starting with a letter, followed by any combination of digits, letters, and punctuation signs except for the colon “:”. The colon is reserved because it separates the prefix from the mandatory string xmlns, which indicates that we are referring to an XML namespace.
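The relation QName = Namespace identifier + Local name can be observed with a namespace-aware parser. The following is a minimal sketch in Python using the standard xml.etree.ElementTree module; the sample document, the helper split_qname, and the lookup prefix b are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Sample document (made up for illustration) that binds the prefix "bk".
DOC = """<bk:cover xmlns:bk="http://any.website.net/book">
  <bk:title>A Book About Namespaces</bk:title>
</bk:cover>"""

root = ET.fromstring(DOC)


def split_qname(tag):
    """Split ElementTree's '{namespace}local' tag form into two parts."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag  # the element is not in any namespace


namespace, local = split_qname(root.tag)
print(namespace)
print(local)

# The prefix is only a shorthand for the namespace name. A lookup may use
# a different prefix ("b" here), as long as it maps to the same name.
lookup = {"b": "http://any.website.net/book"}
title = root.find("b:title", lookup)
print(title.text)
```

The parser discards the prefix and keeps the namespace name, which suggests why only the namespace name, and not the prefix, needs to be unique.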
The namespace name, which is the attribute value, must be a valid, unique URI. However, since all that is required from the name is its uniqueness, a URL such as http://any.website.net/schema also serves the purpose. Note that this does not have to point to anything in particular; it is merely a way to uniquely label a set of names.

The namespace is in effect within the scope of the element for which it is defined as an attribute. This means that the namespace is effective for all the nested elements as well. The scoping properties of XML namespaces are analogous to variable scoping properties in programming languages, such as C++ or Java.

Figure 6-2: Example XML namespaces providing context to individual names.
    http://any.website.net/book: title, author, chapter, paragraph, figure, caption
    http://any.website.net/person: title, name, address, email, phone, gender

The prefix is used to qualify the tag names, as in the following example:

Listing 6-3: Example of using namespaces in an XML document.

 1  <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
 2  <book>
 3    <bk:cover xmlns:bk="http://any.website.net/book">
 4      <bk:title>A Book About Namespaces</bk:title>
 5      <bk:author>Anonymous</bk:author>