A sample DOM tree

page 18 The image in Figure 1-2 was produced by the DOMit servlet, an XML validation service available at http:www- 106.ibm.comdeveloperworks featuresxmlvalidatorform.html . If we want to find different parts of our XML document, sort the subtrees based on the first character of the text of the postalcode element, or select only the subtrees in which the text of the usage-count element has a numeric value greater than 500, we have to start at the top of the DOM tree and work our way down through the root elements descendants. When we write XSLT stylesheets, we also start at the root of the tree and work our way down. To be honest, the DOM tree built for our document is more complicated than our beautiful picture indicates. The whitespace characters in our document carriage returnline feed, tabs, spaces, etc. become Text nodes. Normally its a good idea to remove this whitespace so the DOM tree wont be littered with these useless Text nodes, but I included them here to give you a sense of the XML documents structure.

1.2.3.2 SAX

The Simple API for XML was developed by David Megginson and others on the XML-DEV mailing list. It has several important differences from DOM: • The SAX API is interactive. As a SAX parser processes your document, it sends events to your code. You dont have to wait for the parser to finish the entire document as you do with the DOM; you get events from the parser immediately. These events let you know when the parser finds the start of the document, the start of an element, some text, the end of an element, a processing instruction, the end of the document, etc. • SAX is designed to avoid the large memory footprint of DOM. In the SAX world, youre told when the parser finds things in the XML document; its up to you to save those things. If you dont do anything to store the data found by the parser, it goes into the bit bucket. • SAX doesnt provide the hierarchical view of the document that DOM does. If you need to know a lot about the structure of an XML document and the context of a given element, SAX isnt much help. Each SAX event is stateless; that is, a SAX event wont tell you, Heres some text for the postalcode element I mentioned earlier. A SAX parser only tells you, Heres some text. If you need to know about an XML documents structure, you have to keep track of that information yourself. The best thing about SAX is that it is interactive. Most of the transformations currently done with XSLT take place on the server. As of this writing, most XSLT processors are based on DOM parsers. In the near future, however, well see XSLT processors based on SAX parsers. This means that the processor can start generating results almost as soon as the parse of the source document begins, resulting in better throughput and creating the perception of faster page 19 service. Because DOM, XPath, and XSLT all use trees to represent XML documents, DOM is more relevant to our discussions here. Nevertheless, its useful to know how SAX parsers