
6-2 Programming XML for Oracle WebLogic Server

6.2 Increasing Performance of XML Validation

If parser validation is slowing down your XML application, and you still need to validate your XML documents, you might improve performance by writing your own code that validates the data as it is received or parsed, rather than using the setValidating method of the DocumentBuilderFactory or SAXParserFactory. When you turn on validation while parsing an XML document with SAX or DOM, the parser might validate more of the document than you really need, which decreases the overall performance of the application. Instead, consider choosing specific points during parsing at which you want to check that the XML document is valid, and add your own Java code at those points.

For example, assume you are writing an application that uses the WebLogic XML Streaming API to process an XML purchase order. Because you know that the first element of the document should be purchase_order, you can quickly verify that the document appears to be valid by pulling the first element off the stream and checking its name. This check does not ensure that the entire XML document is valid, of course, but you can continue checking for known elements as you pull elements from the stream. These quick checks are much faster than full parser validation enabled through the standard setValidating methods.
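The purchase order check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the standard StAX API (javax.xml.stream) as a stand-in for the WebLogic XML Streaming API, and the class name PurchaseOrderCheck and the method rootElementIs are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class PurchaseOrderCheck {

    // Returns true if the first element pulled off the stream has the
    // expected name. Assumption: standard StAX is used here in place of
    // the WebLogic XML Streaming API.
    static boolean rootElementIs(String xml, String expectedName) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader reader =
                    factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamReader.START_ELEMENT) {
                        // Only the first start tag is inspected; the rest
                        // of the document is not read or validated.
                        return expectedName.equals(reader.getLocalName());
                    }
                }
                return false; // no element found
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            return false; // not well-formed up to the first element
        }
    }

    public static void main(String[] args) {
        String order = "<purchase_order><item/></purchase_order>";
        System.out.println(rootElementIs(order, "purchase_order"));
        System.out.println(rootElementIs("<invoice/>", "purchase_order"));
    }
}
```

The same pattern can be repeated for other known elements as they are pulled from the stream, which keeps each check cheap but, as noted above, does not amount to full validation.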

6.3 When to Use XML Schemas or DTDs

There are two ways to describe the structure of an XML document: DTDs and XML Schemas. The current trend is to use Schemas to describe XML documents. Schemas are much more expressive than DTDs because the set of available data types to describe XML elements is much richer and you can describe more specifically what is valid in an XML document. In addition, you can only use Schemas, and not DTDs, in SOAP messages. Because SOAP is the main messaging protocol used in Web services, consider using Schemas to describe any XML documents that might be used as either input or output parameters to Web services. Still, DTDs have a few advantages. DTDs are more widely supported than Schemas, although that is changing rapidly. Because DTDs are less expressive than Schemas, they are easier to write and manage. However, Oracle recommends that you use Schemas to describe your XML documents.

6.4 Configuring External Entity Resolution for Maximum Performance

Oracle highly recommends that you store external entities locally whenever possible rather than always retrieving them over the network. Storing entities locally improves the performance of your applications because it is much faster to look up an entity on the same machine as WebLogic Server than it is to look it up over a network connection. For detailed information on configuring external entity resolution for WebLogic Server, see Section 9.3, External Entity Configuration Tasks.
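To illustrate the general idea outside of WebLogic Server's own configuration mechanism, the following minimal sketch uses a plain SAX entity resolver that serves a locally held copy of a DTD instead of fetching it. Everything specific here is an assumption made for the example: the class name LocalEntityExample, the URL http://example.com/dtds/order.dtd, the DTD content, and the in-memory map that stands in for files stored on the local machine.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class LocalEntityExample {

    // Hypothetical system ID referenced by incoming documents.
    static final String ORDER_DTD_URL = "http://example.com/dtds/order.dtd";

    // Local copies of external entities, keyed by system ID. An in-memory
    // stand-in (assumption) for entities stored on the local machine.
    static final Map<String, String> LOCAL_COPIES = new HashMap<>();
    static {
        LOCAL_COPIES.put(ORDER_DTD_URL,
                "<!ELEMENT order (#PCDATA)><!ENTITY vendor \"Acme\">");
    }

    // Parses the document and returns its text content, resolving known
    // external entities from LOCAL_COPIES rather than over the network.
    static String parseWithLocalEntities(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId)
                    throws IOException, SAXException {
                String local = LOCAL_COPIES.get(systemId);
                if (local == null) {
                    return null; // unknown entity: use the parser's default lookup
                }
                InputSource source = new InputSource(new StringReader(local));
                source.setSystemId(systemId);
                return source;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE order SYSTEM \"" + ORDER_DTD_URL + "\">"
                + "<order>&vendor;</order>";
        System.out.println(parseWithLocalEntities(xml));
    }
}
```

In this sketch the vendor entity is declared only in the locally held DTD, so the document's text can be expanded without a network request for the DTD. Within WebLogic Server itself, the configuration tasks referenced above are the supported way to achieve this.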

6.5 Using SAX InputSources

When you use the SAX API to parse an XML document, you first create an InputSource object from the XML document and then pass the InputSource object to the parse method. You can create the InputSource object from either a java.io.InputStream or a java.io.Reader object based on your XML data.

Oracle recommends that you create an InputSource from a java.io.InputStream object whenever possible. When passed an InputStream object, the SAX parser auto-detects the character encoding of the XML data and automatically instantiates an InputStreamReader object for you, using the correct character encoding. In other words, the parser does all the character encoding work for you, which is more likely to be error-free at runtime than if you decide to specify the character encoding yourself.
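A minimal sketch of this recommendation, assuming the standard JAXP SAX parser, is shown below. The class name InputSourceFromStream and the method parseText are hypothetical. The document bytes are encoded as ISO-8859-1 and declare that encoding in the XML declaration; the code never names the encoding when it builds the InputSource, leaving detection to the parser.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class InputSourceFromStream {

    // Parses raw XML bytes and returns the concatenated character data.
    // The InputSource wraps an InputStream, so the parser determines the
    // character encoding from the byte stream and the XML declaration.
    static String parseText(byte[] xmlBytes) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        try (InputStream in = new ByteArrayInputStream(xmlBytes)) {
            InputSource source = new InputSource(in);
            SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        }
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<name>caf\u00e9</name>";
        // Encode the document as ISO-8859-1 bytes, matching its declaration.
        byte[] bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(parseText(bytes));
    }
}
```

Had the same bytes been wrapped in a Reader created with the wrong charset, the parser would have received already-decoded (and possibly corrupted) characters, which is the kind of runtime error the recommendation is meant to avoid.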

6.6 Improving Performance of Transformations

XSLT is a language for transforming an XML document into a different format, such as another XML document, HTML, WML, and so on. To use XSLT, you create a stylesheet that defines how each element in the input XML document should be transformed in the output document. Although XSLT is a powerful language, creating stylesheets for complex transformations can be very complicated. In addition, the actual transformation requires a lot of resources and might decrease the performance of your application. Therefore, if your transformations are complex, consider writing your own transformation code in your application rather than using XSLT stylesheets. Also consider using the DOM API. First parse the XML document, manipulate the resulting DOM tree as needed, then write out the new document, using custom Java code to transform it into its final format.

7 XML Programming Techniques

The following sections provide information about specific XML programming techniques for developing a J2EE application that processes XML data:

■ Section 7.1, Transmitting XML Data Between A Java Client and WebLogic Server