
XML Bible


Elliotte Rusty Harold

IDG Books Worldwide, Inc.
An International Data Group Company
Foster City, CA ✦ Chicago, IL ✦ Indianapolis, IN ✦ New York, NY

XML™ Bible
Published by

IDG Books Worldwide, Inc.
An International Data Group Company
919 E. Hillsdale Blvd., Suite 400
Foster City, CA 94404
www.idgbooks.com (IDG Books Worldwide Web site)
Copyright © 1999 IDG Books Worldwide, Inc. All rights
reserved. No part of this book, including interior
design, cover design, and icons, may be reproduced or
transmitted in any form, by any means (electronic,
photocopying, recording, or otherwise) without the
prior written permission of the publisher.
ISBN: 0-7645-3236-7
Printed in the United States of America
10 9 8 7 6 5 4 3 2 1
1O/QV/QY/ZZ/FC
Distributed in the United States by IDG Books
Worldwide, Inc.
Distributed by CDG Books Canada Inc. for Canada; by
Transworld Publishers Limited in the United Kingdom;
by IDG Norge Books for Norway; by IDG Sweden Books
for Sweden; by IDG Books Australia Publishing
Corporation Pty. Ltd. for Australia and New Zealand; by
TransQuest Publishers Pte Ltd. for Singapore,
Malaysia, Thailand, Indonesia, and Hong Kong; by
Gotop Information Inc. for Taiwan; by ICG Muse, Inc.
for Japan; by Norma Comunicaciones S.A. for
Colombia; by Intersoft for South Africa; by Eyrolles for
France; by International Thomson Publishing for
Germany, Austria and Switzerland; by Distribuidora
Cuspide for Argentina; by Livraria Cultura for Brazil; by
Ediciones ZETA S.C.R. Ltda. for Peru; by WS Computer
Publishing Corporation, Inc., for the Philippines; by
Contemporanea de Ediciones for Venezuela; by
Express Computer Distributors for the Caribbean and
West Indies; by Micronesia Media Distributor, Inc. for
Micronesia; by Grupo Editorial Norma S.A. for
Guatemala; by Chips Computadoras S.A. de C.V. for
Mexico; by Editorial Norma de Panama S.A. for
Panama; by American Bookshops for Finland.
Authorized Sales Agent: Anthony Rudkin Associates for
the Middle East and North Africa.

For general information on IDG Books Worldwide’s
books in the U.S., please call our Consumer Customer
Service department at 800-762-2974. For reseller
information, including discounts and premium sales,
please call our Reseller Customer Service department
at 800-434-3422.
For information on where to purchase IDG Books
Worldwide’s books outside the U.S., please contact our
International Sales department at 317-596-5530 or fax
317-596-5692.
For consumer information on foreign language
translations, please contact our Customer Service
department at 800-434-3422, fax 317-596-5692, or e-mail
[email protected].
For information on licensing foreign or domestic rights,
please phone +1-650-655-3109.
For sales inquiries and special prices for bulk
quantities, please contact our Sales department at
650-655-3200 or write to the address above.
For information on using IDG Books Worldwide’s books
in the classroom or for ordering examination copies,
please contact our Educational Sales department at
800-434-2086 or fax 317-596-5499.
For press review copies, author interviews, or other
publicity information, please contact our Public
Relations department at 650-655-3000 or fax
650-655-3299.
For authorization to photocopy items for corporate,
personal, or educational use, please contact Copyright
Clearance Center, 222 Rosewood Drive, Danvers, MA
01923, or fax 978-750-4470.
Library of Congress Cataloging-in-Publication Data
Harold, Elliotte Rusty.
XML bible / Elliotte Rusty Harold.
p. cm.
ISBN 0-7645-3236-7 (alk. paper)
1. XML (Document markup language) I. Title.
QA76.76.H94H34 1999
005.7’2--dc21 99-31021
CIP


LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND AUTHOR HAVE USED THEIR BEST
EFFORTS IN PREPARING THIS BOOK. THE PUBLISHER AND AUTHOR MAKE NO REPRESENTATIONS OR
WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS BOOK
AND SPECIFICALLY DISCLAIM ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
PARTICULAR PURPOSE. THERE ARE NO WARRANTIES WHICH EXTEND BEYOND THE DESCRIPTIONS
CONTAINED IN THIS PARAGRAPH. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES
REPRESENTATIVES OR WRITTEN SALES MATERIALS. THE ACCURACY AND COMPLETENESS OF THE
INFORMATION PROVIDED HEREIN AND THE OPINIONS STATED HEREIN ARE NOT GUARANTEED OR
WARRANTED TO PRODUCE ANY PARTICULAR RESULTS, AND THE ADVICE AND STRATEGIES CONTAINED
HEREIN MAY NOT BE SUITABLE FOR EVERY INDIVIDUAL. NEITHER THE PUBLISHER NOR AUTHOR SHALL
BE LIABLE FOR ANY LOSS OF PROFIT OR ANY OTHER COMMERCIAL DAMAGES, INCLUDING BUT NOT
LIMITED TO SPECIAL, INCIDENTAL, CONSEQUENTIAL, OR OTHER DAMAGES.
Trademarks: All brand names and product names used in this book are trade names, service marks, trademarks,
or registered trademarks of their respective owners. IDG Books Worldwide is not associated with any product or
vendor mentioned in this book.
is a registered trademark or trademark under exclusive license
to IDG Books Worldwide, Inc. from International Data Group, Inc.
in the United States and/or other countries.


Welcome to the world of IDG Books Worldwide.
IDG Books Worldwide, Inc., is a subsidiary of International Data Group, the world’s largest publisher of
computer-related information and the leading global provider of information services on information technology.
IDG was founded more than 30 years ago by Patrick J. McGovern and now employs more than 9,000 people
worldwide. IDG publishes more than 290 computer publications in over 75 countries. More than 90 million
people read one or more IDG publications each month.
Launched in 1990, IDG Books Worldwide is today the #1 publisher of best-selling computer books in the
United States. We are proud to have received eight awards from the Computer Press Association in recognition
of editorial excellence and three from Computer Currents’ First Annual Readers’ Choice Awards. Our bestselling ...For Dummies® series has more than 50 million copies in print with translations in 31 languages. IDG
Books Worldwide, through a joint venture with IDG’s Hi-Tech Beijing, became the first U.S. publisher to
publish a computer book in the People’s Republic of China. In record time, IDG Books Worldwide has become
the first choice for millions of readers around the world who want to learn how to better manage their
businesses.
Our mission is simple: Every one of our books is designed to bring extra value and skill-building instructions
to the reader. Our books are written by experts who understand and care about our readers. The knowledge
base of our editorial staff comes from years of experience in publishing, education, and journalism —
experience we use to produce books to carry us into the new millennium. In short, we care about books, so
we attract the best people. We devote special attention to details such as audience, interior design, use of
icons, and illustrations. And because we use an efficient process of authoring, editing, and desktop publishing
our books electronically, we can spend more time ensuring superior content and less time on the technicalities
of making books.
You can count on our commitment to deliver high-quality books at competitive prices on topics you want
to read about. At IDG Books Worldwide, we continue in the IDG tradition of delivering quality for more than
30 years. You’ll find no better book on a subject than one from IDG Books Worldwide.

John Kilcullen
Chairman and CEO
IDG Books Worldwide, Inc.

Steven Berkowitz
President and Publisher
IDG Books Worldwide, Inc.

Eighth Annual Computer Press Awards, 1992
Ninth Annual Computer Press Awards, 1993
Tenth Annual Computer Press Awards, 1994
Eleventh Annual Computer Press Awards, 1995

IDG is the world’s leading IT media, research and exposition company. Founded in 1964, IDG had 1997 revenues of $2.05
billion and has more than 9,000 employees worldwide. IDG offers the widest range of media options that reach IT buyers
in 75 countries representing 95% of worldwide IT spending. IDG’s diverse product and services portfolio spans six key areas
including print publishing, online publishing, expositions and conferences, market research, education and training, and
global marketing services. More than 90 million people read one or more of IDG’s 290 magazines and newspapers, including
IDG’s leading global brands — Computerworld, PC World, Network World, Macworld and the Channel World family of
publications. IDG Books Worldwide is one of the fastest-growing computer book publishers in the world, with more than
700 titles in 36 languages. The “...For Dummies®” series alone has more than 50 million copies in print. IDG offers online
users the largest network of technology-specific Web sites around the world through IDG.net (http://www.idg.net), which
comprises more than 225 targeted Web sites in 55 countries worldwide. International Data Corporation (IDC) is the world’s
largest provider of information technology data, analysis and consulting, with research centers in over 41 countries and more
than 400 research analysts worldwide. IDG World Expo is a leading producer of more than 168 globally branded conferences
and expositions in 35 countries including E3 (Electronic Entertainment Expo), Macworld Expo, ComNet, Windows World
Expo, ICE (Internet Commerce Expo), Agenda, DEMO, and Spotlight. IDG’s training subsidiary, ExecuTrain, is the world’s
largest computer training company, with more than 230 locations worldwide and 785 training courses. IDG Marketing
Services helps industry-leading IT companies build international brand recognition by developing global integrated marketing
programs via IDG’s print, online and exposition products worldwide. Further information about the company can be found
at www.idg.com.
1/24/99

Credits

Acquisitions Editor
John Osborn

Development Editor
Terri Varveris

Contributing Writer
Heather Williamson

Technical Editor
Greg Guntle

Copy Editors
Amy Eoff
Amanda Kaufman
Nicole LeClerc
Victoria Lee

Production
IDG Books Worldwide Production

Proofreading and Indexing
York Production Services

About the Author
Elliotte Rusty Harold is an internationally respected writer, programmer, and
educator both on the Internet and off. He got his start by writing FAQ lists for the
Macintosh newsgroups on Usenet, and has since branched out into books, Web
sites, and newsletters. He lectures about Java and object-oriented programming
at Polytechnic University in Brooklyn. His Cafe con Leche Web site at
http://metalab.unc.edu/xml/ has become one of the most popular independent XML
sites on the Internet.
Elliotte is originally from New Orleans where he returns periodically in search of
a decent bowl of gumbo. However, he currently resides in the Prospect Heights
neighborhood of Brooklyn with his wife Beth and cats Charm (named after the
quark) and Marjorie (named after his mother-in-law). When not writing books, he
enjoys working on genealogy, mathematics, and quantum mechanics. His previous
books include The Java Developer’s Resource, Java Network Programming, Java
Secrets, JavaBeans, XML: Extensible Markup Language, and Java I/O.

For Ma, a great grandmother

Preface
Welcome to the XML Bible. After reading this book I hope you’ll agree with me that
XML is the most exciting development on the Internet since Java, and that it makes
Web site development easier, more productive, and more fun.
This book is your introduction to the exciting and fast growing world of XML. In this
book, you’ll learn how to write documents in XML and how to use style sheets to
convert those documents into HTML so legacy browsers can read them. You’ll
also learn how to use document type definitions (DTDs) to describe and validate
documents. This will become increasingly important as more and more browsers like
Mozilla and Internet Explorer 5.0 provide native support for XML.

About You the Reader
Unlike most other XML books on the market, the XML Bible covers XML not from
the perspective of a software developer, but rather that of a Web-page author. I
don’t spend a lot of time discussing BNF grammars or parsing element trees.
Instead, I show you how you can use XML and existing tools today to more
efficiently produce attractive, exciting, easy-to-use, easy-to-maintain Web sites
that keep your readers coming back for more.
This book is aimed directly at Web-site developers. I assume you want to use XML
to produce Web sites that are difficult to impossible to create with raw HTML. You’ll
be amazed to discover that in conjunction with style sheets and a few free tools,
XML enables you to do things that previously required either custom software
costing hundreds to thousands of dollars per developer, or extensive knowledge
of programming languages like Perl. None of the software in this book will cost
you more than a few minutes of download time. None of the tricks require any
programming.

What You Need to Know
XML does build on HTML and the underlying infrastructure of the Internet. To that
end, I will assume you know how to ftp files, send email, and load URLs in your
Web browser of choice. I will also assume you have a reasonable knowledge of
HTML at about the level supported by Netscape 1.1. On the other hand, when I
discuss newer aspects of HTML that are not yet in widespread use like cascading
style sheets, I will cover them in depth.


To be more specific, in this book I assume that you can:

✦ Write a basic HTML page including links, images, and text using a text editor.
✦ Place that page on a Web server.

On the other hand, I do not assume that you:

✦ Know SGML. In fact, this preface is almost the only place in the entire book
you’ll see the word SGML used. XML is supposed to be simpler and more
widespread than SGML. It can’t be that if you have to learn SGML first.

✦ Are a programmer, whether of Java, Perl, C, or some other language. XML is
a markup language, not a programming language. You don’t need to be a
programmer to write XML documents.

What You’ll Learn
This book has one primary goal: to teach you to write XML documents for the Web.
Fortunately, XML has a decidedly flat learning curve, much like HTML (and unlike
SGML). As you learn a little you can do a little. As you learn a little more, you can do
a little more. Thus the chapters in this book build steadily on each other. They are
meant to be read in sequence. Along the way you’ll learn:

✦ How an XML document is created and delivered to readers.
✦ How semantic tagging makes XML documents easier to maintain and develop
than their HTML equivalents.
✦ How to post XML documents on Web servers in a form everyone can read.
✦ How to make sure your XML is well-formed.
✦ How to use international characters like _ and _ in your documents.
✦ How to validate documents with DTDs.
✦ How to use entities to build large documents from smaller parts.
✦ How attributes describe data.
✦ How to work with non-XML data.
✦ How to format your documents with CSS and XSL style sheets.
✦ How to connect documents with XLinks and XPointers.
✦ How to merge different XML vocabularies with namespaces.
✦ How to write metadata for Web pages using RDF.

In the final section of this book, you’ll see several practical examples of XML being
used for real-world applications including:

✦ Web Site Design
✦ Push
✦ Vector Graphics
✦ Genealogy

How the Book Is Organized
This book is divided into five parts and includes three appendixes:

I. Introducing XML
II. Document Type Definitions
III. Style Languages
IV. Supplemental Technologies
V. XML Applications

By the time you’re finished reading this book, you’ll be ready to use XML to create
compelling Web pages. The five parts and the appendixes are described below.

Part I: Introducing XML
Part I consists of Chapters 1 through 7. It begins with the history and theory behind
XML and the goals XML is trying to achieve, and shows you how the different pieces of
the XML equation fit together to create and deliver documents to readers. You’ll see
several compelling examples of XML applications to give you some idea of the wide
applicability of XML, including the Vector Markup Language (VML), the Resource
Description Framework (RDF), the Mathematical Markup Language (MathML), the
Extensible Forms Description Language (XFDL), and many others. Then you’ll learn
by example how to write XML documents with tags you define that make sense for
your document. You’ll see how to edit them in a text editor, attach style sheets to
them, and load them into a Web browser like Internet Explorer 5.0 or Mozilla. You’ll
even learn how you can write XML documents in languages other than English,
even languages that aren’t written remotely like English, such as Chinese, Hebrew,
and Russian.
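
As a rough illustration of author-defined tags holding non-English text, here is a minimal sketch. It is not taken from the book: the element names (greetings, greeting), the language attribute, and the sample words are made up for the example, and Python's standard xml.etree.ElementTree module is used only as a convenient way to read the document back.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element and attribute names are invented
# for illustration; XML lets the author choose tags that fit the data.
doc = """<?xml version="1.0" encoding="UTF-8"?>
<greetings>
  <greeting language="Russian">Привет</greeting>
  <greeting language="Hebrew">שלום</greeting>
  <greeting language="Chinese">你好</greeting>
</greetings>"""

# Encode to bytes so the parser honors the declared UTF-8 encoding.
root = ET.fromstring(doc.encode("utf-8"))

# Map each language attribute to the text content of its element.
greetings = {g.get("language"): g.text for g in root.findall("greeting")}

for language, text in greetings.items():
    print(language, text)
```

The same markup could be typed into any text editor that saves UTF-8; the parsing step here only confirms that the characters survive intact.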


Part II: Document Type Definitions
Part II consists of Chapters 8 through 11, all of which focus on document type
definitions (DTDs). An XML document may optionally contain a DTD that specifies
which elements are and are not allowed in an XML document. The DTD specifies
the exact context and structure of those elements. A validating parser can read a
document and compare it to its DTD, and report any mistakes it finds. This enables
document authors to make sure that their work meets any necessary criteria.
In Part II, you’ll learn how to attach a DTD to a document, how to validate your
documents against their DTDs, and how to write your own DTDs that solve your
own problems. You’ll learn the syntax for declaring elements, attributes, entities,
and notations. You’ll see how you can use entity declarations and entity references
to build both a document and its DTD from multiple, independent pieces. This
allows you to make long, hard-to-follow documents much simpler by separating
them into related modules and components. And you’ll learn how to integrate other
forms of data like raw text and GIF image files in your XML document.
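
The idea of a DTD that declares both elements and an entity can be sketched as follows. This is an illustrative assumption, not an example from the book: the letter, body, and signature elements and the author entity are hypothetical. Note also that Python's xml.etree.ElementTree is a non-validating parser, so it expands the entity reference but does not check the document against the element declarations.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with a DTD in its internal subset.
# The DTD declares three elements and one general entity.
doc = """<?xml version="1.0"?>
<!DOCTYPE letter [
  <!ELEMENT letter (body, signature)>
  <!ELEMENT body (#PCDATA)>
  <!ELEMENT signature (#PCDATA)>
  <!ENTITY author "Elliotte Rusty Harold">
]>
<letter>
  <body>Thanks for reading.</body>
  <signature>&author;</signature>
</letter>"""

# The parser replaces the &author; reference with the entity's text.
# It does NOT validate the content against the ELEMENT declarations.
root = ET.fromstring(doc)
signature = root.findtext("signature")
print(signature)
```

Changing the entity declaration in one place changes every reference to it, which is the property that makes entities useful for assembling documents from reusable pieces.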

Part III: Style Languages
Part III consists of Chapters 12 through 15. XML markup only specifies what’s in a
document. Unlike HTML, it does not say anything about what that content should
look like. Information about an XML document’s appearance when printed, viewed
in a Web browser, or otherwise displayed is stored in a style sheet. Different style
sheets can be used for the same document. You might, for instance, want to use a
style sheet that specifies small fonts for printing, another one that uses larger fonts
for on-screen use, and a third with absolutely humongous fonts to project the
document on a wall at a seminar. You can change the appearance of an XML
document by choosing a different style sheet without touching the document itself.
Part III describes in detail the two style sheet languages in broadest use on the
Web, Cascading Style Sheets (CSS) and the Extensible Style Language (XSL).
CSS is a simple style-sheet language originally designed for use with HTML. CSS
exists in two versions: CSS Level 1 and CSS Level 2. CSS Level 1 provides basic
information about fonts, color, positioning, and text properties, and is reasonably
well supported by current Web browsers for HTML and XML. CSS Level 2 is a more
recent standard that adds support for aural style sheets, user interface styles,
international and bi-directional text, and more. CSS is a relatively simple standard
that applies fixed style rules to the contents of particular elements.
XSL, by contrast, is a more complicated and more powerful style language that can
not only apply styles to the contents of elements but can also rearrange elements, add
boilerplate text, and transform documents in almost arbitrary ways. XSL is divided
into two parts: a transformation language for converting XML trees to alternative
trees, and a formatting language for specifying the appearance of the elements of an
XML tree. Currently, the transformation language is better supported by most tools
than the formatting language. Nonetheless, the formatting language is beginning to
firm up, and is supported by Microsoft Internet Explorer 5.0 and some third-party
formatting engines.

Part IV: Supplemental Technologies
Part IV consists of Chapters 16 through 19. It introduces some XML-based languages
and syntaxes that layer on top of basic XML. XLinks provide multi-directional
hypertext links that are far more powerful than the simple HTML tag. XPointers
introduce a new syntax you can attach to the end of URLs to link not only to
particular documents, but to particular parts of particular documents. Namespaces use
prefixes and URLs to disambiguate conflicting XML markup languages. The Resource
Description Framework (RDF) is an XML application used to embed meta-data in
XML and HTML documents. Meta-data is information about a document, such as the
author, date, and title of a work, rather than the work itself. All of these can be added
to your own XML-based markup languages to extend their power and utility.
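
To make the namespace idea concrete, here is a minimal sketch under stated assumptions. The prefixes bk and art, the example.com URLs, and the catalog document are all hypothetical; they simply show two vocabularies that both define a title element. Python's xml.etree.ElementTree reports each element name together with its namespace URL, so the two titles are no longer ambiguous.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two vocabularies that both use "title".
# The prefixes and URLs are placeholders chosen for illustration.
doc = """<catalog xmlns:bk="http://example.com/ns/books"
         xmlns:art="http://example.com/ns/art">
  <bk:title>XML Bible</bk:title>
  <art:title>Starry Night</art:title>
</catalog>"""

root = ET.fromstring(doc)

# ElementTree expands each prefix to its URL, written as {URL}localname.
tags = [child.tag for child in root]
for tag in tags:
    print(tag)
```

The prefix itself is only shorthand; it is the URL bound to the prefix that distinguishes one title from the other.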

Part V: XML Applications
Part V, which consists of Chapters 20–23, shows you four practical uses of XML in
different domains. XHTML is a reformulation of HTML 4.0 as valid XML. Microsoft’s
Channel Definition Format (CDF) is an XML-based markup language for defining
channels that can push updated Web site content to subscribers. The Vector
Markup Language (VML) is an XML application for scalable graphics used by
Microsoft Office 2000 and Internet Explorer 5.0. Finally, a completely new application is
developed for genealogical data to show you not just how to use XML tags, but why
and when to choose them.

Appendixes
This book has two appendixes, which focus on the formal specifications for XML, as
opposed to the more informal description of it used throughout the rest of the
book. Appendix A provides detailed explanations of three individual parts of the
XML 1.0 specification: the XML BNF grammar, the well-formedness constraints, and the
validity constraints. Appendix B contains the official XML 1.0 specification
published by the W3C. The book also has a third appendix, Appendix C, which
describes the contents of the CD-ROM that accompanies this book.

What You Need
To make the best use of this book and XML, you need:

✦ A PC running Windows 95, Windows 98, or Windows NT
✦ Internet Explorer 5.0
✦ A Java 1.1 or later virtual machine


Any system that can run Windows will suffice. In this book, I mostly assume you’re
using Windows 95 or NT 4.0 or later. As a longtime Mac and Unix user, I somewhat
regret this. Like Java, XML is supposed to be platform independent. Also like Java,
the reality is somewhat short of the hype. Although XML code is pure text that can
be written with any editor, many of the tools are currently only available on
Windows.
However, although there aren’t many Unix or Macintosh native XML programs,
there are an increasing number of XML programs written in Java. If you have a Java
1.1 or later virtual machine on your platform of choice, you should be able to make
do. Even if you can’t load your XML documents directly into a Web browser, you
can still convert them to HTML documents and view those. When Mozilla is released,
it should provide the best XML browser yet across multiple platforms.

How to Use This Book
This book is designed to be read more or less cover to cover. Each chapter builds
on the material in the previous chapters in a fairly predictable fashion. Of course,
you’re always welcome to skim over material that’s already familiar to you. I also
hope you’ll stop along the way to try out some of the examples and to write some
XML documents of your own. It’s important to learn not just by reading, but also by
doing. Before you get started, I’d like to make a couple of notes about grammatical
conventions used in this book.
Unlike HTML, XML is case sensitive. <father> is not the same as <Father> or
<FATHER>. The father element is not the same as the Father element or the
FATHER element. Unfortunately, case-sensitive markup languages have an annoying
habit of conflicting with standard English usage. On rare occasion this means
that you may encounter sentences that don’t begin with a capital letter. More
commonly, you’ll see capitalization used in the middle of a sentence where you
wouldn’t normally expect it. Please don’t get too bothered by this. All XML and
HTML code used in this book is placed in a monospaced font, so most of the time
it will be obvious from the context what is meant.
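
One practical consequence of case sensitivity is that a start tag and an end tag that differ only in capitalization do not match. The sketch below is an assumption-laden illustration rather than anything from the book: the helper name is_well_formed and the sample content are invented, and Python's standard xml.etree.ElementTree parser stands in for whatever tool you use to check documents.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Start and end tags agree in case: the parser accepts the document.
matched = is_well_formed("<father>Sam</father>")

# The end tag differs only in capitalization: the parser rejects it,
# because father and Father are different element names.
mismatched = is_well_formed("<father>Sam</Father>")

print(matched, mismatched)
```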
I have also adopted the British convention of only placing punctuation inside quote
marks when it belongs with the material quoted. Frankly, although I learned to write
in the American educational system, I find the British system is far more logical,
especially when dealing with source code where the difference between a comma
or a period and no punctuation at all can make the difference between perfectly
correct and perfectly incorrect code.

What the Icons Mean
Throughout the book, I’ve used icons in the left margin to call your attention to
points that are particularly important.

Note

Note icons provide supplemental information about the subject at hand, but
generally something that isn’t quite the main idea. Notes are often used to elaborate
on a detailed technical point.

Tip

Tip icons indicate a more efficient way of doing something, or a technique that
may not be obvious.

On the CD-ROM

CD-ROM icons tell you that software discussed in the book is available on the
companion CD-ROM. This icon also tells you if a longer example, discussed but
not included in its entirety in the book, is on the CD-ROM.

Caution

Caution icons warn you of a common misconception or that a procedure doesn’t
always work quite like it’s supposed to. The most common purpose of a Caution
icon in this book is to point out the difference between what a specification says
should happen, and what actually does.

Cross-Reference

The Cross Reference icon refers you to other chapters that have more to say about
a particular subject.

About the Companion CD-ROM
The inside back cover of this book contains a CD-ROM that holds all numbered
code listings that you’ll find in the text. It also includes many longer examples that
couldn’t fit into this book. The CD-ROM also contains the complete text of various
XML specifications in HTML. (Some of the specifications will be in other formats as
well.) Finally, you will find an assortment of useful software for working with XML
documents. Many (though not all) of these programs are written in Java, so they’ll
run on any system with a reasonably compatible Java 1.1 or later virtual machine.
Most of the programs that aren’t written in Java are designed for Windows 95, 98,
and NT.
For a complete description of the CD-ROM contents, you can read Appendix C. In
addition, to get a complete description of what is on the CD-ROM, you can load the
file index.html into your Web browser. The files on the companion CD-ROM are not
compressed, so you can access them directly from the CD.


Reach Out
The publisher and I want your feedback. After you have had a chance to use this
book, please take a moment to complete the IDG Books Worldwide Registration
Card (in the back of the book). Please be honest in your evaluation. If you thought a
particular chapter didn’t tell you enough, let me know. Of course, I would prefer to
receive comments like: “This is the best book I’ve ever read”, “Thanks to this book,
my Web site won Cool Site of the Year”, or “When I was reading this book on the
beach, I was besieged by models who thought I was super cool”, but I’ll take any
comments I can get :-).
Feel free to send me specific questions regarding the material in this book. I’ll do
my best to help you out and answer your questions, but I can’t guarantee a reply.
The best way to reach me is by email:

[email protected]
Also, I invite you to visit my Cafe con Leche Web site at
http://metalab.unc.edu/xml/, which contains a lot of XML-related material and is
updated almost daily. Despite my persistent efforts to make this book perfect, some
errors have doubtless slipped by. Even more certainly, some of the material discussed here
will change over time. I’ll post any necessary updates and errata on my Web site at
http://metalab.unc.edu/xml/books/bible/. Please let me know via email of
any errors that you find that aren’t already listed.
Elliotte Rusty Harold
[email protected]
http://metalab.unc.edu/xml/
New York City, June 1999

Acknowledgments
The fo lks at IDG have all been great. The ac quisitio ns edito r, Jo hn Osbo rn, deserves
spec ial thanks fo r arranging the unusual sc heduling this bo o k required to hit the
mo ving target XML presents. Terri Varveris shepherded this bo o k thro ugh the
develo pment pro c ess. With po ise and grac e, she managed the c o nstantly shifting
o utline and sc hedule that a bo o k based o n unstable spec ific atio ns and so ftware
requires. Amy Eo ff c o rrec ted many o f my grammatic al sho rtc o mings. Susan Parini
and Ritc hie Durdin, the pro duc tio n c o o rdinato rs, also deserve spec ial thanks fo r
managing the pro duc tio n o f this bo o k and fo r dealing with last-minute figure
c hanges.
Steven Champeo n bro ught his SGML experienc e to the bo o k, and pro vided many
insightful c o mments o n the text. My bro ther Tho mas Haro ld put his c o mmand
o f c hemistry at my dispo sal when I was trying to grasp the Chemic al Markup
Language. Carro ll Bellau pro vided me with parts o f my family tree, whic h yo u’ll
find in Chapter 17.
I also greatly apprec iate all the c o mments, questio ns, and c o rrec tio ns sent in by
readers o f my previo us bo o k, XML: Exte nsible Markup Language . I ho pe that I’ve
managed to address mo st o f tho se c o mments in this bo o k. They’ve definitely
helped make XML Bible a better bo o k. Partic ular thanks are due to Alan Esenther
and Do nald Lanc o n Jr. fo r their espec ially detailed c o mments.
WandaJane Phillips wrote the original version of Chapter 21 on CDF that is adapted
here. Heather Williamson, in addition to performing yeoman-like service as technical
editor, wrote Chapter 13, CSS Level 2, and parts of Chapters 18, 19, and 22. Her help
was instrumental in my almost meeting my deadline. (Blame for this almost
rests on my shoulders, not theirs.) I would also like to thank Piroz Mohseni, who
likewise served as a technical editor for this book.
The agenting talents of David and Sherry Rogelberg of the Studio B Literary Agency
(http://www.studiob.com/) have made it possible for me to write more or less
full-time. I recommend them highly to anyone thinking about writing computer
books. And as always, thanks go to my wife Beth for her endless love and
understanding.

Contents at a Glance
Preface ........ ix
Acknowledgments ........ xvii

Part I: Introducing XML ........ 1
Chapter 1: An Eagle’s Eye View of XML ........ 3
Chapter 2: An Introduction to XML Applications ........ 17
Chapter 3: Your First XML Document ........ 49
Chapter 4: Structuring Data ........ 59
Chapter 5: Attributes, Empty Tags, and XSL ........ 95
Chapter 6: Well-Formed XML Documents ........ 133
Chapter 7: Foreign Languages and Non-Roman Text ........ 161

Part II: Document Type Definitions ........ 189
Chapter 8: Document Type Definitions and Validity ........ 191
Chapter 9: Entities and External DTD Subsets ........ 247
Chapter 10: Attribute Declarations in DTDs ........ 283
Chapter 11: Embedding Non-XML Data ........ 307

Part III: Style Languages ........ 321
Chapter 12: Cascading Style Sheets Level 1 ........ 323
Chapter 13: Cascading Style Sheets Level 2 ........ 389
Chapter 14: XSL Transformations ........ 433
Chapter 15: XSL Formatting Objects ........ 513

Part IV: Supplemental Technologies ........ 569
Chapter 16: XLinks ........ 571
Chapter 17: XPointers ........ 591
Chapter 18: Namespaces ........ 617
Chapter 19: The Resource Description Framework ........ 631

Part V: XML Applications ........ 655
Chapter 20: Reading Document Type Definitions ........ 657
Chapter 21: Pushing Web Sites with CDF ........ 775
Chapter 22: The Vector Markup Language ........ 805
Chapter 23: Designing a New XML Application ........ 833

Appendix A: XML Reference Material ........ 863
Appendix B: The XML 1.0 Specification ........ 921
Appendix C: What’s on the CD-ROM ........ 971
Index ........ 975
End-User License Agreement ........ 1018
CD-ROM Installation Instructions ........ 1022

Contents
Preface ........ ix
Acknowledgments ........ xvii

Part I: Introducing XML ........ 1

Chapter 1: An Eagle’s Eye View of XML ........ 3
What Is XML? ........ 3
XML Is a Meta-Markup Language ........ 3
XML Describes Structure and Semantics, Not Formatting ........ 4
Why Are Developers Excited about XML? ........ 6
Design of Domain-Specific Markup Languages ........ 6
Self-Describing Data ........ 6
Interchange of Data Among Applications ........ 7
Structured and Integrated Data ........ 8
The Life of an XML Document ........ 8
Editors ........ 9
Parsers and Processors ........ 9
Browsers and Other Tools ........ 9
The Process Summarized ........ 10
Related Technologies ........ 10
Hypertext Markup Language ........ 10
Cascading Style Sheets ........ 11
Extensible Style Language ........ 12
URLs and URIs ........ 12
XLinks and XPointers ........ 13
The Unicode Character Set ........ 14
How the Technologies Fit Together ........ 14

Chapter 2: An Introduction to XML Applications ........ 17
What Is an XML Application? ........ 17
Chemical Markup Language ........ 18
Mathematical Markup Language ........ 19
Channel Definition Format ........ 22
Classic Literature ........ 22
Synchronized Multimedia Integration Language ........ 24
HTML+TIME ........ 25
Open Software Description ........ 26
Scalable Vector Graphics ........ 27
Vector Markup Language ........ 29
MusicML ........ 30
VoxML ........ 32
Open Financial Exchange ........ 34
Extensible Forms Description Language ........ 36
Human Resources Markup Language ........ 38
Resource Description Framework ........ 40
XML for XML ........ 42
XSL ........ 42
XLL ........ 43
DCD ........ 43
Behind-the-Scene Uses of XML ........ 44

Chapter 3: Your First XML Document ........ 49
Hello XML ........ 49
Creating a Simple XML Document ........ 50
Saving the XML File ........ 50
Loading the XML File into a Web Browser ........ 51
Exploring the Simple XML Document ........ 52
Assigning Meaning to XML Tags ........ 54
Writing a Style Sheet for an XML Document ........ 55
Attaching a Style Sheet to an XML Document ........ 56

Chapter 4: Structuring Data ........ 59
Examining the Data ........ 59
Batters ........ 60
Pitchers ........ 62
Organization of the XML Data ........ 62
XMLizing the Data ........ 65
Starting the Document: XML Declaration and Root Element ........ 65
XMLizing League, Division, and Team Data ........ 67
XMLizing Player Data ........ 69
XMLizing Player Statistics ........ 70
Putting the XML Document Back Together Again ........ 72
The Advantages of the XML Format ........ 80
Preparing a Style Sheet for Document Display ........ 81
Linking to a Style Sheet ........ 82
Assigning Style Rules to the Root Element ........ 84
Assigning Style Rules to Titles ........ 85
Assigning Style Rules to Player and Statistics Elements ........ 88
Summing Up ........ 89

Chapter 5: Attributes, Empty Tags, and XSL ........ 95
Attributes ........ 95
Attributes versus Elements ........ 101
Structured Meta-data ........ 102
Meta-Meta-Data ........ 105
What’s Your Meta-data Is Someone Else’s Data ........ 106
Elements Are More Extensible ........ 106
Good Times to Use Attributes ........ 107
Empty Tags ........ 108
XSL ........ 109
XSL Style Sheet Templates ........ 110
The Body of the Document ........ 111
The Title ........ 113
Leagues, Divisions, and Teams ........ 115
Players ........ 120
Separation of Pitchers and Batters ........ 122
CSS or XSL? ........ 130

Chapter 6: Well-Formed XML Documents ........ 133
#1: The XML declaration must begin the document ........ 144
#2: Use Both Start and End Tags in Non-Empty Tags ........ 144

Chapter 7: Foreign Languages and Non-Roman Text ........ 161
Non-Roman Scripts on the Web ........ 161
Scripts, Character Sets, Fonts, and Glyphs ........ 166
A Character Set for the Script ........ 166
A Font for the Character Set ........ 167
An Input Method for the Character Set ........ 167
Operating System and Application Software ........ 168
Legacy Character Sets ........ 169
The ASCII Character Set ........ 169
The ISO Character Sets ........ 172
The MacRoman Character Set ........ 175
The Windows ANSI Character Set ........ 176
The Unicode Character Set ........ 177
UTF-8 ........ 182
The Universal Character System ........ 182
How to Write XML in Unicode ........ 183
Inserting Characters in XML Files with Character References ........ 183
Converting to and from Unicode ........ 184
How to Write XML in Other Character Sets ........ 185

Part II: Document Type Definitions ........ 189

Chapter 8: Document Type Definitions and Validity ........ 191
Document Type Definitions ........ 191
Document Type Declarations ........ 192
Validating Against a DTD ........ 195
Listing the Elements ........ 201
Element Declarations ........ 208
ANY ........ 209
#PCDATA ........ 209
Child Lists ........ 212
Sequences ........ 214
One or More Children ........ 215
