THE XML REVOLUTION  -  TECHNOLOGIES FOR THE FUTURE WEB

A concrete view of XML

An XML document is a (Unicode) text with markup tags and other meta-information.

Markup tags denote elements:

  ...<foo attr="val" ...>...</foo>...
     |    |               | |
     |    |               | a matching element end tag
     |    |               the contents of the element
     |    an attribute with name attr and value val, values enclosed by ' or "
     an element start tag with name foo

There is a short-hand notation for empty elements: ...<foo attr="val".../>...
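
The parts labelled in the diagram can be inspected with a parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the sample document, including the wrapping root element, is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one foo element with contents, one in the
# empty-element short-hand. Attribute values may use " or '.
doc = '<root><foo attr="val">some contents</foo><foo attr=\'val\'/></root>'

root = ET.fromstring(doc)
first, second = root.findall("foo")

print(first.tag)      # the element name
print(first.attrib)   # the attributes, as a dict of name -> value
print(first.text)     # the contents of the element
print(second.attrib)  # same attribute, written with single quotes
print(second.text)    # the empty element has no contents
```

Under this parser, the short-hand form <foo attr='val'/> yields the same kind of element as a start tag immediately followed by an end tag.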

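An XML document must be well-formed: start and end tags must match, elements must be properly nested, and there must be exactly one root element.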

Note: XML is case sensitive!
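
A parser rejects documents that are not well-formed. As a rough sketch (the helper name is_well_formed and the sample strings are made up for illustration), Python's standard xml.etree.ElementTree can be used as a checker:

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Hypothetical samples, one per kind of mistake.
samples = {
    "matched and nested": "<a><b>text</b></a>",
    "improperly nested": "<a><b>text</a></b>",
    "missing end tag": "<a><b>text</b>",
    "case mismatch": "<foo>text</FOO>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(label, ok)
```

The last sample fails because names are case sensitive: </FOO> does not match <foo>.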

Special characters can be escaped using Unicode character references: for example, &#60; (decimal) and &#x3C; (hexadecimal) both denote <.
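
To see escaping in both directions, here is a small sketch using Python's standard library. The sample text is made up; note that &lt; and &amp; are predefined entity references rather than numeric character references, but a parser resolves them in the same way.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Parsing: decimal, hexadecimal, and predefined references are all
# replaced by the characters they denote.
elem = ET.fromstring("<m>1 &#60; 2 &#x26; 3 &lt; 4</m>")
print(elem.text)

# Writing: escape() replaces &, < and > so that the text can be
# placed safely inside an element.
escaped = escape("1 < 2 & 3")
print(escaped)
```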

CDATA sections are an alternative to escaping many characters: everything between <![CDATA[ and ]]> is treated as literal character data. The strange syntax is a legacy from SGML...
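
As an illustration, the sketch below parses the same (made-up) contents once with escaped characters and once inside a CDATA section, using Python's standard xml.etree.ElementTree; both forms should give the application identical text.

```python
import xml.etree.ElementTree as ET

# Each special character escaped individually.
escaped = ET.fromstring("<code>if (a &lt; b &amp;&amp; b &lt; c) {}</code>")

# The same contents wrapped in a CDATA section, with no escaping.
cdata = ET.fromstring("<code><![CDATA[if (a < b && b < c) {}]]></code>")

print(escaped.text)
print(cdata.text)
print(escaped.text == cdata.text)
```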

White-space (blanks, newlines, etc.) is used both for indentation and as actual contents. (The xml:space attribute provides some control.)
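
Because a parser cannot tell indentation from contents, it hands all white-space on to the application. A minimal sketch with Python's standard xml.etree.ElementTree, on a made-up document:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the line element is indented by two blanks,
# and its contents also start with two blanks.
doc = """<poem>
  <line>  two leading blanks</line>
</poem>"""

root = ET.fromstring(doc)
line = root[0]

print(repr(root.text))  # indentation before <line>, kept as text
print(repr(line.text))  # leading blanks inside the element, kept too
print(repr(line.tail))  # the newline after </line>
```

It is up to the application to decide which of these strings are mere layout.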

Other meta-information:

<?target data...?>
  a processing instruction: target identifies the processor to which it is directed, and data is a string containing the instruction
<!-- comment -->
  a comment, which is ignored by all processors
<!DOCTYPE ...>
  a document type declaration (described later...)
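
Processing instructions and comments can be observed with a DOM parser. The sketch below uses Python's standard xml.dom.minidom; the xml-stylesheet instruction and the file name style.css are only illustrative.

```python
from xml.dom import minidom

# Hypothetical document with a processing instruction, a comment,
# and an empty root element.
doc = minidom.parseString(
    '<?xml-stylesheet href="style.css" type="text/css"?>'
    '<!-- a comment -->'
    '<root/>'
)

found = []
for node in doc.childNodes:
    if node.nodeType == minidom.Node.PROCESSING_INSTRUCTION_NODE:
        # target names the intended processor; data is the instruction.
        found.append(("pi", node.target, node.data))
    elif node.nodeType == minidom.Node.COMMENT_NODE:
        found.append(("comment", node.data))
    elif node.nodeType == minidom.Node.ELEMENT_NODE:
        found.append(("element", node.tagName))

for entry in found:
    print(entry)
```

The DOM keeps the comment as a node, so it is the consuming application that chooses to skip it.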

COPYRIGHT © 2000-2003 ANDERS MØLLER & MICHAEL I. SCHWARTZBACH