So what is it, really?

From a concrete point of view

An XML document is a (Unicode) text with markup tags and other meta-information.

Markup tags denote elements:
  ...<bla attr="val" ...>...</bla>...
     |    |               | |
     |    |               | a matching element end tag
     |    |               the contents of the element
     |    an attribute with name attr and value val, values enclosed by ' or "
     an element start tag with name bla

Short-hand notation for empty elements: ...<bla attr="val".../>...

(Note: XML is case sensitive!)
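The parts of an element can be inspected with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree module, which is an assumption of this example and not something the slides prescribe; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Parse one element with an attribute and some character data.
elem = ET.fromstring('<bla attr="val">some contents</bla>')
name = elem.tag            # the element name
value = elem.get("attr")   # the attribute value
contents = elem.text       # the contents of the element

# The short-hand form and an explicit start/end tag pair
# describe the same empty element; ' and " are interchangeable.
short = ET.fromstring("<bla attr='val'/>")
full = ET.fromstring('<bla attr="val"></bla>')
same_empty = (short.tag, short.attrib, short.text) == (full.tag, full.attrib, full.text)

# Names are case sensitive, so <Bla> cannot be closed by </bla>.
try:
    ET.fromstring("<Bla></bla>")
    case_mismatch_rejected = False
except ET.ParseError:
    case_mismatch_rejected = True
```

Here the parser reports the name bla, the value val, and the text between the tags, and it refuses the document whose start and end tags differ only in case.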

Well-formed documents:

Special characters can be escaped using Unicode character references or the predefined entity references, for example:
    &#38; = &       &#60; = &lt; = <
CDATA Sections are an alternative to escaping in character data, example:
    <![CDATA[<greeting>Hello, world!</greeting>]]>
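To see that both mechanisms yield plain character data, the following sketch parses each form with Python's standard xml.etree.ElementTree module (chosen here for illustration; the wrapper element name t is hypothetical).

```python
import xml.etree.ElementTree as ET

# Character references and the predefined entity &lt; decode to plain characters.
escaped = ET.fromstring("<t>&#38; &#60; &lt;</t>")
decoded = escaped.text

# Inside a CDATA section, markup characters are taken literally:
# <greeting> is text here, not a child element.
cdata = ET.fromstring("<t><![CDATA[<greeting>Hello, world!</greeting>]]></t>")
literal = cdata.text
child_count = len(cdata)

# The same text written with escapes instead of a CDATA section.
by_escaping = ET.fromstring("<t>&lt;greeting&gt;Hello, world!&lt;/greeting&gt;</t>").text
```

After parsing, the CDATA version and the escaped version carry identical text, and neither contains a greeting element.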

Note: Parsing is trivial! (in principle)
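In practice one rarely writes the parser by hand. As a minimal sketch, assuming Python's standard xml.etree.ElementTree module and a hypothetical helper named is_well_formed, a document can be checked simply by trying to parse it: two requirements of well-formedness are that start and end tags match and that elements are properly nested, and a conforming parser rejects text that violates them.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts text as a well-formed document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

ok = is_well_formed("<a><b/></a>")
mismatched = is_well_formed("<a><b></a>")        # <b> is never closed
overlapping = is_well_formed("<a><b></a></b>")   # elements are not properly nested
unescaped = is_well_formed("<a>1 < 2</a>")       # a bare < in character data
```

Only the first document is accepted; the other three raise a parse error and are reported as not well-formed.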