Writing XML

Document Type Definition

A Document Type Definition (DTD) is a formal grammar that describes the markup (elements) available in a specific type of document. A document refers to its DTD through a special declaration, the document type declaration (<!DOCTYPE ...>). Developing a DTD can be a complex task, and as such it is a job best suited to specialists.

However, XML was designed so that it can be used either with or without a DTD. In a DTD-less XML application, markup can be invented without being formally defined. A DTD-less file in effect defines its own markup informally, by the mere existence and location of elements where they are used. This means that an XML-enabled browser has no DTD to tell it what to expect, so it must be able to work out the structure of the document as it reads it. For that to be possible, the document must be well-formed.
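As a rough illustration of how software reads a DTD-less document, the sketch below uses Python's standard xml.etree.ElementTree parser. Python is an assumption of this example; the tutorial does not prescribe a tool. The element names SESSION, CLIENT, and NOTES are invented for the illustration and are declared nowhere; the parser discovers them simply by reading the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD-less document: none of these element names is declared anywhere.
DOCUMENT = """<?xml version="1.0"?>
<SESSION>
<CLIENT>Jane Doe</CLIENT>
<NOTES>Reports improved sleep</NOTES>
</SESSION>"""


def list_elements(xml_text):
    """Return (tag, text) pairs in document order for a well-formed document."""
    root = ET.fromstring(xml_text)
    return [(element.tag, (element.text or "").strip()) for element in root.iter()]


if __name__ == "__main__":
    for tag, text in list_elements(DOCUMENT):
        print(tag, repr(text))
```

The parser needs no prior knowledge of the markup, but it can only do this because the document follows the well-formedness rules.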


Well-formed XML

All XML documents, whether they use a DTD or not, must be well-formed. The following is a simple example of well-formed XML. A single outermost (root) element, <PATIENT>, contains everything else; XML allows only one root element per document.

<?xml version="1.0"?>
<PATIENT>
<NAME>John Smith</NAME>
<SEX>Male</SEX>
<AGE>34</AGE>
<COMPLAINT>Frequent headaches and hears voices</COMPLAINT>
<HR/>
</PATIENT>

To have well-formed XML, the code must adhere to the following rules:

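  • Start with the XML declaration.

    In the above example, the first line is the XML declaration, which states that this is XML 1.0. The declaration is optional in XML 1.0 but recommended, and when it is present it must come before anything else in the file.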

  • Tags must be balanced: every element that contains content must have both an opening tag and a closing tag.

    The information "John Smith" is enclosed by the opening tag <NAME> and the closing tag </NAME>. Tag names are case-sensitive, so <NAME> and </name> would not match.

  • Child markup must nest completely inside parent markup.

    In the example, <NAME></NAME>, <SEX></SEX>, <AGE></AGE>, and <COMPLAINT></COMPLAINT> are all children of the parent <PATIENT></PATIENT>. The child tags and the information they contain are entirely enclosed within the opening and closing tags of their parent.

  • Empty tags (markup that has no content and, in HTML, no closing tag, such as HTML's <IMG>, <HR>, and <BR>) must either end with '/>' or be written as a start tag followed immediately by an end tag, as in <HR></HR>.

    The <HR/> in the example is an empty tag.
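The rules above can be tried out with a small check. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree parser (neither is part of the tutorial); the helper name is_well_formed and the short test fragments are invented for the illustration. The parser raises an error whenever a fragment breaks one of the rules.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the standard-library parser accepts xml_text, else False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical fragments, each obeying or breaking one rule from the list above.
BALANCED = "<PATIENT><NAME>John Smith</NAME><HR/></PATIENT>"
UNBALANCED = "<PATIENT><NAME>John Smith</PATIENT>"  # <NAME> is never closed
CASE_MISMATCH = "<PATIENT><NAME>John Smith</name></PATIENT>"  # </name> is not </NAME>
BADLY_NESTED = "<PATIENT><NAME>John <SEX>Smith</NAME></SEX></PATIENT>"
EMPTY_NOT_CLOSED = "<PATIENT><HR></PATIENT>"  # HTML-style <HR> without '/>'

if __name__ == "__main__":
    samples = [
        ("balanced", BALANCED),
        ("unbalanced", UNBALANCED),
        ("case mismatch", CASE_MISMATCH),
        ("badly nested", BADLY_NESTED),
        ("empty tag not closed", EMPTY_NOT_CLOSED),
    ]
    for label, text in samples:
        print(label, is_well_formed(text))
```

Only the first fragment is accepted. Note that none of the fragments carries an XML declaration; the parser accepts the first one anyway because the declaration is optional in XML 1.0.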

Valid XML

The well-formed XML example above is about as simple as XML gets. To control which elements may appear in a document, and in what order, valid XML is required.

A valid XML file has a DTD and adheres to it. Every valid XML file must also be well-formed. The following is the earlier example written as valid XML.

<?xml version="1.0"?>
<!DOCTYPE PATIENT [
<!ELEMENT PATIENT (NAME, SEX, AGE, COMPLAINT, HR)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT SEX (#PCDATA)>
<!ELEMENT AGE (#PCDATA)>
<!ELEMENT COMPLAINT (#PCDATA)>
<!ELEMENT HR EMPTY>
]>
<PATIENT>
<NAME>John Smith</NAME>
<SEX>Male</SEX>
<AGE>34</AGE>
<COMPLAINT>Frequent headaches and hears voices</COMPLAINT>
<HR/>
</PATIENT>

This example uses an internal DTD. DTDs for valid XML files can be internal, external, or both. If there are both internal and external DTDs, the internal one will be processed first.

The first line in the valid XML example is the same as in the well-formed example. The lines from the second line through the closing ]> are new. This is the internal DTD, where the formal definition of all markup in the document is established.

  • The Document Type Definition declaration

    The <!DOCTYPE PATIENT opens the document type declaration and names the document type "PATIENT". For the document to be valid, this name must be the same as the root element of the document. The bracket [ opens the section in which the elements of the DTD are declared.

  • The Element definitions

    The next six lines declare 'elements.' The first element, PATIENT, is a parent whose children are listed, in the order in which they must appear, inside parentheses: (NAME, SEX, AGE, COMPLAINT, HR). Because XML names are case-sensitive, the names in the DTD must match the tags in the document exactly.

    The next four lines define the first four of PATIENT's children. Each will contain only plain character data, and this is indicated with the (#PCDATA) notation.

    The next line defines the element HR as EMPTY, which means that it contains no content and is written as the empty tag <HR/>.

  • The DTD closing

    To indicate that the DTD definition is complete, it must be closed with a closing bracket ], followed by > to end the document type declaration.

  • The XML document content

    The remainder of the file is the XML document as previously discussed.
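To see the parts of an internal DTD from a program's point of view, here is a minimal sketch, assuming Python and its standard xml.parsers.expat module (an assumption of this example, not something the tutorial uses). It works on its own copy of the patient example, written so that the DOCTYPE name matches the root element and the declared names match the tags exactly, as validity requires. The function name summarize_dtd is invented for the illustration. Python's standard-library parsers check well-formedness only and do not validate against a DTD, so this sketch merely reads the declarations and compares names; it is not a validator.

```python
import xml.parsers.expat

VALID_DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE PATIENT [
<!ELEMENT PATIENT (NAME, SEX, AGE, COMPLAINT, HR)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT SEX (#PCDATA)>
<!ELEMENT AGE (#PCDATA)>
<!ELEMENT COMPLAINT (#PCDATA)>
<!ELEMENT HR EMPTY>
]>
<PATIENT>
<NAME>John Smith</NAME>
<SEX>Male</SEX>
<AGE>34</AGE>
<COMPLAINT>Frequent headaches and hears voices</COMPLAINT>
<HR/>
</PATIENT>"""


def summarize_dtd(xml_text):
    """Collect the DOCTYPE name, declared element names, and tags actually used."""
    summary = {"doctype": None, "declared": [], "used": []}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        summary["doctype"] = name

    def on_element_decl(name, model):
        # Called once per <!ELEMENT ...> line; the content model is ignored here.
        summary["declared"].append(name)

    def on_start_element(name, attributes):
        summary["used"].append(name)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.ElementDeclHandler = on_element_decl
    parser.StartElementHandler = on_start_element
    parser.Parse(xml_text, True)

    # The first start tag seen is the root element.
    summary["root"] = summary["used"][0]
    summary["undeclared"] = [
        name for name in summary["used"] if name not in summary["declared"]
    ]
    return summary


if __name__ == "__main__":
    result = summarize_dtd(VALID_DOCUMENT)
    print("DOCTYPE name:", result["doctype"])
    print("Root element:", result["root"])
    print("Declared:", result["declared"])
    print("Used but not declared:", result["undeclared"])
```

A name mismatch between the DOCTYPE and the root element, or a tag that appears in the "used but not declared" list, would mean the document is not valid even though it may still be well-formed. Checking the order of children against the content model requires a validating parser.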

That's Not All

There is much more to know about writing XML documents. This tutorial is not meant to be a comprehensive lesson in the full capabilities of the language, but rather an introduction to the basics of XML. To learn more of the complexities and details of writing and using XML, consult the resources page for places to get more information.


