3.3.2. The web author's perspective

Apart from the way some web browsers display the XML declaration, there is hardly any difference between HTML and XHTML from the web author's point of view. The major difference is the requirement that XHTML web pages must be at least well-formed XML documents. Beyond that, XHTML pages should also be valid XML documents. What does this mean?

An XML document is well-formed if it is structured according to the rules defined in Section 2.1 of the XML 1.0 Recommendation (see [25]). Among other things, this definition states that elements, delimited by their start and end tags, are nested properly within one another. HTML pages should meet this rule as well. If your XHTML pages are not well-formed, however, you must expect that XML-aware browsers will not process them properly or will reject them altogether.

"An XML document is valid if it has an associated document type declaration and if the document complies with the constraints expressed in it. [...] The XML document type declaration contains or points to markup declarations that provide a grammar for a class of documents. This grammar is known as a document type definition, or DTD." [25] So why would you claim that your XHTML pages use the grammar of an XHTML DTD if they do not comply with it? Your XHTML pages should therefore be valid XML documents as well. A valid XML document is always well-formed, but a well-formed XML document is not necessarily valid.

Therefore, we recommend that your XHTML pages always be valid XML documents.
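
The well-formedness requirement described above can be tried out with any XML parser. The following is a minimal sketch in Python, assuming the standard library module xml.etree.ElementTree; the function name is_well_formed and the two sample pages are hypothetical and chosen only for illustration. Note that this parser is non-validating: it reports whether a page is well-formed, but it does not check the page against an XHTML DTD, so it is no substitute for a validator.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as a well-formed XML document."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        # Raised for unclosed elements, improperly nested tags, and so on.
        return False


# Hypothetical sample: every element is closed and properly nested.
GOOD_PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Sample</title></head>"
    "<body><p>Hello<br/>world</p></body>"
    "</html>"
)

# Hypothetical sample: <br> is never closed, which is common in HTML
# but not allowed in XML.
BAD_PAGE = "<html><body><p>Hello<br>world</p></body></html>"

if __name__ == "__main__":
    print("good page well-formed:", is_well_formed(GOOD_PAGE))
    print("bad page well-formed:", is_well_formed(BAD_PAGE))
```

A page that passes this check may still be invalid, for example if it uses an element that the XHTML DTD does not declare. Checking validity requires a validating parser that reads the DTD.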

Publishing content in XHTML should be a smooth migration path for HTML authors. If your HTML pages are already well-formed, meeting the validity constraints should be only a minor step. You need neither special authoring tools nor a completely new set of elements and attributes. To meet the validity constraints, you need to introduce a validity check into your publishing process. This check does not depend on any browser: you can use any XML validator, and validators can process your XHTML pages either locally or remotely. See the chapter "Testing" for validators and testing tools that you might use to test your pages.

What is the advantage of publishing your web content in XHTML? If you want to publish your web site in several document types, you need one base format from which the other formats can be created. XSL Transformations (XSLT, see [26]) provides a technique for defining filters and reformatting rules that are applied to XML documents. Since XHTML pages are XML documents, XSLT style sheets can be applied to them, so XHTML can be the base format for your web site. This holds true as long as your XHTML pages contain all the information needed for all the other formats. In that case you do not need to reformat your content for fully featured web browsers, because they can render XHTML pages directly. Otherwise, you need another base format, such as XML.
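
The idea of using XHTML as the base format for other formats can be illustrated with a small sketch. The example below is not XSLT; it is a stand-in written in Python with the standard library module xml.etree.ElementTree, and it only shows that a well-formed XHTML page can be read by a generic XML tool and turned into another format, here plain text. The sample page, the function name xhtml_to_text, and the chosen output layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of XHTML elements, in the notation ElementTree uses for tags.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical sample page used as the base format.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Sample page</title></head>
  <body>
    <h1>News</h1>
    <p>First <em>short</em> paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</html>"""


def xhtml_to_text(markup: str) -> str:
    """Derive a plain-text rendition from a well-formed XHTML page."""
    root = ET.fromstring(markup)
    lines = []

    # The page title becomes an upper-case first line.
    title = root.find(f"{XHTML_NS}head/{XHTML_NS}title")
    if title is not None and title.text:
        lines.append(title.text.strip().upper())

    body = root.find(f"{XHTML_NS}body")
    if body is not None:
        for element in body:
            # Collect the text of the element and its children,
            # collapsing runs of whitespace into single spaces.
            text = " ".join("".join(element.itertext()).split())
            if element.tag == f"{XHTML_NS}h1":
                lines.append("# " + text)
            elif element.tag == f"{XHTML_NS}p":
                lines.append(text)
    return "\n".join(lines)


if __name__ == "__main__":
    print(xhtml_to_text(PAGE))
```

An XSLT style sheet would express the same mapping declaratively as template rules. The sketch also shows the limitation mentioned above: only information that is actually present in the XHTML page can appear in the derived format.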

Copyright © 2001-2003 by Rainer Hillebrand and Thomas Wierlemann