XML Documents
Generic markup was originally designed for documents, such as technical manuals, books, and articles. XML is a direct descendant of the Standard Generalized Markup Language [SGML], which was released in 1986; SGML, in turn, was a descendant of the Generalized Markup Language (GML), developed by Charles Goldfarb, Ed Mosher, and Ray Lorie at IBM in the early 1970s, initially for tagging documents in the legal department.[1] This kind of in-line markup traces its way back further to formatting codes for typesetting machines and on to editorial marks on paper copy.
[1] Many people believe that GML stands not for Generalized Markup Language but for Goldfarb, Mosher, and Lorie.
Over time, document markup has become increasingly generic: Codes for type styles, such as "italics," have given way to more general codes, such as "title," that say what text represents rather than how it should be formatted. This pattern occurred not only with XML but also with other document languages, such as TeX and TROFF, which implemented high-level macro packages, such as LaTeX and MS, to hide low-level formatting codes. For example, a LaTeX document often contains no low-level formatting code at all, as in Listing 3-1.
Listing 3-1. LaTeX Markup
\documentclass{article}
\title{Sample document}
\author{David Megginson}
\begin{document}
\maketitle
This is a simple LaTeX document.
\end{document}
That example is not, functionally, much different from a similar document in XML, as in Listing 3-2.
Listing 3-2. XML Markup
Behind the scenes, however, the LaTeX example hides the formatting code inside macro definitions, whereas the XML example has no direct link to formatting at all. Even so, the ideas behind XML and SGML are familiar to people in computer technology, math, and science, who have been working with formats like LaTeX for many years. The fact that HTML was inspired by, but not initially based on, SGML lexical conventions also smoothed the introduction of XML into the documentation world.
It is XML's document origin that explains specialized syntactic features, such as mixed content and CDATA sections, that seem to make computer processing of XML more difficult than it should be, especially in terms of whitespace handling. Although these features cause technical problems, they exist to allow XML to work with human-readable, publishable information, such as books and articles. For machine-readable data (see Chapter 4), simple lists and tables are usually sufficient, as in Listing 3-3.
Listing 3-3. Sample XML Data, Without Mixed Content
This example has a clear distinction between markup and content: Every element contains either text or other XML elements but never both. Document-oriented XML, on the other hand, tends to be messier, as in Listing 3-4.
Listing 3-4. Sample XML Document with Mixed Content
the rest of the world was already preparing for
II
This second example has no clear distinction: The content of the para element consists of both text and other elements mixed together. The presence of this kind of mixed content is a strong indication that an XML file is intended as a document rather than as a data collection.
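The distinction between the two styles can be checked mechanically. The following is a minimal sketch in Python, using the standard library's xml.etree.ElementTree, of one way to test whether a file contains mixed content. The function names (has_mixed_content, looks_like_document) and both sample fragments are hypothetical illustrations, not taken from the listings above, and the sketch assumes that whitespace-only text between elements is indentation rather than content, which is itself a judgment call of the kind that makes whitespace handling awkward.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(element):
    """Return True if the element holds child elements AND non-whitespace text."""
    children = list(element)
    if not children:
        # Text-only (or empty) elements are never mixed content.
        return False
    # Text that appears before the first child element.
    if element.text and element.text.strip():
        return True
    # Text that appears after any child element (stored as the child's tail).
    return any(child.tail and child.tail.strip() for child in children)


def looks_like_document(root):
    """Heuristic: any mixed content anywhere suggests document-oriented XML."""
    return any(has_mixed_content(el) for el in root.iter())


# Hypothetical data-oriented sample: elements hold text or elements, never both.
DATA_SAMPLE = """<people>
  <person>
    <name>Alice</name>
    <city>Ottawa</city>
  </person>
</people>"""

# Hypothetical document-oriented sample: text and elements are interleaved.
DOC_SAMPLE = (
    "<para>Read the <emphasis>whole</emphasis> manual "
    "before you <term>begin</term>.</para>"
)

if __name__ == "__main__":
    print(looks_like_document(ET.fromstring(DATA_SAMPLE)))  # False
    print(looks_like_document(ET.fromstring(DOC_SAMPLE)))   # True
```

The check is only a heuristic: a data file can contain an occasional free-text field with inline markup, so a positive result is a strong hint rather than proof that the file is a document.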
These days, XML-encoded documentation is about as common as SGML or LaTeX documentation was before it (people use XML mainly in large, complex technical documentation systems or small, private research projects), but documents are no longer the main use for generic markup. Interest in using XML to exchange data and to set up distributed computing (Chapter 5, XML networking) now far exceeds any interest in XML for documentation. Many of the initial XML document-oriented specifications (XLink [XLINK], XPointer [XPOINTER], and XSL-FO [XSL-FO]) now either languish with few users or have been co-opted for use with data or networking (XSLT [XSLT], XPath [XPath]), whereas data- or networking-oriented specifications keep on appearing. The world appears to be satisfied with the Hypertext Markup Language [HTML] for online documentation and Microsoft Word for print and is not eager to embrace XML with all its extra complexity.
Because of that extra complexity, XML is not a general-purpose solution for all documentation projects. In some situations, however, using XML for documents makes a lot of sense, particularly when you need to publish and republish large amounts of technical information in multiple formats, combine human-written material with information from databases, or customize publications for individual recipients. This chapter examines both the advantages and the disadvantages of XML documents and introduces some of the special issues involved with XML publishing.