Intro to DocBook Publishing
If you're considering the DocBook format as a publishing solution, this article presents an overview of the concepts, strengths, and weaknesses of DocBook publishing.
DocBook was designed for technical publications, particularly for computer software, but is ideal for any application where structured documents must be displayed in HTML and/or other formats.
What is a structured document? A simple example of a structured document is a book. A book has a more-or-less standard structure, for example: cover page, table of contents, one or more chapters, and an index. Magazine articles, webpages, and many other documents we see every day have a standard structure. When you use DocBook format for authoring documents you work within a mental model of the structure of the document. Putting your documents into a formal structure requires some additional work and some learning of new concepts, but it allows you to easily reformat the document while preserving the meaning of the information within it. (See the Concepts section for more detail.)
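To make "reformat the document while preserving the meaning" concrete, here is a minimal Python sketch that renders one structured section two ways. The function names `to_html` and `to_text` and the sample content are hypothetical, chosen for illustration only. It is not how a real DocBook toolchain works, just a demonstration that explicit structure lets a program choose the presentation.

```python
import xml.etree.ElementTree as ET
from html import escape

# A tiny DocBook-style fragment (illustrative content only).
DOCBOOK = """\
<section>
  <title>Title Text</title>
  <para>A body paragraph.</para>
  <para>Another body paragraph.</para>
</section>
"""


def to_html(section):
    """Render a <section> element as a simple HTML fragment."""
    parts = ["<h2>" + escape(section.findtext("title", default="")) + "</h2>"]
    for para in section.findall("para"):
        parts.append("<p>" + escape(para.text or "") + "</p>")
    return "\n".join(parts)


def to_text(section):
    """Render the same <section> as plain text with an underlined title."""
    title = section.findtext("title", default="")
    lines = [title, "=" * len(title)]
    for para in section.findall("para"):
        lines.append("")
        lines.append(para.text or "")
    return "\n".join(lines)


section = ET.fromstring(DOCBOOK)
html_out = to_html(section)
text_out = to_text(section)
print(html_out)
print(text_out)
```

Because the title and paragraphs are labeled by role rather than by font, both renderings come from the same source without any guessing about which text is a heading.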
Example 1. Section Structure
Using a word processor you could create a portion of a document with the following process:
The result would be something that looks like this:
Using DocBook XML markup you would create a (portion of a) document that looks like this:
<section>
  <title>Title Text</title>
  <para>A body paragraph.</para>
  <para>Another body paragraph.</para>
</section>
The problem with the word processor approach is that when you save the document file, there is no information saying that the heading you just formatted as bold refers to the following two paragraphs. What if there is another paragraph after that? Is it in the section or is it something else altogether? After you save the file (and forget your original intent), you can't tell.
For example, when converting a document to a collection of web pages, you may want everything in one section to be on a single web page (because the contents are logically related), but you may not want to break pages at a bridgehead (a free-standing heading that does not start a new section). You may want sections to appear in the table of contents, but not bridgeheads. If all the word processor saves is which font was used and whether the text is bold, this information about the structure of your document is lost. If you've used MS-Word stylesheets, you have an understanding of the issues here. If you are really consistent in your use and naming of MS-Word stylesheets and you use names like "section", "abstract", "epigraph", etc., you have created the equivalent of a structured document using MS-Word.
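The section-versus-bridgehead distinction can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sample document, the function names `table_of_contents` and `pages`, and the "one page per top-level section" rule are all hypothetical, not the behavior of any particular DocBook tool.

```python
import xml.etree.ElementTree as ET

# Illustrative DocBook-style document: two sections, one containing a bridgehead.
DOC = """\
<article>
  <section>
    <title>Installing</title>
    <para>How to install.</para>
    <bridgehead>A note on permissions</bridgehead>
    <para>You may need administrator rights.</para>
  </section>
  <section>
    <title>Configuring</title>
    <para>How to configure.</para>
  </section>
</article>
"""


def table_of_contents(root):
    """Collect section titles only; bridgeheads are deliberately skipped."""
    return [s.findtext("title", default="") for s in root.iter("section")]


def pages(root):
    """Build one page per top-level section; a bridgehead never starts a page."""
    result = []
    for section in root.findall("section"):
        texts = [
            child.text or ""
            for child in section
            if child.tag in ("title", "bridgehead", "para")
        ]
        result.append(texts)
    return result


root = ET.fromstring(DOC)
toc = table_of_contents(root)
page_list = pages(root)
print(toc)
print(len(page_list))
```

Both decisions depend only on element names, which is exactly the information a bold heading in a word processor file does not carry.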
The DocBook format forces you to think about your document in this way. (Just like your English teacher once made you outline your essays.) In addition to learning tools and tags, you must learn to think of your documents this way. If you are converting existing documents you need to be able to create this type of structure from the original document. Technical writers and lawyers usually create documents this way, no matter what tools they are using. If you aren't comfortable with this approach or your documents are too free-form, then DocBook is probably not for you.
The payoff for this extra thought during the authoring process is fivefold:
There are four major software components (categories) in the DocBook publishing process: