Newsletter of the Society for Technical Communication, San Francisco Chapter
April/May 2006

February 2006 Meeting -- Single-source Publishing with DocBook XSL
Presented by Bob Stayton and reviewed by Mysti Berry


Bob Stayton's presentation on single-source publishing with DocBook XSL was lively and informative. He had a believable, complete answer for every question thrown his way, and we threw him as many as we could. His answers showed a depth and breadth of knowledge that only comes from having worked with a technology for years.

DocBook is an OASIS standard created specifically for technical documentation of all kinds. It consists of a DTD (document type definition, which describes the permitted structure of an XML document), related style sheets (documents that describe formatting for a specified set of documents), and the tools needed to process content into delivery formats. DocBook is based on XML, which stands for Extensible Markup Language, a subset of SGML. XML represents a document as a hierarchy of elements, which DocBook uses to structure content and store it separately from formatting information. The formatting is managed by style sheets written in XSL, a language created for expressing style sheets.
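
To make the idea of structure without formatting concrete, here is a minimal sketch in Python. The sample is a hypothetical DocBook fragment: article, title, section, and para are real DocBook element names, but the content is invented for illustration. The script only walks the element hierarchy with Python's standard library; it is not part of the DocBook toolchain.

```python
import xml.etree.ElementTree as ET

# Hypothetical DocBook fragment; note that it carries no formatting information.
SAMPLE = """\
<article>
  <title>Installing the Widget</title>
  <section>
    <title>Before You Begin</title>
    <para>Check that you have the power cable.</para>
  </section>
</article>
"""

def outline(element, depth=0):
    """Return one indented line per element, showing the hierarchy."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(SAMPLE)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Nothing in the fragment says how a title or paragraph should look. That decision is left to whichever style sheet is applied later.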

Who Should Single-Source?

First, Bob presented information about when to consider single-sourcing with any XML-based system. Companies are usually pleased with the return on investment if they meet these criteria:

In other words, a one-writer company or a company with few documentation deliverables may not need single-sourcing.

Why Should You Use DocBook?

The short answer to this question is -- because it works, and because it is (relatively) cheap. DocBook is a much cheaper system to implement than Arbortext's Epic, for example. Bob explained that DocBook provides a number of advantages:

Bob noted a few negative factors associated with XML-based single-sourcing:

One guest mentioned that DocBook uses thousands of style tags, and Bob admitted that it could be hard for people just starting out to figure out which subset of tags they want to use.

How Does It Work?

  1. Take content, and chunk heavily -- the smaller the topic, the more likely you will be able to re-use it.
     
  2. Design or modify style sheets to meet your company's requirements, and mark up content with the appropriate tags, using the tool of your choice, from something as simple as Notepad to XML-aware editors such as oXygen, Syntext's Serna, Arbortext Editor, or Blast Radius's XMetaL.
     
  3. Take your XML content files and your DocBook style sheet, and feed them to an XSLT processor to produce HTML files. XSLT is a language for transforming XML documents into other XML documents. XSLT processors include xsltproc from GNOME's xmlsoft.org, Saxon 6 from SourceForge, and Xalan-Java from the open source Apache XML project.
     
    Alternatively, you can take your XML content files and your DocBook style sheet, and feed them to an XSLT processor to produce an FO file, which you then send to an FO processor, which produces a PDF file. XSL-FO processors include XEP from RenderX, XSL Formatter from Antenna House, and FOP from the open source Apache XML project.
     
  4. If you have the time and money, you can automate your print production: use style sheets to flow content onto pages, create automatic tables of contents and indexes, and create automatic page breaks. If you include this step, then the writer is free from having to do formatting work, and can focus on writing.
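
The two processing paths in step 3 can be sketched as command lines, assembled here in Python. This is a sketch under stated assumptions, not Bob's setup: the file names (guide.xml and its outputs) and the docbook-xsl install directory are hypothetical, and it assumes xsltproc and Apache FOP are the chosen processors. The html/docbook.xsl and fo/docbook.xsl style sheets are the usual entry points in the DocBook XSL distribution. The script only builds and prints the commands; it does not run them.

```python
import shlex

# Assumed locations; adjust to wherever the DocBook XSL style sheets are installed.
DOCBOOK_XSL = "docbook-xsl"   # hypothetical install directory
SOURCE = "guide.xml"          # hypothetical DocBook source file

def html_command(source, xsl_dir, output="guide.html"):
    """xsltproc call that transforms DocBook XML into a single HTML file."""
    return ["xsltproc", "--output", output, f"{xsl_dir}/html/docbook.xsl", source]

def pdf_commands(source, xsl_dir, fo_file="guide.fo", pdf_file="guide.pdf"):
    """Two-stage print path: XML to XSL-FO with xsltproc, then FO to PDF with FOP."""
    to_fo = ["xsltproc", "--output", fo_file, f"{xsl_dir}/fo/docbook.xsl", source]
    to_pdf = ["fop", "-fo", fo_file, "-pdf", pdf_file]
    return [to_fo, to_pdf]

commands = [html_command(SOURCE, DOCBOOK_XSL)] + pdf_commands(SOURCE, DOCBOOK_XSL)
for cmd in commands:
    print(shlex.join(cmd))
```

To execute the pipeline rather than print it, each command list could be passed to subprocess.run(cmd, check=True), provided both tools are installed and on the path.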

DocBook provides standard publishing features such as front matter, graphics support, tables, glossaries, bibliographies, and indexes. It also provides support for special features such as profiling (conditional text), write-it-once modular writing, cross-references, and localization.
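
Profiling is easiest to see with a small example. The Python sketch below mimics the idea only. In practice the DocBook XSL style sheets do the filtering themselves, driven by parameters such as profile.os. The sample content is hypothetical, and the semicolon separator for multiple values is an assumption of this sketch. The os attribute is one of DocBook's standard profiling attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-source section with operating-system-specific paragraphs.
SAMPLE = """\
<section>
  <title>Starting the Program</title>
  <para os="windows">Double-click the desktop icon.</para>
  <para os="linux;mac">Run the start script from a terminal.</para>
  <para>The main window opens.</para>
</section>
"""

def profile(element, attribute, value, separator=";"):
    """Remove descendants whose profiling attribute excludes the requested value."""
    for child in list(element):
        marked = child.get(attribute)
        if marked is not None and value not in marked.split(separator):
            element.remove(child)      # marked for other audiences only
        else:
            profile(child, attribute, value, separator)
    return element

root = profile(ET.fromstring(SAMPLE), "os", "linux")
kept = [p.text for p in root.iter("para")]
print(kept)
```

Unmarked content appears in every output, while marked content appears only when its attribute includes the requested value. One source file can therefore yield a Windows guide and a Linux guide.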

Bob described XSL as "a funny little language. It takes a little getting used to." He also said, after the fourth time someone mentioned Arbortext, that it was in a different class of single-sourcing tool both by cost and function.

Visit Bob's Web site, www.sagehill.net, to learn more about single source publishing with DocBook. To learn more about XML, XSL, and XSLT, visit these sites:

Mysti Berry is a Senior Member of the STC, and a Senior Technical Writer for salesforce.com. Visit her site: www.mysti.us.

Copyright © 2006 by the Society for Technical Communication, San Francisco Chapter (www.stc-sf.org). This article may be reprinted in another STC publication under the provisions of the chapter's copyright policy.

