XSL-FO and Apache FOP

In 2004 I attended OSCON in Portland, Oregon and made the following notes after one of the sessions there...

Went to a session this morning on publishing documents using XSL-FO (Formatting Objects) and Apache FOP. The presenter was from a hospital that needed a mechanism simple enough for doctors to publish content without any special experience with markup languages or tools. Details follow.

The presenter outlined the first two incarnations of their publishing mechanism, both of which have since been replaced. The first was based on MS Word templates and styles, with an HTML export performed by VBA macros. This was apparently a disaster: formatting consistency was diabolical. They did manage to publish a number of documents, but the approach was soon abandoned.

The next approach was to use a Java Swing XML editor, with Java XML binding to store authored content in XML resource files. The XML would then be transformed into LaTeX using XSLT, and the LaTeX files rendered to PDF. The system was more usable by authors and had greater data integrity and reusability, but the LaTeX rendering was a little inflexible.

Their third and current approach was the one presented in more detail. In this case content is stored in XML but edited in a JTree-based (Java) XML editor (not open-sourced) written by the hospital's IT group. The UI is very simple and doctors adopted it easily. The saved XML content is published by transforming it into XSL-FO files (using XSLT), which are then processed by the Apache Software Foundation's (ASF) FOP processor into PDFs.
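The XML-to-XSL-FO step can be sketched with the XSLT processor that ships in the JDK. This is only an illustration under assumptions: the `<note>` document, the stylesheet and the class and method names are all hypothetical, since the hospital's real schema and stylesheets were not shown in the session.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class NoteToFo {
    // Hypothetical authored content; stands in for the hospital's XML files.
    static final String SAMPLE_XML =
        "<note><title>Discharge advice</title><para>Rest for two days.</para></note>";

    // Hypothetical stylesheet: maps <note> onto a one-page-master XSL-FO document.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:template match='/note'>"
      + "<fo:root>"
      + "<fo:layout-master-set>"
      + "<fo:simple-page-master master-name='A4' page-height='29.7cm' page-width='21cm'>"
      + "<fo:region-body margin='2cm'/>"
      + "</fo:simple-page-master>"
      + "</fo:layout-master-set>"
      + "<fo:page-sequence master-reference='A4'>"
      + "<fo:flow flow-name='xsl-region-body'>"
      + "<fo:block font-size='16pt' font-weight='bold'><xsl:value-of select='title'/></fo:block>"
      + "<xsl:for-each select='para'><fo:block><xsl:value-of select='.'/></fo:block></xsl:for-each>"
      + "</fo:flow>"
      + "</fo:page-sequence>"
      + "</fo:root>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the XSL-FO as a string.
    static String toFo(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Print the FO document; redirect to a file to hand it on to FOP.
        System.out.println(toFo(SAMPLE_XML));
    }
}
```

Saving that output as, say, note.fo and running something like `fop -fo note.fo -pdf note.pdf` with FOP's command-line launcher should then give the PDF; the file names here are made up for the example.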

Apache FOP can also output to a raster format (JPG), SVG and PostScript, so from the one source you can render to a format suitable for any particular printing platform.

There are a few limitations in ASF's implementation and in the XSL-FO 1.0 spec itself, though. One is that it only supports a single multi-column content blob per page-set. Another relates to spot colours and the lack of interchangeable use of CMYK and RGB colours. A third is that the look-ahead features are kinda brain-dead: for example, when laying out a table it doesn't parse the whole table to calculate how wide to render each column, so you need to specify column widths yourself (if it's critical).
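To make the column-width point concrete, here is a minimal sketch in Java that emits an fo:table fragment with every width stated up front, so the formatter never has to measure the content. The class name, method names and sample data are hypothetical, and the fragment assumes the `fo` prefix is bound to the XSL-FO namespace in the enclosing document.

```java
public class FoTable {
    // Builds an fo:table fragment with one fo:table-column per explicit width.
    static String table(String[] widths, String[][] rows) {
        StringBuilder sb = new StringBuilder();
        // table-layout="fixed" asks the formatter to use the declared widths.
        sb.append("<fo:table table-layout=\"fixed\" width=\"100%\">\n");
        for (String w : widths) {
            sb.append("  <fo:table-column column-width=\"").append(w).append("\"/>\n");
        }
        sb.append("  <fo:table-body>\n");
        for (String[] row : rows) {
            sb.append("    <fo:table-row>\n");
            for (String cell : row) {
                sb.append("      <fo:table-cell><fo:block>")
                  .append(escape(cell))
                  .append("</fo:block></fo:table-cell>\n");
            }
            sb.append("    </fo:table-row>\n");
        }
        sb.append("  </fo:table-body>\n</fo:table>\n");
        return sb.toString();
    }

    // Escapes the characters that would otherwise break the XML text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.print(table(
            new String[] {"4cm", "10cm"},
            new String[][] {{"Drug", "Dose & notes"}}));
    }
}
```

In a real pipeline the same fo:table-column elements would more likely come out of the XSLT stylesheet than from string building; the point is only that the widths are declared rather than computed.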

XSL-FO 1.1 is supposed to fix these, and should also support pretty much all the stuff that you might previously have used LaTeX for as an intermediate publishing format.

The hospital has gone on to publish many documents using this third evolution of the system with great success. The speaker was confident that Apache FOP is suitable for production environments despite still being in a pre-release stage. However, he did admit that something like DocBook (he said it was Way Cool!!) would basically have been enough for the whole project, except that the grammar, complexity and notation used by medical records made it impractical.

DocBook would therefore seem to be the better fit for the University environment, for general-purpose document authoring, publishing, and rendering to print or alternative formats.

