<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML
><HEAD
><TITLE
>The DocBook toolchain</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.7"><LINK
REL="HOME"
TITLE="DocBook Demystification HOWTO"
HREF="index.html"><LINK
REL="PREVIOUS"
TITLE="Other DTDs"
HREF="x101.html"><LINK
REL="NEXT"
TITLE="Who are the projects and the players?"
HREF="x168.html"></HEAD
><BODY
CLASS="sect1"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
>DocBook Demystification HOWTO</TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="x101.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
></TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="x168.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="sect1"
><H1
CLASS="sect1"
><A
NAME="AEN109"
></A
>6. The DocBook toolchain</H1
><P
>The easiest way to format and render XML-DocBook documents is to
use the <SPAN
CLASS="application"
>xmlto</SPAN
> toolchain.  This ships with
Red Hat; Debian users can get it with the command <B
CLASS="command"
>apt-get
install xmlto</B
>.</P
><P
>Normally, what you'll do to make XHTML from your
DocBook sources will look like this:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>bash$ xmlto xhtml foo.xml
bash$ ls *.html
ar01s02.html ar01s03.html ar01s04.html index.html
</PRE
></FONT
></TD
></TR
></TABLE
><P
>In this example, you converted an XML-DocBook document named
<TT
CLASS="filename"
>foo.xml</TT
> with four top-level sections into an
index page (which also holds the first section) and three further
pages.  Making one big page is just as easy:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>bash$ xmlto xhtml-nochunks foo.xml
bash$ ls *.html
foo.html
</PRE
></FONT
></TD
></TR
></TABLE
><P
>Finally, here is how you make PostScript for printing:</P
><TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><FONT
COLOR="#000000"
><PRE
CLASS="screen"
>bash$ xmlto ps foo.xml       # To make PostScript
bash$ ls *.ps
foo.ps
</PRE
></FONT
></TD
></TR
></TABLE
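><P
>The three invocations above can be gathered into one small script.
This is only a sketch: the function name build_all and the DRY_RUN
switch are made up for illustration and are not part of
<B
CLASS="command"
>xmlto</B
>.  With the default setting the script prints each command instead
of running it, so it is safe to try on a machine without the
toolchain installed.</P
>

```shell
#!/bin/sh
# Sketch: collect the xmlto invocations shown above in one place.
# build_all and DRY_RUN are hypothetical names, not part of xmlto.
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

build_all() {
    src="$1"
    for fmt in xhtml xhtml-nochunks ps; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "xmlto $fmt $src"
        else
            # Stop at the first format that fails to build.
            xmlto "$fmt" "$src" || return 1
        fi
    done
}

build_all foo.xml
```

<P
>Setting DRY_RUN=0 in the environment makes the same loop call
<B
CLASS="command"
>xmlto</B
> for real and stop at the first format that fails.</P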
><P
>Some older versions of <B
CLASS="command"
>xmlto</B
> may be 
more verbose, emitting noise like "Converting to XHTML" and so forth.</P
><P
>To turn your documents into HTML or PostScript, you need an
engine that can apply the combination of the DocBook DTD and
a suitable stylesheet to your document.  Here is how the
open-source tools for doing this fit together:</P
><DIV
CLASS="mediaobject"
><P
><IMG
SRC="figure2.png"><DIV
CLASS="caption"
><P
>Present-day XML-DocBook toolchain</P
></DIV
></P
></DIV
><P
>Parsing your document and applying the stylesheet transformation
will be handled by one of three programs.  The most likely one is
<SPAN
CLASS="application"
>xsltproc</SPAN
>,
the parser that ships with Red Hat 7.3 and later versions.  The other
possibilities are two Java programs,
<SPAN
CLASS="application"
>Saxon</SPAN
>
and
<SPAN
CLASS="application"
>Xalan</SPAN
>.</P
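><P
>To see what the transformation step amounts to, you can call
<B
CLASS="command"
>xsltproc</B
> yourself.  The sketch below is not what
<B
CLASS="command"
>xmlto</B
> does internally; it only shows the shape of the command.  The
stylesheet location in DOCBOOK_XSL is an assumption (the customary
URI of the DocBook XSL XHTML driver, which resolves locally only if
your XML catalogs map it), and xslt_command is a hypothetical helper
name.</P
>

```shell
#!/bin/sh
# Sketch: apply a DocBook XSL stylesheet with xsltproc directly.
# DOCBOOK_XSL is an assumption; point it at the XHTML driver of the
# DocBook XSL stylesheets installed on your system.
DOCBOOK_XSL="${DOCBOOK_XSL:-http://docbook.sourceforge.net/release/xsl/current/xhtml/docbook.xsl}"

# Print the xsltproc command that turns SRC into one XHTML file.
xslt_command() {
    src="$1"
    echo "xsltproc -o ${src%.xml}.html $DOCBOOK_XSL $src"
}

# Run the transformation only when the tool and the source both exist.
if command -v xsltproc >/dev/null; then
    if [ -f foo.xml ]; then
        xsltproc -o foo.html "$DOCBOOK_XSL" foo.xml
    fi
fi

# Always show the command, so the sketch is useful as a dry run too.
xslt_command foo.xml
```

<P
>Set DOCBOOK_XSL to a local path if your system has no catalog entry
for the URI; otherwise
<B
CLASS="command"
>xsltproc</B
> may try to fetch the stylesheet over the network.</P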
><P
>It is relatively easy to generate high-quality XHTML from
DocBook; the fact that XHTML is simply another XML DTD helps a lot.
Translation to HTML is done by applying a rather simple stylesheet,
and that's the end of the story.  RTF is also simple to generate in
this way, and from XHTML or RTF it's easy to generate a flat ASCII
text approximation in a pinch.</P
><P
>The awkward case is print.  Generating high-quality printed
output (which means, in practice, Adobe's PDF, the Portable Document
Format, a packaged form of PostScript) is difficult.  Doing it right
requires algorithmically duplicating the delicate judgments of a human
typesetter moving from content to presentation level.</P
><P
>So, first, a stylesheet translates DocBook's structural markup
into another dialect of XML &#8212; FO (Formatting Objects).  FO
markup is very much presentation-level; you can think of it as a sort
of XML functional equivalent of troff.  It has to be translated to
PostScript for packaging in a PDF.</P
><P
>In the toolchain shipped with Red Hat, this job is handled by a
TeX macro package called
<SPAN
CLASS="application"
>PassiveTeX</SPAN
>. It
translates the formatting objects generated by
<B
CLASS="command"
>xsltproc</B
> into Donald Knuth's TeX language.  TeX was
one of the earliest open-source projects, an old but powerful
presentation-level formatting language much beloved of mathematicians
(to whom it provides particularly elaborate facilities for describing
mathematical notation).  TeX is also famously good at basic
typesetting tasks like kerning, line filling, and hyphenating.  TeX's
output, in what's called DVI
(DeVice Independent) format, is then massaged into PDF.</P
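><P
>The stages described above can be looked at one at a time, because
<B
CLASS="command"
>xmlto</B
> accepts fo, dvi and pdf as output formats.  The following is a
sketch only: stage_output is a hypothetical helper that names the file
each stage is expected to leave behind, and each
<B
CLASS="command"
>xmlto</B
> call starts again from the XML source rather than reusing the
previous stage's file.</P
>

```shell
#!/bin/sh
# Sketch: stop the print chain at each stage so the intermediate
# FO and DVI files can be inspected.
# stage_output is a hypothetical helper, not part of xmlto.

# Print the file name that format $1 should produce for source $2.
stage_output() {
    echo "${2%.xml}.$1"
}

for fmt in fo dvi pdf; do
    # Only call xmlto when it is installed and the example file exists.
    if command -v xmlto >/dev/null; then
        if [ -f foo.xml ]; then
            xmlto "$fmt" foo.xml
        fi
    fi
    echo "$fmt stage: $(stage_output "$fmt" foo.xml)"
done
```

<P
>Comparing the FO file with the final PDF is a quick way to tell
whether a formatting problem comes from the stylesheet or from the TeX
back end.</P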
><P
>If you think this bucket chain of XML to TeX macros to DVI to
PDF sounds like an awkward kludge, you're right.  It clanks, it
wheezes, and it has ugly warts.  Fonts are a significant problem,
since XML and TeX and PDF have very different models of how fonts
work; also, handling internationalization and localization is a
nightmare. About the only thing this code path has going for it is
that it works.</P
><P
>The elegant way will be
FOP, a direct
FO-to-PostScript translator being developed by the Apache project.
With FOP, the internationalization problem is, if not solved, at least
well confined; XML tools handle Unicode all the way through to FOP.
Glyph-to-font mapping is also strictly FOP's problem.  The only
trouble with this approach is that it doesn't work &#8212; yet.  As of
August 2002 FOP is in an unfinished alpha state &#8212; usable, but
with rough edges and missing features.</P
><P
>Here is what the FOP toolchain looks like:</P
><DIV
CLASS="mediaobject"
><P
><IMG
SRC="figure3.png"><DIV
CLASS="caption"
><P
>Future XML-DocBook toolchain with FOP.</P
></DIV
></P
></DIV
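><P
>As a sketch of what the FOP path looks like from the command line,
assuming a FOP release that installs the
<B
CLASS="command"
>fop</B
> launcher script on your PATH: the stylesheet step writes an FO
file, and FOP renders that file to PDF.  The helper name fop_command
is made up for illustration, and the script only prints the two
commands rather than running them.</P
>

```shell
#!/bin/sh
# Sketch: the two steps of the FOP-based print path, as a dry run.
# Assumes the fop launcher from Apache FOP; fop_command is a
# hypothetical helper name.

# Print the fop command that renders FO file $1 to a PDF beside it.
fop_command() {
    fo="$1"
    echo "fop -fo $fo -pdf ${fo%.fo}.pdf"
}

src=foo.xml
fo="${src%.xml}.fo"

# Step 1: let xmlto stop at the FO stage.
echo "xmlto fo $src"
# Step 2: hand the FO file to FOP.
fop_command "$fo"
```

<P
>Run the two printed commands by hand once both tools are installed;
no TeX installation is involved in this path.</P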
><P
>FOP has competition.  There is another project called
<SPAN
CLASS="application"
>xsl-fo-proc</SPAN
>
which aims to do the same things as FOP, but in C++ (and therefore
both faster than Java and not relying on the Java environment).  As of
August 2002 <SPAN
CLASS="application"
>xsl-fo-proc</SPAN
> is in an unfinished
alpha state, not as far along as FOP.</P
></DIV
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="x101.html"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.html"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="x168.html"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>Other DTDs</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
>&nbsp;</TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Who are the projects and the players?</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>