Why the LaTeX Community
Should Care about SGML

William F. Hammond

https://www.albany.edu/~hammond/

TUG 2020
via Zoom
July 26, 2020

ABSTRACT

Given that the universal format-to-format translator Pandoc is coming of age, LaTeX authors are tempted to think that whatever LaTeX they write can quickly be translated without worry to whatever other format may be required.

Of course, that is not exactly true, but the use of an XML profile of LaTeX can make it exactly true. However, an SGML profile of LaTeX can provide closer emulation of classical LaTeX than an XML profile.

Most actors in the world of markup have restricted their use of SGML to XML. For that reason, software that handles SGML beyond the realm of XML seems to be falling out of maintenance. If the LaTeX community wishes to continue to avail itself of the advantages of SGML for LaTeX source emulation, it may fall to the LaTeX community to maintain the extant SGML libraries.

1.  Use the Best Source

2.  The Concept of LaTeX Profile

3.  Remark Aside on Accessibility

4.  Why SGML rather than XML?

5.  Example: A Simple Table

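+----------------------+----------------------+-------------+
| a rather long phrase |         five         | shorter bit |
+----------------------+----------------------+-------------+
|          shorter bit | a rather long phrase |        five |
+----------------------+----------------------+-------------+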

With the GELLMU didactic production system in “regular” mode, a benchmark for LaTeX profiles, there are two different choices of generalized LaTeX markup.

5.1.  Ready for XML

An HTML-style table written in generalized LaTeX:

\begin{display}
\begin{table}\tabarg{|r|c|r|}
    \trule
    \tr\td a rather long phrase \td five \td shorter bit
    \trule
    \tr\td shorter bit \td a rather long phrase \td five
    \trule
\end{table}
\end{display}  

The corresponding XML:

<display>
<table><tabarg><vbr/>r<vbr/>c<vbr/>r<vbr/></tabarg>
<trule/>
<tr><td> a rather long phrase </td><td> five </td><td> shorter bit</td></tr>
<trule/>
<tr><td> shorter bit </td><td> a rather long phrase </td><td> five</td></tr>
<trule/>
</table>
</display>
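
One practical consequence of having the table as XML is that any stock XML parser can read it. The following is a minimal sketch, not part of GELLMU, that uses Python's standard `xml.etree.ElementTree` module to pull the column specification and the cell contents out of the XML shown above (with each `trule` written as an empty element). The function names `column_spec` and `table_rows` are hypothetical, chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# The XML listing from this section, with every trule written as <trule/>.
XML_TABLE = """<display>
<table><tabarg><vbr/>r<vbr/>c<vbr/>r<vbr/></tabarg>
<trule/>
<tr><td> a rather long phrase </td><td> five </td><td> shorter bit</td></tr>
<trule/>
<tr><td> shorter bit </td><td> a rather long phrase </td><td> five</td></tr>
<trule/>
</table>
</display>"""


def column_spec(xml_text):
    """Return the alignment letters found between the <vbr/> elements."""
    root = ET.fromstring(xml_text)
    tabarg = root.find(".//tabarg")
    # Each alignment letter is the "tail" text that follows a <vbr/>.
    tails = [(vbr.tail or "").strip() for vbr in tabarg.findall("vbr")]
    return [t for t in tails if t]


def table_rows(xml_text):
    """Return the table body as a list of rows of stripped cell strings."""
    root = ET.fromstring(xml_text)
    return [
        [(td.text or "").strip() for td in tr.findall("td")]
        for tr in root.iter("tr")
    ]


if __name__ == "__main__":
    print(column_spec(XML_TABLE))
    for row in table_rows(XML_TABLE):
        print(row)
```

The sketch assumes that cells hold plain text only; cells with nested markup would need `itertext()` or a recursive walk instead of `td.text`.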

5.2.  Closer to classical LaTeX, ready only for SGML

The usual way of writing source for a LaTeX tabular environment uses “&” and “\\” as markup.

\begin{display}
\begin{tabular}{|r|c|r|}
    \hline
    a rather long phrase & five & shorter bit \\
    \hline
    shorter bit & a rather long phrase & five \\
    \hline
\end{tabular}
\end{display}

The corresponding XML:

<display><tabular><tabuhead><tabharg><vbr/>r<vbr/>c<vbr/>r<vbr/></tabharg>
<hline/></tabuhead>
<tabubody>
<taburow>
  <firstcell>a rather long phrase </firstcell>
  <tabampcell>  five </tabampcell>
  <tabampcell>  shorter bit </tabampcell>
</taburow>
<taburow>
  <firstcell><hline/>shorter bit </firstcell>
  <tabampcell>  a rather long phrase </tabampcell>
  <tabampcell>  five </tabampcell>
</taburow>
<taburow><firstcell><hline/></firstcell></taburow>
</tabubody>
</tabular>
</display>
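
To illustrate how closely this XML tracks the classical source, here is a minimal sketch, again not GELLMU's own tooling, that walks the XML above with Python's standard `xml.etree.ElementTree` module and regenerates the `tabular` source with “&” and “\\”. The function names `cell_text` and `to_latex` are hypothetical. The sketch assumes that an `hline` appears only in `tabuhead` or at the start of a `firstcell`, as in this example.

```python
import xml.etree.ElementTree as ET

# The XML listing from this section, copied verbatim.
XML_TABULAR = """<display><tabular><tabuhead><tabharg><vbr/>r<vbr/>c<vbr/>r<vbr/></tabharg>
<hline/></tabuhead>
<tabubody>
<taburow>
  <firstcell>a rather long phrase </firstcell>
  <tabampcell>  five </tabampcell>
  <tabampcell>  shorter bit </tabampcell>
</taburow>
<taburow>
  <firstcell><hline/>shorter bit </firstcell>
  <tabampcell>  a rather long phrase </tabampcell>
  <tabampcell>  five </tabampcell>
</taburow>
<taburow><firstcell><hline/></firstcell></taburow>
</tabubody>
</tabular>
</display>"""


def cell_text(cell):
    """Text of a cell; empty elements such as <hline/> contribute nothing."""
    return "".join(cell.itertext()).strip()


def to_latex(xml_text):
    """Rebuild classical tabular source from the XML (assumptions as stated)."""
    root = ET.fromstring(xml_text)
    # Column spec: a "|" for each <vbr/>, followed by its tail letter if any.
    vbrs = root.find(".//tabharg").findall("vbr")
    spec = "".join("|" + (v.tail or "").strip() for v in vbrs)
    lines = ["\\begin{tabular}{" + spec + "}"]
    if root.find(".//tabuhead/hline") is not None:
        lines.append("    \\hline")
    for row in root.iter("taburow"):
        cells = list(row)
        # A rule typed after "\\" is parsed into the next row's first cell.
        if cells[0].find("hline") is not None:
            lines.append("    \\hline")
        texts = [cell_text(c) for c in cells]
        if any(texts):  # skip the final row that carries only a rule
            lines.append("    " + " & ".join(texts) + " \\\\")
    lines.append("\\end{tabular}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_latex(XML_TABULAR))
```

Note how the second and third `hline` elements sit inside the following row's `firstcell`: in the source each one comes after a “\\”, so the parser has already opened the next row by the time it sees the rule.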

6.  SGML

7.  The OpenSP Library

8.  SGML may be losing users

9.  The OpenSP library needs maintenance

10.  SGML/XML-based transformations vs. Pandoc