The GELLMU Manual

William F. Hammond

Dept. of Mathematics & Statistics, University at Albany, Albany, New York 12222 (USA)

Email: hammond@math.albany.edu

ABSTRACT

This is the manual for Generalized Extensible LaTeX-Like Markup (GELLMU). The central focus in the GELLMU project is to tie LaTeX to the worlds of SGML and XML by providing LaTeX-like markup for writing documents under SGML and XML vocabularies (formally known as document types).

The Manual explains the distinction between basic and advanced use, provides a description of regular GELLMU as an instance of advanced GELLMU, and discusses the use of the didactic production system, which is the project's suite of processors for working with regular GELLMU.

The Manual also deals with the metacommands available when writing GELLMU markup. One of these metacommands is the project's emulation of LaTeX's newcommand, which makes it possible to have macros taking multiple arguments while writing for an SGML or XML vocabulary.

1  Introduction
2  Basic GELLMU
3  Metacommands
3.1  \documenttype
3.2  \newcommand
3.3  \begin{} … \end{}
3.4  \macro and \Macro
3.5  Macro-Level Fronting of SGML Element Names
4  Advanced GELLMU
4.1  Illustrations.
4.2  Multiple Argument/Option Syntax
4.3  Limitation in Regard to XML
5  The Didactic Document Type
5.1  Suggestions and Caveats
5.2  Markup Fundamentals
5.2.1  Explicitly named commands.
5.2.2  Certain single characters.
5.2.3  Certain strings
5.3  Large Structure
5.4  Sectioning
5.5  Labels, References, and Anchors
5.5.1  Labels and Sequencing
5.5.2  Anchors
5.5.3  Example Emulating a LaTeX Counter
5.6  General Usage for Sectional Units
5.6.1  The Content Model
5.6.2  The LaTeX-Like Form of General Usage
5.7  verbatim, verblist, and manmac
5.8  Accents
5.9  Tabular Environment Emulation
5.10  Graphic Inclusions
6  Mathematics in article
6.1  General
6.2  Assertions
6.2.1  Examples
6.2.2  Usage for assertion
6.3  Equations and Equation Arrays
6.4  The \mathsym Metacommand
7  The Didactic Production System
7.1  Permission
7.2  Materials
7.3  Other Required Software
7.4  Using the syntactic translator
7.4.1  Operation in Batch Mode
7.4.2  Interactive Operation
7.4.3  Interrupting the Syntactic Translator
7.5  Using the didactic production system
7.5.1  Staged Design
7.5.2  Default Staging
7.5.3  Parsing with nsgmls
7.5.4  Processing with sgmlspl
7.5.5  Environmental Variables
Appendix A  The GELLMU Archive
Appendix B  Release Notes
B.1  Comments on the Syntactic Translator
B.2  Comments on the Didactic Production System
B.2.1  Internationalization
B.2.2  Document Type Definitions
B.2.3  Translation to XML
B.2.4  Translation to HTML
B.2.5  Translation to LaTeX
B.2.6  Future Plans

1.  Introduction

GELLMU is an acronym for “Generalized Extensible LaTeX-Like MarkUp”, which is the author's concept for using LaTeX-like markup to write consciously for SGML document types such as HTML, DocBook, TEI, or GELLMU's own didactic LaTeX-like document type called article.

It evolved from earlier thought about delineation of a coherent subset of LaTeX commands with the property that if a LaTeX document used only those commands then it could be translated with full reliability to other formats including HTML so that documents could be prepared both for print and for the web from a single source.

Problems with this early thought during the years 1996–1997 included the fact that there did not seem to be a community of LaTeX users willing to focus on a narrow vocabulary, and the legacy practice, common at that time, of mixing LaTeX commands freely with non-LaTeX TeX commands.

The present idea crystallized in the summer of 1998 while the author was looking at Ulrich Vieth's LaTeX markup for the TeX Directory Structure (TDS) specification from the TeX Users Group (TUG), which now is physically realized in TUG's TeX Live series of TeX-related software distributions on CD-ROM. The HTML version of that specification was derived through an intermediate ad hoc translation from LaTeX to Texinfo, the language of the GNU Documentation System, which is a robust hypertext system pre-dating HTML driven by TeX the Program.

While thinking about generalizing Vieth's ad hoc translation, the author realized that the structure of Texinfo is very much like that of an authoring-level SGML document type. (The translation used GNU Emacs Lisp (Elisp), one of the most widely available free cross-platform programming languages, for which there is a free robust engine: the same engine that underlies the interactive editing interface of Emacs.) From that idea it was a small step to decide that one might profitably write Elisp code to use LaTeX-like markup for the conscious writing of a new LaTeX-like article document type.


Software associated with the GELLMU project falls into two parts:

1. The syntactic translator. This is a purely syntactic layer for converting LaTeX-like markup in configurable ways to SGML markup. Its output may be handled in standard SGML or XML systems.

2. The didactic production system. This component is a sketch, which might usefully serve as a base for further development, of an SGML production system for an authoring environment that consists of

   1. An SGML document type called article that is accompanied by a corresponding XML document type.

   2. A package of extensible translating utilities written in the language Perl.

   As the name suggests, these materials are didactic and should be regarded as unfinished for production work.

There are two overall concepts: basic mode and advanced mode. The basic mode may be used to write consciously for any standard document type such as HTML, DocBook, or TEI, and the syntactic translator is for that mode the only software under this project that might be relevant. The advanced mode incorporates a configurable broader array of LaTeX-like markup features, mostly for brevity, in the syntactic translator. This mode is fully developed only for use with a LaTeX-like document type such as the didactic article document type that is part of the didactic production system.

One may use any SGML or XML processing framework in working with the article document type. The didactic production system includes what is needed to produce both HTML and regular LaTeX forms of an article instance. Consequently, one is able to produce both DVI using the LaTeX format for TeX, the program, and PDF using the LaTeX format for PDFTeX, the program. Moreover, one may tune the PDF in various ways using small alterations of the Perl code for translating the XML version of article to regular LaTeX.

2.  Basic GELLMU

Neither the basic nor the advanced mode involves in any way adoption of the language of LaTeX. (But many command names under the didactic article document type mimic LaTeX command names.) There are two fundamental ideas:

1. A LaTeX-like command “\foo” corresponds to an SGML element “<foo>”.

2. The syntactic translator is almost entirely ignorant of vocabulary and a name like foo need not have meaning in it although it must have meaning in the document type for which one is consciously writing.1

To use the basic mode one must be familiar with the SGML document type for which one is writing. Ordinary HTML is an example. Very few of the features in the advanced markup not also part of the basic markup2 make any sense for use in the direct preparation of HTML with LaTeX-like input.

GELLMU uses LaTeX special characters such as ‘\’, ‘{’, and ‘}’ along with LaTeX-like argument/option syntax, where braces immediately following a command name indicate command arguments and square brackets, i.e., ‘[’ and ‘]’, indicate command options. A command corresponds to an SGML element, and in basic mode a command may have at most one argument, the content of which corresponds to SGML element content, and at most one option, the content of which corresponds to a list of SGML attribute specifications. Thus for example, in basic mode for HTML one may use the markup

 \a[href="http://www.w3.org/"]{The World Wide Web Consortium}

to form the HTML anchor:

 <a href="http://www.w3.org/">The World Wide Web Consortium</a> .

(The formation of anchors with the didactic article document type in advanced mode is slightly more complicated because the characters ‘=’ and ‘/’, which may acquire special (and “overloaded”) semantic significance in mathematical contexts, are held for delayed evaluation as empty elements and because the syntactic translator, which does not recognize command names, regards this usage in advanced mode as multiple argument/option syntax (section 4.2), which is not part of basic mode.)
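The basic-mode correspondence described above (command name to element name, option to attribute list, argument to element content) can be illustrated with a minimal sketch. This is not the syntactic translator, which is written in Elisp; the function name translate_basic and the regular expression are assumptions made for illustration, and the sketch handles neither nested braces nor nested commands.

```python
import re

# Hypothetical pattern for one basic-mode command:
# a backslash and a name, an optional [attribute list], then {content}.
# Nested braces and nested commands are deliberately not handled.
COMMAND = re.compile(r"\\(\w+)(?:\[([^\]]*)\])?\{([^{}]*)\}")


def translate_basic(source: str) -> str:
    """Rewrite each command as an element: name[attrs]{content} -> <name attrs>content</name>."""

    def to_element(match: re.Match) -> str:
        name, attrs, content = match.groups()
        open_tag = f"<{name} {attrs}>" if attrs else f"<{name}>"
        return f"{open_tag}{content}</{name}>"

    return COMMAND.sub(to_element, source)


print(translate_basic(r'\a[href="http://www.w3.org/"]{The World Wide Web Consortium}'))
```

Note that the name a is never looked up anywhere: the sketch, like the translator it imitates, works from syntax alone.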

An example of the distinction between basic and advanced GELLMU is that in advanced mode it is possible and easy to arrange to have a blank line, as in LaTeX, represent the beginning of a paragraph. In basic mode for HTML one must3 use “\p” to begin a paragraph, and for the XML version of HTML one must also provide markup for the end of every paragraph, which may be done in several ways.

For some of the details on using the basic markup with HTML see Using the GELLMU Syntactic Translator to Write HTML. It will be instructive to have the parallel source markup available at the same time.

3.  Metacommands

A metacommand is a LaTeX-like command that does not correspond to an SGML element. Each metacommand is handled internally by the syntactic translator.

3.1.  \documenttype

A document prepared in GELLMU source usually begins with a documenttype command. For example,

 \documenttype{html}

is used to begin a document prepared for the most common form of classical HTML.

The syntactic translator has two public variables, gellmu-doctype-keylist and gellmu-doctype-info, which are Elisp association lists (alists), that enable one to match XML or SGML “<!DOCTYPE ... >” declarations with LaTeX-like

 \documenttype[my-optional-key]{my-doctype}

commands, where my-optional-key is available to override a default key for my-doctype. Thus, for example, “\documenttype{html}” points to the default key for “html”, which is “html-4.01” and which points to the W3C HTML 4.01 Transitional document type, while

 \documenttype[xhtml-1.0s]{html}

indicates W3C XHTML 1.0 Strict.
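The two-step lookup just described (document type name to default key, key to declaration) might be pictured as follows. The dictionaries below are hypothetical stand-ins for the two Elisp alists, whose actual contents and layout are not shown in this manual; only the keys “html-4.01” and “xhtml-1.0s” and the document types they denote come from the text above. The declarations themselves are the standard W3C ones.

```python
# Hypothetical stand-in for gellmu-doctype-keylist: doctype name -> default key.
DOCTYPE_KEYLIST = {"html": "html-4.01"}

# Hypothetical stand-in for gellmu-doctype-info: key -> DOCTYPE declaration.
DOCTYPE_INFO = {
    "html-4.01": (
        '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
        '"http://www.w3.org/TR/html4/loose.dtd">'
    ),
    "xhtml-1.0s": (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
        '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'
    ),
}


def doctype_declaration(doctype, key=None):
    """Resolve documenttype[key]{doctype}; an explicit key overrides the default."""
    chosen = key if key is not None else DOCTYPE_KEYLIST[doctype]
    return DOCTYPE_INFO[chosen]


print(doctype_declaration("html"))                # \documenttype{html}
print(doctype_declaration("html", "xhtml-1.0s"))  # \documenttype[xhtml-1.0s]{html}
```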

A user may configure these variables without modifying the source code for the GELLMU syntactic translator, but minimal knowledge of Elisp will be required. A future release might provide a configuration file for this purpose.

A second option for the documenttype metacommand, which must follow the single required argument, is provided for writing an internal declaration subset. The contents of an internal declaration subset constructed this way may be any internal declaration subset material. However, some care is required for entering characters that are special. To ease the handling of special characters four metacommands have been provided for use inside the internal declaration subset:

For example, a user who wishes to be able in source to use “&quo;” to reference the ASCII quotation mark when writing consciously in basic mode for TEI.2 would begin the source file with:

\documenttype{TEI.2}[
\entity{quo "&#x22;"}
]

3.2.  \newcommand


The general construction of a newcommand definition is

 \newcommand{name}[nargs][first]{value}

where name is the name of the newcommand, nargs optionally specifies the number of its arguments, first is an optionally-provided default value for the first argument, and value denotes the value string.


Example. In writing HTML one might use

\newcommand{\href}[2][http://www.w3.org/]{\a[href="#1"]{#2}}

for brevity in writing many HTML anchors. With this definition the invocation

 \href{http://www.myweb.mydomain/me.html}{my web page}

gives rise to the HTML markup

 <a href="http://www.myweb.mydomain/me.html">my web page</a>

while the invocation “\href{Web Central}” gives rise to

 <a href="http://www.w3.org/">Web Central</a> .
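The substitution step in this example can be sketched in a few lines. The helper define_newcommand is hypothetical and is not part of GELLMU; it only imitates the behavior described above for a command whose first argument has a default, and it does not handle nested braces. Its output is the intermediate \a markup, which basic-mode translation then turns into the HTML anchor.

```python
import re


def define_newcommand(name, nargs, default_first, value):
    """Return a function that expands invocations of the command `name` in a string."""
    # A backslash, the name, then one or more brace groups (no nested braces).
    pattern = re.compile(r"\\" + re.escape(name) + r"((?:\{[^{}]*\})+)")

    def expand_one(match):
        args = re.findall(r"\{([^{}]*)\}", match.group(1))
        if len(args) == nargs - 1 and default_first is not None:
            args.insert(0, default_first)  # first argument omitted: use the default
        if len(args) != nargs:
            return match.group(0)  # unexpected argument count: leave the text alone
        # Replace #1, #2, ... in the value string with the collected arguments.
        return re.sub(r"#(\d)", lambda m: args[int(m.group(1)) - 1], value)

    return lambda source: pattern.sub(expand_one, source)


expand_href = define_newcommand("href", 2, "http://www.w3.org/", r'\a[href="#1"]{#2}')
print(expand_href(r"\href{http://www.myweb.mydomain/me.html}{my web page}"))
print(expand_href(r"\href{Web Central}"))
```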

Rules.

1. The name of a newcommand may not be referenced in its value string.5


Failure to observe these rules may cause the syntactic translator to enter an infinite recursive loop. If a user suspects this may have happened, then the invocation of the syntactic translator should be interrupted (section 7.4.3).

3.3.  \begin{} … \end{}

These provide emulation of LaTeX environment notation without actually providing anything that is not otherwise available. Markup which resembles that for a LaTeX environment simply resolves to an SGML element. This usage may be convenient for SGML elements of large scope such as, for example, the body of an HTML document.

With advanced mode the special form

 \begin{document} . . . \end{document}

may be used to emulate the corresponding feature of LaTeX for a document type, such as the didactic article document type, that in the large consists of a preamble and a body.

3.4.  \macro and \Macro

Use of these is discouraged in the absence of a need. One situation that presents a need is name “fronting”: see the discussion below in section 3.5. Please note that in most situations one may use newcommand without an argument for simple macro substitutions.


There are four primary differences between macro and Macro, on the one hand, and newcommand, on the other hand.

1. Neither macro nor Macro can be used to define a macro that takes arguments.

2. The name of a newcommand must consist of word characters, but there is no restriction on the characters, apart from unbalanced brace characters (‘{’ and ‘}’), that may appear in the name field of a macro or Macro metacommand.

3. If the name of a macro or Macro does not begin with the command sequence introducer, i.e., the character ‘\’, then an invocation of that metacommand is given by every forward match of its name. The use of such names is strongly discouraged because document segments can then become opaque much too easily.

4. A newcommand invocation, absent the use of a semi-colon for termination, is only effective at the whole word level — with word here denoting a maximal string of successive word characters — whereas macro and Macro invocations are effective regardless of word boundaries.


Human authors using either macro or Macro may find unanticipated interactions between the three forms of macro substitution.

Unbalanced brace characters, i.e., the characters ‘{’ and ‘}’, may not be used in the name field or the value field of any form of macro metacommand.

3.5.  Macro-Level Fronting of SGML Element Names

The word fronting as used here describes the practice of modifying the meaning of an SGML element name by using one or more of the macro facilities to generate usage of the same name as an element combined with other markup using the syntax that would otherwise correspond to basic use of the element.

Suppose, for example, one wants all paragraphs in HTML (marked with \p in GELLMU source) to be placed in a (style) class called custom.

Recommended procedure: Create a new unique name and then use macro to front it.

\newcommand{\cp}[1][]{\p[class="custom"]{#1}}
\macro{\p}{\cp}


With these definitions, source markup of the form \p{...} is handled as

 \p[class="custom"]{...} .

This will not intercept the alternate, otherwise nearly equivalent, markup given with \begin{p}\end{p} since newcommand is based on simple macro substitution and does not operate at the level of namespaces.

4.  Advanced GELLMU

The idea with advanced GELLMU is that for SGML document types sharing structural characteristics with LaTeX one might wish to have the syntactic translator provide LaTeX-like markup syntax beyond the level used with basic GELLMU and that these additional layers of syntax should be configurable. The only substantial realization of this program so far is the case of the GELLMU didactic document type called article. The specifics of that realization are discussed in the following section.

4.1.  Illustrations.

One might want to be able to use blank lines, as in LaTeX, for introducing new paragraphs in a document type that provides paragraphs.

In some article-level document types each sectional unit has a unit header providing markup for various, often optional, unit descriptors. It is convenient to be able to use LaTeX-like multiple argument/option syntax (section 4.2) for these.

If the document type provides authoring-level mathematical markup beyond inclusions of the World Wide Web Consortium's Mathematical Markup Language (MathML) under its XML namespace regime, then one might want to be able to use the ‘$’ character to toggle in and out of inline math. If the document type provides for math displays, then one might want to use, as in LaTeX, the strings “\[” and “\]” as delimiters for unstructured mathematical displays, markup such as  \begin{equation} . . . \end{equation} for a single equation, and markup such as  \begin{eqnarray} . . . \end{eqnarray} for a list of equations.

It is important to emphasize, however, that by overall system design the syntactic translator operates without substantial knowledge of vocabulary. If ‘$’ is to provide a toggle for an inline element containing math, say, tmath, then the syntactic translator does need to have that association made. But the association is provided as the value of a configuration variable in the syntactic translator that can be changed between documents, so the syntactic translator may be used with many document types.

One way to make such configuration convenient is to use an array of Elisp functions that are fronts with various variable configuration packages for the basic function gellmu-trans in the syntactic translator.

The general outline for advanced GELLMU with arbitrary document types is not fully developed in the present release. Instead the project has concentrated on the realization of these ideas for the project's didactic article document type, which is the subject of the next section.

As the syntactic translator stands now, basic mode is characterized in the syntactic translator by the true setting for its Boolean variable gellmu-straight-sgml, while the configuration used by default for the didactic document type (which could be handled in basic mode with more verbose source markup) has that variable set false and also the variable gellmu-regular-sgml set false.

The term regular GELLMU refers to use of the syntactic translator with the default configuration for the didactic document type. It involves nearly maximal emulation of LaTeX-like markup; it implies both advanced mode and the didactic document type (section 5) article.

4.2.  Multiple Argument/Option Syntax

An essential point in the present design is that the whole system is built from components, each of which has its own function6. Consistent with this design the syntactic translator operates with knowledge of syntax but little or no knowledge of language.

Multiple argument/option syntax has been built into advanced mode as part of the overall idea of providing, where sensible, LaTeX-like features in a precise user markup interface for writing in document types under SGML and XML.

What are the rules for converting the multiple argument/option syntax in source markup into SGML? Direct conversion by the syntactic translator of this type of usage into XML is not available because such conversion requires some language knowledge and the program does not operate with knowledge of language at that level7. One obtains an XML version of a document in the didactic production system by using a translator with minimal knowledge of the command vocabulary to create the XML version from an SGML version that is the immediate output of the syntactic translator.

In multiple argument/option syntax, which is much like that of LaTeX, arguments and options follow command names. Arguments are delimited by braces, i.e., ‘{’ and ‘}’ and options by square brackets, i.e., ‘[’ and ‘]’. There must be no white space between the arguments and options nor between the command name and the first member of an argument/option sequence.

Each command with a multiple argument/option sequence is translated to an open tag whose name is the name of the command. Each argument is translated to an ag0 element and each option to an op0 element. (Both ag0 and op0 lie in GELLMU's reserved name space.) There are two exceptional cases.

1. The first argument or option is an option inside which the very first character is a colon, i.e., ‘:’. This is the method provided in advanced mode for the direct entry of an SGML attribute sequence.8 The entire contents of the option string, apart from the leading ‘:’, which is discarded, are understood to be a sequence of SGML attributes for the SGML element whose name is the name of the command. There is no syntax check of the attribute contents by the syntactic translator. Such an attribute option is not treated as an op0 element. In particular, an attribute option is correctly followed immediately by a semi-colon, i.e., the character ‘;’, if and only if the corresponding SGML element is a defined-empty element under the SGML document type. Since SGML attributes correspond to very little of classical LaTeX9, attribute options are seldom used10 in the didactic production system. One such use is for the GELLMU equivalent of LaTeX's equation* and eqnarray* environments, which is marked up this way:

 \begin{equation}[:nonum="true"]
 e = mc^2
 \end{equation}

to produce:

 e  =  mc^{2}

2. The first argument is the only argument and there are no options apart from a possible attribute option. This case, which is extremely common, is exceptional relative to argument/option handling since the sole argument simply becomes element content without an ag0 wrapper.
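The general rule and its two exceptions can be summarized in a short sketch. The function translate_command is hypothetical, and the exact shape of the SGML it emits is an assumption: the text above says only that arguments become ag0 elements and options become op0 elements. The autoclose parameter is likewise a guess at the effect of listing a name in gellmu-autoclose-list, a variable taken up in the next paragraph.

```python
import re

# One bracketed option or one braced argument (no nesting in this sketch).
GROUP = re.compile(r"\[([^\[\]]*)\]|\{([^{}]*)\}")


def translate_command(name, sequence, autoclose=frozenset()):
    """Translate the argument/option sequence that follows a command name."""
    groups = [
        ("op0", m.group(1)) if m.group(1) is not None else ("ag0", m.group(2))
        for m in GROUP.finditer(sequence)
    ]
    attrs = ""
    # Exception 1: a leading option that starts with ':' is an attribute list.
    if groups and groups[0][0] == "op0" and groups[0][1].startswith(":"):
        attrs = " " + groups[0][1][1:]  # the leading ':' is discarded
        groups = groups[1:]
    open_tag = f"<{name}{attrs}>"
    # Exception 2: a sole argument becomes plain element content.
    if len(groups) == 1 and groups[0][0] == "ag0":
        return f"{open_tag}{groups[0][1]}</{name}>"
    body = "".join(f"<{kind}>{text}</{kind}>" for kind, text in groups)
    # Assumption: only names in the autoclose set receive a closing tag here.
    close_tag = f"</{name}>" if name in autoclose else ""
    return open_tag + body + close_tag


print(translate_command("frac", "{a}{b}"))
print(translate_command("equation", '[:nonum="true"]'))
print(translate_command("emph", "{sole argument}"))
```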

When a command has a multiple argument/option sequence, the question arises whether the ag0 and op0 elements that arise from the arguments and options are the only content of the element corresponding to the command. The syntax does not provide a way to determine this. On the other hand, the SGML document type definition does provide information that indicates whether other content is possible. It is beyond the scope of the design of the syntactic translator for it to read a document type definition. The syntactic translator does, however, have a configurable list variable gellmu-autoclose-list that contains the names of elements for which no content beyond the elements arising from arguments and options is possible.

It is not necessary that every such command be entered in this list. But when such a command not in the list is not explicitly followed by an element closing command, it is possible in some instances for an SGML parser to infer incorrectly the location of the end of the element. For example, the didactic production system provides a command anch for making anchors. The document type definition provides for one option, a reference, and one argument, the anchored text.11 Because the syntactic translator does not consider the document type definition, if one enters the markup

 \anch[href="http://www.w3.org/"]{W3C} HQ ,

unless the name anch is in the list12 gellmu-autoclose-list, an SGML parser will not have reason to close the anch until it sees the space following the anchored text “W3C”, and so that space will be considered insignificant white space with the result that there will be no space between the anchored text and the following “HQ”.

4.3.  Limitation in Regard to XML

A final general comment about advanced mode is that the features it can provide beyond basic mode when one is writing consciously for an XML document type are somewhat limited. For example, blank lines cannot easily be made adequate for paragraph markup with the XML form of the didactic article document type. Although it is not a specific limitation for future editions of the syntactic translator, the vision is that use of advanced mode will be specifically for a somewhat rich SGML version of a document type.

5.  The Didactic Document Type

The didactic document type is the document type underlying what is called regular GELLMU. It is the heart of the idea of GELLMU as a bridge for authors from LaTeX to the world of XML. More specifically, the bridge is from the world of a LaTeX article to a document type in the world of XML, also called article, that has a structure and a vocabulary similar to those of the LaTeX document class.13 The techniques used in the didactic production system are extensible and can be carried over to other types of documents than articles. It is important to note that there are many features in regular LaTeX which have no analogue so far in the development of this project. One might hope to get an idea of the extent of coverage by reviewing the examples in the project archive (appendix A).

When an author prepares a document as a LaTeX article, the document is being marked up as data for a specific typesetting program: Donald Knuth's program TeX running with the main LaTeX facilities loaded.

When an author prepares a document as GELLMU source, the syntactic translator provides a LaTeX-like markup interface, but its output is not data for a specific typesetting program. Rather it is data for a broad class of processors. This means that multiple output formats can be obtained from a single source without the need for human intervention because XML provides a framework that makes it relatively easy to create reliable programs for translation from an XML document type to other formats. It offers, moreover, the possibility of translation to future formats free of any need for human intervention once translators from the original document type to such formats are written. The small price one pays for this advantage in moving from LaTeX markup to GELLMU markup is that the author must learn a few new things.14

There are two formal constructions of the didactic document type. The name of the document type is article. The first construction is an SGML version of article that provides features convenient for authors that are not available under XML. The second is an XML version. For most non-technical purposes the two constructions should be regarded as equivalent.

The SGML construction of an article is derived from GELLMU source markup for a document by using the syntactic translator. The didactic document type is accompanied by a translator implemented under the Perl language framework sgmlspl by David Megginson (see the release notes in appendix B for more information) for converting the SGML version of an article to the XML version.

The description in this section of the manual deals primarily with source level markup for the didactic document type and with how it is handled 15 by the time the XML version of an article is generated. Secondarily there is comment on how it is rendered in the chief output formats of the didactic production system, which are regular HTML16 with math rendering facilitated by MathJax17, PDF, XHTML+MathML, and terminal window HTML (for limited screens).

A quick glance at the flowchart (section 7.5.2) shows that the first XML stage — author-level XML — may be viewed as a second entry point to didactic production system processing. Some day this could become a reasonable formatting route for translations from things like Texinfo, DocBook, and, even perhaps, classical LaTeX itself via a processor such as tex4ht.

5.1.  Suggestions and Caveats

Although this is the manual for a software release, it is not a book. A document of book size would be required for a full description of the didactic production system.

Much of the markup vocabulary is copied from LaTeX. There are some instances where there is some deviation from LaTeX usage, and many of those instances are mentioned here.

Definitive information about the didactic document type may be derived by consulting the document type definition. Because the didactic production system is conceived as a base for future development, there are sketches in the document type definition that are not covered by the didactic processors. For example, although the translation from SGML to XML contains sketched code for the analogues of LaTeX's paragraph and subparagraph commands, which are sectional in nature, there is no sketched code for these elements in the two formatters.

Another way to obtain information about the didactic production system is by studying examples including this manual and the examples in the project archive (appendix A).

5.2.  Markup Fundamentals

There are several kinds of commands:

5.2.1.  Explicitly named commands.

Apart from macro level metacommands an explicitly named command begins with a maximal string introduced by the character ‘\’ followed by word characters, including the numerals ‘0’, ‘1’, …, ‘9’. The notion of word character depends on one's locale, a concept that is formalized in GNU Emacs. In the ASCII character set the word characters are the 52 upper and lower case letters and the 10 numerals. The first numeral, if any, must not be ‘0’ since such names are reserved for use by the syntactic translator. Command names are case sensitive.

An explicitly named command is a container, corresponding to an SGML element, if its name is immediately followed, without intervening white space, by the character ‘{’. In that case the delimited zone of containment normally ends with the subsequent balancing character ‘}’. (LaTeX-like multiple argument/option (section 4.2) chains deserve more discussion; for now it will suffice to point out that the use of the \anch command in this document for making “anchors” is an example, and, of course, LaTeX's \frac command is another example. For the present discussion these commands are considered to be containers.)

An explicitly named command corresponds to an SGML defined-empty element if its name is immediately followed, without intervening white space, by the character ‘;’.

An explicitly named command corresponds to an SGML element closing tag if its name is immediately followed, without intervening white space, by the character ‘:’.

The name of an explicitly named command is terminated by a non-word character. There is a small, possibly acceptable, level of syntactic ambiguity unless the name terminator is one of ‘{’, ‘;’, ‘:’, or ‘[’.

In basic mode if the name terminator is ‘[’, then that character introduces a list of SGML attribute specifications, each of the form name="value", and the list must be terminated by the character ‘]’. Then if the following character is ‘{’, the named command is a container that ends with the balancing ‘}’. Otherwise the following character may be ‘;’ if the named command is a defined-empty element and must be so in that case for direct editing of an XML document type.

In advanced mode if the name terminator is ‘[’, then that character introduces a LaTeX-like command option — part of the emulation of LaTeX's multiple argument/option syntax (section 4.2) — unless it is immediately followed, without intervening white space, by the character ‘:’, in which case the bracketed content is a list of SGML attribute specifications. (The initial ‘:’, which may be used optionally in basic mode, is discarded.)

In any other case there is some syntactical ambiguity. The syntactic translator will produce a corresponding SGML open tag unless the logical variable gellmu-xml-strict has been set.18 If the usage is consistent with the structure of the document type, an SGML parser will in many cases be able to handle the result correctly. The result of this type of syntactic ambiguity in source markup is not tolerated if one is editing directly for an XML document type. The terminator can be a blank space, but, if so, the blank space is likely to become invisible after SGML parsing much in the way that in LaTeX the markup

 \LaTeX document

will be collapsed into the single word form “LaTeXdocument” when typeset.19
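The role of the name terminator can be summarized with a small classifier. This is only an illustration under the ASCII notion of word character given above; the function classify and its category labels are invented for this sketch and do not correspond to anything in the syntactic translator.

```python
import re

# A backslash, a maximal run of ASCII word characters, then the terminator (if any).
NAME = re.compile(r"\\([A-Za-z0-9]+)(.?)", re.S)

KINDS = {
    "{": "container",
    ";": "defined-empty element",
    ":": "closing tag",
    "[": "option or attribute list",
}


def classify(source):
    """Return (name, kind) for the command at the start of `source`, or None."""
    match = NAME.match(source)
    if match is None:
        return None
    name, terminator = match.groups()
    first_numeral = re.search(r"[0-9]", name)
    if first_numeral is not None and first_numeral.group() == "0":
        return name, "reserved name"  # e.g. ag0 and op0
    return name, KINDS.get(terminator, "ambiguous open tag")


for sample in (r"\p{text}", r"\hr;", r"\p:", r"\a[href]{x}", r"\LaTeX document", r"\ag0{x}"):
    print(sample, "->", classify(sample))
```

The last two samples show the ambiguous case discussed above and a name from the reserved space.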

5.2.2.  Certain single characters.

its sequence of convergents always has a limit.
\end{Thm}

to obtain:

Theorem 6.2.2.1.   If [ n_{1}, n_{2}, … ] is an infinite continued fraction, then its sequence of convergents always has a limit.

Notice that the asstid argument is merging the reference value for (the new) series XXseries with the visible id of the current sectional unit.

6.3.  Equations and Equation Arrays

The general usage for equation is the following:

The options key and series represent the same things as the corresponding label (section 5.5.1) options. In order to use the series option, a key option, which may simply be empty, must be present. The use of “equation*” as a name for an equation display that is not numbered is not permitted, but one may instead use the nonum attribute as follows:

 $$[:nonum="true"] . . .$$ .

General usage for an equation array (name eqnarray) is:

where the content is an eqnabody consisting of eqnrow's. Each row may be terminated in LaTeX-like markup, as in LaTeX, with the string “\\” and consists of three parts, corresponding to the elements eqnleft, eqncenter, and eqnright, that may be separated in LaTeX-like markup with the character ‘&’.

Support for numbering in eqnarrays is only minimally developed although there is suggestive sketching in the didactic document type that is not supported in the formatters. By default the equations in equation arrays are numbered consecutively throughout an article. This behavior can be altered by using the series attribute of an eqnarray. If that is done, then, as things have been sketched, numbering applies to whole arrays rather than to the equations within arrays. Numbering in an equation array may be suppressed by setting its attribute nonum to the string “true”.

6.4.  The \mathsym Metacommand

mathsym is a macro substitution metacommand that is available in the didactic production system for enabling an author to declare that a macro name represents a mathematical symbol. It is a formal way of recording statements commonly made by authors in introducing notation.

Unlike regular metacommands, which may appear at any point in GELLMU source, mathsym may only appear in the preamble of an article, or, equivalently with defaults in the syntactic translator, mathsym may only appear before the LaTeX-like “\begin{document}”.

Its usage is:

 \mathsym{ symbol-name }{ symbol-rendering }[symbol-meta-info ] .

Here symbol-name is an alphanumeric string (case-sensitive) beginning with a letter. The second argument is the presentation rendering of the symbol in GELLMU markup. It is like the definition of a newcommand except that it may not involve arguments.43 The optional third argument symbol-meta-info is an alphanumeric string that may also include a few other characters such as ‘/’, ‘-’, ‘,’, ‘.’, ‘*’, etc. Its exact structure depends on the typing system. (No typing system is part of the didactic production system.) For example, it might consist of (name, value) pairs for conveying meta-information about the symbol.

The syntactic translator replaces each invocation of a given mathsym with the specified rendering and writes for each mathsym definition a corresponding element in the SGML output whose content consists solely of the declared symbol name if there is no meta information but otherwise consists of the symbol name followed by a blank space and then whatever string of meta information is provided in the optional argument. Additionally, each invocation is wrapped in a rendering-inert Sym element whose key attribute reveals the name given to the symbol at the point of declaration (and by which the symbol is invoked). This makes it possible for a downstream authoring platform processor that has remembered the list of declared symbol names to match each invocation of a declared symbol with its associated meta information, if any, provided by the author in the symbol declaration.

A related feature in the didactic GELLMU document type is the mlg tag for marking mathematical logical groups. This is somewhat akin to the lgg tag for TeX-like logical groups, traditionally created in TeX markup with braces that are not attached to a command.44 As with lgg there is no obvious evidence of an mlg tag in a typeset rendering, but the presence of such a tag is intended as a signal to downstream mathematical parsers that the contents of the tag be given grouping priority as, say, with visible parentheses. Furthermore, the mtype and mml attributes of the mlg tag may be used to pass semantic information about the tag's contents to a processor.

7.  The Didactic Production System

The didactic production system is the suite of processors and technical support files underlying what is called regular GELLMU.

7.1.  Permission

The items of the didactic production system are copyrighted free software released under the GNU General Public License.

7.2.  Materials

The release contains everything originating with the author that is currently used in “building” GELLMU documents.

It also contains a slightly modified version of David Megginson's Perl module “SGMLS.pm” based on another slightly modified version that was furnished to me by Dave Holden in a very quick early 1999 response to my posted request for a modification that handles the labels provided (optionally) by nsgmls for SGML elements that are defined empty. A similar slight modification was also supplied a few days later by Vassilii Kachaturov and had been available at his web site.

Although the materials offered in this package aside from the syntactic translator pertain only to the didactic document type and the didactic production system, it should be understood that the larger design for GELLMU envisions other parties, on the one hand, building in various ways to extend the functionality of the didactic system, and, on the other hand, applying the methods of the didactic system to other document types and other formatting programs for those document types.

The basic items originating with the author, aside from the document type definition files (section B.2.2), are:

gellmu.el
the GELLMU syntactic translator, which makes SGML
xplaingart.pl
converts SGML to author-level XML
xmlgart.pl
converts author-level XML to elaborated XML
ltxgart.pl
translates elaborated XML to LaTeX
htmlgart.pl
translates elaborated XML to classical HTML and translates specially prepared XML to XHTML+MathML
mathcdata.pl
first of two special preparations for translation toward XHTML+MathML
mathprep.pl
second of two special preparations for translation toward XHTML+MathML
mval.pl
checks for certain types of MathML errors

Since some users will only be interested in the syntactic translator, additional description of these materials is found below in “Using the didactic production system” (section 7.5) and in the Release Notes (appendix B).

7.3.  Other Required Software

To make use of the GELLMU syntactic translator a user must have or separately acquire Emacs.45 (“Windows” users should look at the special FAQ.) Emacs is commonly found on GNU/Linux systems and on *ix systems. It may be found on other systems when provided by system managers.46

To make use of the didactic production system beyond the syntactic translator a user must have or acquire the following items of free cross-platform software:

• an ESIS-generating SGML parser such as found in the cross-platform package SP by James Clark, which has stood the test of years, or the newer variant OpenSP, which is internationalized, from the OpenJade Team.

• a complete TeX system, for which one may look to The TeX Users Group (TUG) or The Comprehensive TeX Archive Network (CTAN).

• xmlwf — a basic utility that is part of the release of James Clark's expat.

7.4.  Using the syntactic translator

This explains how to use the syntactic translator, which is the Emacs Lisp program contained in the file gellmu.el.

7.4.1.  Operation in Batch Mode

On Linux and other *ix systems a script like bin/linux/g2s will be adequate if your working directory is the distribution directory.47

 Usage: g2s  stem-name   [ function-call ]

For example, if “foo.glm” is the name of the source file, then the first argument should be “foo”. The optional second argument function-call is the name of the function in the Emacs Lisp package “gellmu.el” that is to be used. The function call defaults to “gellmu-trans”, which is the correct name for LaTeX-like usage under the didactic document type (section 5).
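
As a small illustration of the stem-name convention, the following shell sketch strips the “.glm” suffix before calling g2s. The helper name glm_stem is hypothetical and is not part of the distribution; the commented calls assume that the distributed g2s script is on the command search path.

```shell
#!/bin/sh
# Hypothetical helper (not part of GELLMU): turn a source file name such as
# "foo.glm" into the stem-name "foo" that g2s takes as its first argument.
glm_stem() {
    printf '%s\n' "${1%.glm}"
}

# Illustrative calls, assuming g2s is on PATH:
#   g2s "$(glm_stem foo.glm)"                # default function gellmu-trans
#   g2s "$(glm_stem foo.glm)" gellmu-trans   # the same function, named explicitly
```

A name without the “.glm” suffix passes through the helper unchanged, so it is harmless to apply it to a stem.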

There are also parallel scripts “g2h” and “g2x”.

“g2s” will byte compile “gellmu.el” if “gellmu.elc” is not present.
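
How g2s performs this check internally is not reproduced here. A check of the same kind might look like the following sketch, which uses the standard Emacs batch entry point batch-byte-compile; the function name ensure_gellmu_compiled and the EMACS override variable are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: byte-compile gellmu.el in directory $1 only when gellmu.elc is absent.
# EMACS may name the Emacs executable; it defaults to "emacs".
ensure_gellmu_compiled() {
    dir=${1:-.}
    if [ -f "$dir/gellmu.elc" ]; then
        return 0    # already compiled, nothing to do
    fi
    # batch-byte-compile is the standard Emacs batch compilation function.
    "${EMACS:-emacs}" -batch -f batch-byte-compile "$dir/gellmu.el"
}
```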

“g2h” runs the function gellmu-html for the case where the GELLMU file has been written directly for HTML. The file ghtml.glm and the derived file ghtml.html are examples.

“g2x” runs the function gellmu-xml for the case where the GELLMU file has been written directly for XML.

The directory “bin/win32” contains parallel, though more complicated, batch files for use in a “DOS” command line under “Windows”.

7.4.2.  Interactive Operation

Open GNU Emacs interactively on the GELLMU source file. When finished editing the source, save it but keep Emacs open. Then load the translator, for example with

 M-x load-file gellmu.el

and do

 M-x gellmu-trans .

(It is better to have byte-compiled “gellmu.el” and if the byte-compiled version “gellmu.elc” is in your Emacs load-path, then loading the library by name, as with

 M-x load-library gellmu

is faster.)

The SGML output should come up in a second buffer. Save that buffer to save the output.

Make any corrections or changes in the GELLMU source buffer and re-run

 M-x gellmu-trans .

As with batch operation, the functions gellmu-html and gellmu-xml may be handled in parallel with gellmu-trans.

There are a number of other functions besides these three for obtaining syntactic translation from GELLMU source to SGML. Each of these is, in fact, a front for calling gellmu-trans with various combinations of variable settings. There are a great many user configurable variables in the syntactic translator. Notable among these for regular GELLMU (section 4.1) are (1) gellmu-parb-nogo and (2) gellmu-autoclose-list. See the variable documentation text, available interactively with the key combination “C-h v” when the GELLMU library is loaded in Emacs, for more information. For a list of the names of all user configurable variables see the variable documentation for gellmu-public-vars.

For example, setting gellmu-verblist true causes a sequence of lines beginning with the line “\begin{verbatim}” and ending with the line “\end{verbatim}” to be considered verbatim as in LaTeX, i.e., without requiring escaped forms of special characters, and then to be set as a simple verblist, which is in most circumstances superior to GELLMU's version of pre-historic verbatim.48

7.4.3.  Interrupting the Syntactic Translator

Interruption of the GELLMU syntactic translator will be necessary in the event that the combined use of newcommand, macro, Macro, and mathsym (advanced mode only) leads to infinite recursive loops. Users should avoid the use of macro and Macro unless such use is absolutely necessary since these metacommands present greater looping risks.

Inasmuch as there are two ways to invoke the syntactic translator, there are two different procedures for interrupting it should that be necessary.

Batch mode invocation  This is the case when GNU Emacs is launched in batch processing mode to run the syntactic translator. To interrupt the syntactic translator in this case one must interrupt the Emacs process. The author does not know of any case when Emacs does not respond to standard interrupts. For example, on linux systems pressing “Control-C” when the process was launched from a shell provides a standard interrupt. If the process was launched in some other way, a normal “kill” addressed to the process should have the same effect.
Interactive invocation  This is the case when the syntactic translator is launched from within the GNU Emacs editing interface. Use the standard Emacs function “quit”, accessed with the key “C-g” (Control-G) to interrupt the syntactic translator.
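
For unattended batch runs, the “kill” described above can be automated with a time limit. The following is only a sketch under stated assumptions: the function name run_with_limit is hypothetical, the limit is arbitrary, and the command passed to it should be the Emacs process itself, since signalling a wrapper script does not necessarily reach the Emacs process that the script started.

```shell
#!/bin/sh
# Sketch: run a command in the background and send it a normal "kill"
# (SIGTERM) if it has not finished after LIMIT seconds.
# Usage: run_with_limit LIMIT command [args]
run_with_limit() {
    limit=$1
    shift
    "$@" &
    pid=$!
    # Watchdog: sleep, then signal the command if it is still running.
    ( sleep "$limit"; kill "$pid" 2>/dev/null ) &
    watchdog=$!
    status=0
    wait "$pid" || status=$?
    # Cancel the watchdog when the command finished first.
    kill "$watchdog" 2>/dev/null || true
    return "$status"
}
```

A command stopped by the watchdog reports the usual shell status for termination by a signal, so a driver script can tell a looping translation from a completed one.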

7.5.  Using the didactic production system

7.5.1.  Staged Design

The items in the didactic production system are components for use with staged processing. The document type may be used with any SGML system. Of course, one may not use a parser that is limited to XML with the SGML version of the document type. Moreover, if one makes use of features in the SGML version such as the positional argument and option elements, then one might want to provide translation to the XML version of the document type.

No particular processing system is required for the XML version of the document type. For example, one might profitably write an XSLT sheet for translation to some other format and then submit the document and the XSLT sheet to an XSLT engine such as xt, xsltproc, or saxon.

7.5.2.  Default Staging

The release includes bin/linux/lmkg and bin/linux/mmkg as example driver scripts for running the following sequence. (The bin/win32 directory contains old driver scripts for the MS Windows command line that might someday be worth updating.) The behavior of these driver scripts depends on the way they are called, though the specifics of this are somewhat different for lmkg than for mmkg. The older lmkg scripts do not generate XHTML+MathML at this point; they are provided primarily for backward compatibility.

The mmkg scripts by default make XHTML+MathML but not if called by a name, e.g., via a symbolic link, without the substring “mm”. If an mmkg script is called with a name ending in the string “froms” or “fromx”, then it will take as starting point, respectively, an SGML, i.e., “.sgml”, or author-level XML, i.e., “.xml”, version of the document. Thus, for example, mmkgfromx might be used to operate on a document that is an author-level XML translation from a non-GELLMU source.
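
Dispatch on the invocation name, as described for mmkg, is commonly written in shell with a case statement. The sketch below is not taken from the distributed scripts; the function names start_suffix and wants_mathml are hypothetical, and a real driver would pass the base name of “$0”.

```shell
#!/bin/sh
# Sketch of name-based dispatch in the style described for mmkg.
# Prints the suffix of the file at which processing would start.
start_suffix() {
    case $1 in
        *froms) echo .sgml ;;   # start from first-stage SGML
        *fromx) echo .xml  ;;   # start from author-level XML
        *)      echo .glm  ;;   # start from GELLMU source
    esac
}

# Prints "yes" when the invocation name contains "mm", that is, when
# XHTML+MathML output would be made by default.
wants_mathml() {
    case $1 in
        *mm*) echo yes ;;
        *)    echo no  ;;
    esac
}
```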

Caution. These scripts, like all shell scripts, should be examined for file system locations, system environmental variables, and other platform-specific and location-specific issues. The user who introduces a script on a platform should understand the script. A user who does not understand a script should not attempt to introduce it on a local platform.

Flow in the didactic production system is portrayed in the following diagram:

These are the stages in the didactic production system pipeline:

1. Prepare GELLMU source using a text editor.

2. Process the source with the syntactic translator to obtain an SGML document under the didactic document type.

3. Use nsgmls to validate the SGML document and obtain an ESIS for it as output.

4. Submit the SGML ESIS as input49 to the Perl program sgmlspl with the script xplaingart.pl as file argument, obtaining an author-level XML document.

5. Use nsgmls to validate the author-level XML document and then submit its ESIS as input to sgmlspl with the script xmlgart.pl, obtaining an elaborated XML document. This document, which is accompanied by several auxiliary files50, has things such as sectional unit numbers and cross references fully resolved so that there will be consistency in these across the various output formats.

6. Use nsgmls to validate the elaborated XML document and submit its ESIS as input multiply to sgmlspl:

1. with the script htmlgart.pl to obtain a classical HTML document that then will be validated if an HTML validation program is identified in the driver script.

2. with the script ltxgart.pl as file argument, obtaining a LaTeX document. The LaTeX document is then built with latex to make a DVI file and with pdflatex to make a PDF file.

3. for a pipeline using successive runs of sgmlspl with 3 scripts, mathcdata.pl, mathprep.pl, and htmlgart.pl (called in a special way) to make an XHTML+MathML file that is then checked for XML well-formedness using xmlwf, checked for certain kinds of MathML errors using sgmlspl with mval.pl, and validated if a suitable validation program is identified in the driver script.
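
The stages from SGML onward can be sketched in shell as follows. This is not the distributed driver: the catalog locations, the function names, and the output file names for HTML and LaTeX are assumptions, as is the premise that each script writes its main result to standard output. The XHTML+MathML branch and the latex and pdflatex runs are left out.

```shell
#!/bin/sh
# Sketch of stages 3-6 for one document, starting from STEM.sgml.
SGML_CATALOG=${SGML_CATALOG:-./sgml.cat}   # hypothetical path
XML_CATALOG=${XML_CATALOG:-./xml.cat}      # hypothetical path

# stage INPUT SCRIPT OUTPUT CATALOG [extra nsgmls options]
# Validates INPUT with nsgmls and feeds the ESIS to sgmlspl running SCRIPT.
# Note: a plain sh pipeline reports only the exit status of sgmlspl.
stage() {
    src=$1; script=$2; out=$3; catalog=$4
    shift 4
    nsgmls -c "$catalog" -l -oempty "$@" "$src" | sgmlspl "$script" > "$out"
}

build_from_sgml() {
    stem=$1
    stage "$stem.sgml" xplaingart.pl "$stem.xml"  "$SGML_CATALOG"      || return 1
    stage "$stem.xml"  xmlgart.pl    "$stem.exml" "$XML_CATALOG" -wxml || return 1
    stage "$stem.exml" htmlgart.pl   "$stem.html" "$XML_CATALOG" -wxml || return 1
    stage "$stem.exml" ltxgart.pl    "$stem.ltx"  "$XML_CATALOG" -wxml || return 1
}
```

Keeping each stage as a separate call leaves the intermediate “.xml” and “.exml” files on disk, which matches the staged design and makes it possible to restart from either of them.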

7.5.3.  Parsing with nsgmls

The program nsgmls is part of the SP package, which includes extensive documentation. Those familiar with it will want to ignore these hints.

Since for both the SGML and the XML versions of the didactic document type SP requires non-default SGML declarations, it is recommended that the user employ SGML catalogs, one for SGML and another for XML. The file system location of a catalog is conveyed to nsgmls as the value of its command line argument immediately following the argument “-c”.

Each catalog should contain an SGMLDECL directive giving the file system location of an SGML declaration. Aside from that, a catalog may contain a number of three-string lines of either of the following forms

where the quoted pathname, which may be relative to the location of the catalog, should for this context in each case be that of a DTD file.

It is recommended in each case that nsgmls be run with arguments “-l” (for propagating line number references) and “-oempty” (for flagging defined-empty elements). For processing the XML version of the didactic document type one should additionally use the argument “-wxml”.
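
As an illustration of a catalog and of a validation-only run, consider the sketch below. The file names gellmu.dcl and sgml.cat are hypothetical, the DOCTYPE entry is one of the standard SP catalog entry types, and ugellmu.dtd is the UTF-8 first-stage DTD listed in section B.2.2. The nsgmls argument “-s” suppresses the ESIS output so that only error messages are reported.

```shell
#!/bin/sh
# Sketch: write a minimal SGML catalog to the file named by $1.
# gellmu.dcl is a hypothetical name for the SGML declaration file.
write_sgml_catalog() {
    cat > "$1" <<'EOF'
SGMLDECL "gellmu.dcl"
DOCTYPE article "ugellmu.dtd"
EOF
}

# Validation only: CATALOG is $1 and the document is $2.
validate_sgml() { nsgmls -s -c "$1" "$2"; }
validate_xml()  { nsgmls -s -c "$1" -wxml "$2"; }
```

Relative pathnames in the catalog are resolved against the location of the catalog, so a catalog of this form can sit in the same directory as the files it names.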

Additionally, a user may wish to make locally-specific arrangements for the handling of character sets.

7.5.4.  Processing with sgmlspl

The program sgmlspl is part of David Megginson's SGMLSPM package. Megginson's extensive documentation for it may be found in the (December 1995) release found at CPAN.51

One uses sgmlspl by calling the Perl program sgmlspl with an ESIS as input and a script as argument. Additional arguments become arguments for the script.

Although some operating systems provide a way for dealing with a Perl program, which is stored in a text file, as an executable object, in other cases one must explicitly call the Perl engine as a program with an ESIS as input, the system name of sgmlspl as first argument, and (the system name of) a script as second argument. In both cases one will want to arrange, perhaps with an environmental variable or perhaps with the “-I” argument to the Perl engine, for the directory containing “SGMLS.pm” and its supporting module “SGMLS/Output.pm” to be in its path array @INC.
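
The explicit call of the Perl engine can be wrapped in a small shell function. In this sketch SGMLSPM_DIR and SGMLSPL are hypothetical variable names for local paths, and the default values are placeholders only.

```shell
#!/bin/sh
# Sketch: call the Perl engine explicitly, adding the directory that holds
# SGMLS.pm and SGMLS/Output.pm to @INC with -I.
# The ESIS is read from standard input; the script is the first argument.
run_sgmlspl() {
    perl -I"${SGMLSPM_DIR:-.}" "${SGMLSPL:-./sgmlspl}" "$@"
}

# Illustrative use:
#   nsgmls -c sgml.cat -l -oempty foo.sgml | run_sgmlspl xplaingart.pl > foo.xml
# The environment variable PERL5LIB is the other standard way to extend @INC.
```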

7.5.5.  Environmental Variables

There are a number of environmental variables that affect processing in the didactic production system. The names all begin with the string “GELLMU_”. Of course, the names are case-sensitive.

Many of these variables are set in the distributed driver scripts. When that is the case, the distributed driver scripts commonly check for a previous setting (which may, therefore, be easily placed in a fronting script that makes a setting and then just calls the distributed driver).
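
The check for a previous setting is usually written in shell with the default-assignment form of parameter expansion. The sketch below shows the idiom only; the function name gellmu_defaults and the default values are hypothetical and are not those of the distributed driver scripts.

```shell
#!/bin/sh
# Sketch of the "keep a previous setting" idiom: each variable receives a
# default only when it is unset or empty.
gellmu_defaults() {
    : "${GELLMU_Dir:=/usr/local/gellmu}"
    : "${GELLMU_StyleDir:=../webstyle}"
    : "${GELLMU_Memoir:=0}"
    export GELLMU_Dir GELLMU_StyleDir GELLMU_Memoir
}

# A fronting script makes its own settings and then calls the driver, e.g.:
#   GELLMU_Memoir=1 GELLMU_PAPER=a4paper mmkg "$@"
```

Because a value already present in the environment wins, the fronting script does not need to know which other variables the driver sets.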

Setting environmentals can be difficult in a non-Unix-like operating system environment. This is one reason why the author generally recommends that Windows users install GELLMU under Cygwin.

GELLMU_Dir
The top of the directory tree where GELLMU is installed.
GELLMU_StyleDir
URI or directory location of CSS and XSLT style sheets that is used by the didactic production system in writing links in XML, HTML and XHTML+MathML files. The value usually has a different meaning under the eye of a web server than in a local file system. A relative URI or path is usually best. A value like “../webstyle” can often be made to work both ways.
GELLMU_CSSName
Name of the CSS stylesheet that the didactic production system links from HTML and XHTML+MathML files. If the name is given in relative syntax, it is taken relative to the value of GELLMU_StyleDir.
GELLMU_XhtmlSuffix
Suffix given to XHTML or XHTML+MathML files written by htmlgart.pl.
GELLMU_MathJaxURL
URL for the version of “MathJax.js” used for the HTML5 + MathJax output. This defaults to the latest version on the MathJax CDN server.
GELLMU_NoUMSS
Value 0 or 1: if 1, signals to htmlgart.pl that it should not link to W3C's UMSS XSLT stylesheets.
GELLMU_UTF8
Value 0 or 1: signals to sgmlspl scripts that Perl's handling of the UTF-8 text encoding should be invoked. The meaning is subtly different between Perl versions 5.6 and 5.8.
GELLMU_Encoding
String value for text encoding that is set by xplaingart.pl in writing author-level XML and by xmlgart.pl in writing elaborated XML. (HTML, XHTML, and XHTML+MathML are always written with the UTF-8 encoding.)
GELLMU_LaTeXUTF8
Value 0 or 1: signals to latex and pdflatex to expect UTF-8 encoded text in their input.
GELLMU_LaTeXStyle
Pathname for LaTeX stylesheets that latex and pdflatex should use when such stylesheets are not properly positioned for TeX system KPSE-based location. (It's better to use a local or personal TDS tree.)
GELLMU_PAPER
String value for the paper used in printing; becomes an option for the documentclass command in the output LaTeX file.
GELLMU_Memoir
Value 0 or 1: if 1, use the memoir, rather than article, documentclass in the output LaTeX file.
GELLMU_DefaultEmptyEqncenter
Experimental. String value consisting of a small bit of LaTeX wrapped as a Perl string to use in tweaking the LaTeX-rendered appearance of a GELLMU eqnarray (which is rendered in LaTeX using either align or aligned, depending on numbering arrangements) in the case of an empty middle cell (eqncenter). The current default value used in ltxgart.pl is the string “ \qquad ”. Be mindful of how such a string can be entered as a literal string in Perl or as part of an on-the-fly environmental setting from a command line shell.
Value 0 or 1. How to handle links in MathML output when writing XHTML+MathML. Such links, which are currently illegal inside XHTML+MathML math zones, are confined to \text{...} areas in GELLMU. If the value is 1, use XLink; otherwise, switch into the HTML namespace and write an HTML anchor. (Firefox handles both, while more of the other browsers seem to choke on the namespace switch than choke on the necessarily cumbersome use of XLink.)
GELLMU_OriginLabel
Name for an automatic label key, chiefly of occasional value for HTML and XHTML+MathML outputs, that, when this variable is present in the environment, places a link target, with id the value of this variable, at the top of the document.

Appendix A.  The GELLMU Archive

The GELLMU Archive is the web site http://www.albany.edu/~hammond/gellmu/.

It is the source for late breaking information about GELLMU. Among other things, it houses a largely uncommented archive of examples. This is provided in the belief that the study of examples is one of the quickest ways to learn a markup language.

Of course, this document, which is furnished with the release, is also an example.

Another item, also an example, that is housed in the archive is The GELLMU FAQ.

Appendix B.  Release Notes

This version of the manual was prepared for release 0.8.5 in July 2007. The GELLMU materials (section 7.2) consist of:

1. The manual, which is this document.

2. The GELLMU syntactic translator, an Emacs Lisp program, which is the only item of software required for those who simply wish to use GELLMU markup for the conscious preparation of HTML documents or documents under some other classical SGML or XML document type for which the user is otherwise equipped.

3. The GELLMU didactic production system, which consists of an SGML document type called article, an XML document type also called article, and three separate collections of Perl functions for the well-known Perl SGML processing framework sgmlspl by David Megginson. A very slightly modified version of Megginson's Perl library SGMLSpm that provides a method for detecting defined-empty SGML elements, as flagged in an SGML parse stream in ESIS format, is included as part of the didactic production system. Since it is by size 60% of the software content of the Megginson package on CPAN, the rest of the package, licensed under the GNU General Public License, is distributed with the didactic production system as well, though without its documentation. The distribution includes 7 scripts for use with sgmlspl in the didactic production system pipeline. For more on this see “Using the didactic production system” (section 7.5).

B.1.  Comments on the Syntactic Translator

The GELLMU syntactic translator is more mature than the other components. The following comments pertain to it.

internationalization

Internationalization has a considerable and evolving level of support in Emacs. The concept is that an author resides in a locale. When the author enters a character from a locale, it gives rise in Emacs to a somewhat complicated multibyte entity that can have “properties”. Particularly relevant variables in Emacs are: coding-system-for-write and buffer-file-coding-system. GELLMU provides the user variable gellmu-sgml-default-coding, which should be properly coordinated via driver script settings with one's SGML parser.

inclusions

It is not actually a limitation that a GELLMU source file cannot be included in another. The primary reason is that one should make use of the inclusion mechanism of SGML. For that one needs to define the included pieces as entities in the direct internal subset52 of the source file and then reference each as an entity, e.g., “&sect2;”, at the appropriate location in the source file where it is to be included. Because the inclusion happens at the SGML level there are two points to observe:

1. Macro information is local to each source file.

2. The situation is optimal for the location of validation errors provided that one's parser reports such errors by filename and line number since the syntactic translator provides line number alignment between source and generated SGML.

A second reason is that source inclusion would disturb line number alignment between source and SGML output. This is important for the interpretation of SGML validation error messages. Such validation is considered routine, and plays an important role in detecting an author's mistakes. Some author errors are diagnosed in the syntactic translator.

A third reason, which at the same time might be considered also a disadvantage, is that all of GELLMU's macro facilities are local to each source file. This adds both robustness and flexibility at the price of the inconvenience of physical inclusion of common macro definitions.

variable management

This refers to the management of user variables in the syntactic translator. These are Elisp variables. One who is familiar with Elisp should be able to provide values in batch mode without making changes in the Elisp source.53 Setting values interactively in the Emacs editing interface can be done easily using “M-x set-variable”.

With a future release it is likely that additionally a user resource file for custom variable settings without the need for writing Elisp code will be provided.

Bugs

No serious existing problem is known in the GELLMU syntactic translator at the time of this release. Of course, as stated in source code comments, there is no warranty of any kind. Please report bugs to the author: hammond@math.albany.edu.

• Reserved element names. The GELLMU syntactic translator reserves for its own internal use all SGML or XML element names in which the first numeric character in the range “0–9” is the character “0”.

• Limitation on braces in macros. Unbalanced braces are not permitted in either the name or the value field of any form of macro metacommand.

• First cell limitation. In the LaTeX-like emulation of an array or tabular environment the first cell in each row must have something other than white space. Of course, sometimes no content is wanted, and then \empty (for nil markup, not to be confused with the mathematical \emptyset) is one way to handle it, but this author usually uses something that is mostly inconsequential like “~” or “\,”. Another way to handle it is to invoke the names of the SGML elements, i.e., \firstcell{} for tabular or \firstacell{} for array.

• Concept of advanced GELLMU immature. Inasmuch as didactic article is the only document type for which the idea of advanced GELLMU has been implemented, the general concept of advanced GELLMU is not fully developed in the GELLMU syntactic translator. Basic GELLMU is characterized in the GELLMU syntactic translator by the evaluation of the Boolean variable gellmu-straight-sgml to “true”. This automatically makes the Boolean variable gellmu-regular-sgml true. Full LaTeX-like support for the didactic production system is realized only by both of these Booleans being false. Thus, advanced GELLMU will need to evolve in the space in between, probably after the introduction of further such Boolean variables, some public and some private. This technique will make it possible for the code to continue performing as now when the variable gellmu-straight-sgml, the flag for basic GELLMU, is true and also when both of these flags are false.

• Reserved strings. The strings “<<” and “>>” have been reserved as future notation for mathematical objects. Although it might seem at first glance that this type of short hand has no place apart from the fully LaTeX-like environment of the GELLMU syntactic translator in the context of the didactic production system, in which they have not yet been used, it is actually not so clear that one could not make sensible use of such notation in the context of “XHTML plus MathML” under advanced GELLMU along with other features such as blank lines for new paragraphs and many other mathematical shortcuts. It awaits the further development of advanced GELLMU, and reserving this notation is necessary to ensure backward compatibility.

Consequently, for example, entering “<<a>>” is problematical, because only the first “<” or “>” will be converted to something appropriate. In basic mode “&lt;” and “&gt;” are one-step ways of circumventing this when these entities are available, which is guaranteed for any form of HTML as well as for any form of XML. In the didactic production system one should use “\ltc;” and “\gtc;”. For other cases the one-step circumvention is to use entity references to the numeric character codes, e.g., in ASCII “&#60;” and “&#62;”, and for convenience these may be brought up as macros, perhaps “\lt” and “\gt”.

B.2.  Comments on the Didactic Production System

The didactic production system is to be understood as a potential base for development. As such it is not intended ever to offer everything that might be imagined. The following comments pertain to it.

B.2.1.  Internationalization

Internationalization has been a concern of the project. It is possible, for example, to use the ISO-Latin-1 character ‘é’ in the name of an element. The didactic article document type offers, for example, an element étale, which is a style, parallel to bold. Use of the character ‘é’ as raw word character data with the didactic production system is less robust than the more careful “\acute{e}”54 construction, which is desirable for translation of article to formats that do not support latin1. For that matter, the exact extent of LaTeX's support of latin1 is a bit tricky, and the whole matter of internationalization is currently under review in the LaTeX project.55

B.2.2.  Document Type Definitions

Currently the project has one SGML document type definition and two XML document type definitions. Files under the various document types are suffixed as follows:

 First stage SGML    .sgml
 Author Level XML    .xml
 Elaborated XML      .exml

Additionally, in the three steps of processing to generate an XHTML+MathML file from an elaborated XML file there are two intermediate XML files generated, the first with suffix “.yml”, which lives under the document type definition for an elaborated XML document, and the second with suffix “.zml”, an XML file for which there is no extant formal document type definition.56

The author-level XML document type is formalized by the DTD "axgellmu.dtd", while the elaborated XML document type is formalized by a modification that is found in the DTD "uxgellmu.dtd". (The latter was the only XML document type used with the regular GELLMU production stream prior to October, 2006.)

The document type represented by "uxgellmu.dtd" is now called the elaborated XML document type.

The author-level XML document type is suitable as a translation target from other markups. The elaborated XML document type should not be used as a translation target other than from the GELLMU author-level XML document type.

All document type definitions are available under the UTF-8 text encoding. The two older document type definitions will continue to exist for a while under the Latin-1 (ISO-8859-1) text encoding. The text encoding of a so-called DTD file (not quite the same thing as a document type definition) is significant in regard to the names of SGML/XML entities and elements rather than in regard to document instances which might be processed. The names of the DTD files are:

                      Latin-1        UTF-8
 First stage SGML     gellmu.dtd     ugellmu.dtd
 Author Level XML     (none)         axgellmu.dtd
 Elaborated XML       xgellmu.dtd    uxgellmu.dtd
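A driver script that has to pick a DTD file might encode these names as a lookup keyed on document type and text encoding. The Python sketch below is only an illustration: dtd_for is a hypothetical name, and the absence of a Latin-1 author-level DTD is inferred from the statement that only the two older document type definitions exist under Latin-1.

```python
# DTD file names from this section, keyed by (document type, encoding).
DTD_FILES = {
    ("first-stage SGML", "latin-1"): "gellmu.dtd",
    ("first-stage SGML", "utf-8"): "ugellmu.dtd",
    ("author-level XML", "utf-8"): "axgellmu.dtd",
    ("elaborated XML", "latin-1"): "xgellmu.dtd",
    ("elaborated XML", "utf-8"): "uxgellmu.dtd",
}


def dtd_for(doc_type, encoding="utf-8"):
    """Return the DTD file name, or raise LookupError if none exists."""
    key = (doc_type, encoding.lower())
    if key not in DTD_FILES:
        raise LookupError("no DTD file for %s under %s" % key)
    return DTD_FILES[key]


print(dtd_for("elaborated XML"))
print(dtd_for("first-stage SGML", "Latin-1"))
```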

B.2.3.  Translation to XML

Presently the author-level XML files link to a CSS stylesheet that provides primitive rendering. One could go further in this direction, but the rendering of mathematics will remain limited without further development of CSS in that direction.

B.2.4.  Translation to HTML

Math in classic HTML.  The classic HTML output does not use graphic images for mathematical zones in the manner of programs like latex2html. Instead it uses pseudo-TeX notation for math. There are a number of reasons:

1. Well typeset mathematics is available in the modern form of HTML that is more precisely called XHTML+MathML.

2. Graphical images completely block accessibility in the sense of the World Wide Web Consortium's Web Accessibility Initiative.

3. The present classical HTML output files may be deciphered in terminal window browsers.

4. The present classical HTML output files may be “dumped” to plain text using a program such as lynx or w3m for various sometimes useful purposes.

Style.  HTML and XHTML+MathML made with the didactic production system now rely on CSS and, for some things, on level 2 CSS.

B.2.5.  Translation to LaTeX

This translator writes LaTeX2e. A number of packages are used, including graphicx, amsmath, amssymb, amsfonts, bm, url (not hyperref for the standard track, where the focus is on printed output), and inputenc for UTF-8 (which may be needed even if the GELLMU source or, otherwise, the author-level XML source is not UTF-8 encoded). Apart from current font availability issues, the author would have preferred to invoke the T1 font encoding.

Even though GELLMU source uses the names equation and eqnarray, in the LaTeX formatting amsmath constructions are used.

A small modification of this translator can be used to write Adobe's Portable Document Format (PDF) with pages sized for screens rather than for paper.

B.2.6.  Future Plans

This is a very limited list.

A literate document type definition.  This would be capable of spawning not only the five DTDs but also type definitions under other mechanisms such as, for example, RELAX NG.

Mathematical semantics.  Provision for optional semantic tightening sufficient for authors wishing to be able to export mathematical markup into computer algebra systems.

Footnotes

1. * The syntactic translator does, however, have some facilities for classifying the names in a list in regard to common syntactic behavior. See, in particular, the Elisp variables gellmu-autoclose-list and gellmu-parb-hold, both of which are not significant in basic GELLMU.
2. * With several minor exceptions, one related to the direct writing of SGML attributes (which cannot contain markup and which do not have many parallels in LaTeX) and another related to the way of escaping the character ‘\’, everything about basic mode also applies to advanced mode.
3. * There is a way, with the setting of several variables for the syntactic translator in advanced mode, to have blank lines begin new paragraphs in basic input for HTML.
4. * This means that it is a relatively slow form of processing, but the author believes that it is a good match for the intuitive expectations of most LaTeX users.
5. * In a future release an alternative metacommand called frontcommand may be provided which could be used, for example, if one wishes to have a macro name of some kind match the name of an actual SGML element.
6. * In the prototype production system based on the didactic article document type the output from each stage is available for examination and, where necessary, intervention. However, such use of intervention is intended only for temporary expedient use while a GELLMU system is being designed or enhanced. As with LaTeX, enhancement is an ongoing process.
7. * In handling GELLMU source markup one could provide a more elaborate processor that can be configured to know for each such command the list of names for its positional arguments and options. It was decided that this goes somewhat beyond syntactic handling but that the question of whether a list of arguments and options corresponds to sole content might be regarded as a syntactic matter.
8. * Its use is optionally permitted in basic mode as well.
9. * Indeed, LaTeX usage allows markup in options, but (element level) markup is not permitted in SGML attributes. Note, however, that in the didactic document type the SGML content model for an option is more restrictive than that for an argument. Also in the didactic document type some options, such as that for anch, which is described later in this section, are practically required.
10. * To say seldom is not to say not. Two important instances in the didactic production system are the series attribute for the label command, which stands in, to the extent possible, for the notion of counter in LaTeX, and the type attribute for the series command, which provides emulation of counter conversion from, say, number values to letter values.
11. * In the XML version of article the option becomes the element anchref and the anchored text becomes the element anchv. One may use these names directly in GELLMU source, but option/argument notation is more familiar and more succinct.
12. * There was no list of this type in early pre-release versions of the syntactic translator.
13. * The LaTeX concept of document class has only a loose correspondence with the SGML concept of document type.
14. * It is not inconceivable that at some future point conscious writing for some XML document types using LaTeX-like markup might be subsumed in the LaTeX project, but in saying this, the author is neither predicting it nor assessing the merits of the idea. He has no affiliation with the LaTeX project other than as a user.
15. * Strict discussion of an SGML document type would not allow use of the word handled. In this instance a coordinated pair of document types is being described, one SGML and the other an XML translation. For most purposes the SGML document type is the richer of the two. However, because of its use of a handful of generic elements (in its reserved namespace consisting of names that contain ‘0’ (zero) as first numeric character) for modeling certain convenience features of LaTeX, it is possible for a correct translation of a valid SGML article to yield an XML version that fails validation because the content models of the generic elements are necessarily loose.
16. * HTML, version 5, which, as of March 2011, is supported by the “big four” web browsers.
17. * MathJax, which was jointly developed by the AMS, SIAM, and Design Science, with support from other organizations, imposes no requirement on the user other than that of having a current web browser on a screen of sufficient size.
18. * The variable gellmu-xml-strict is by default unset in advanced mode.
19. * The correct LaTeX markup is “\LaTeX{} document”. In the didactic production system the name of LaTeX is “latex”, which is a defined-empty element, that for the SGML version of article may be marked up safely either as “\latex;” or as “\latex{}”.
20. * In math ‘[’ sometimes needs to be escaped to prevent confusion between its markup use and its ordinary use in an instance such as the markup for Z[t]. The syntactic translator would need to know vocabulary — at least the argument/option pattern for mathbb — in order to elude the syntactic ambiguity.
21. * “\~” is an example of a markup string that is defined in LaTeX (for an accent) that is not defined in the didactic document type. A LaTeX user may recover a prior markup habit of this type using newcommand possibly in combination with macro. For more information see the discussion of accents (section 5.8).
22. * It is possible to merge the inline math and tmath zones at any level of processing beyond the syntactic translator. These are indeed the same in LaTeX, but the syntactic translator resists the temptation here to go beyond syntax and merge them. With the didactic article document type the formatting to LaTeX inserts the LaTeX markup “\,” for a small horizontal space before and after math, but not before and after tmath.
23. * The syntactic translator simply outputs the SGML defined-empty element brk0, which belongs to its reserved name space. The dual use of brk0 involves some SGML chicanery that is resolved during translation to the XML version of the article document type, where tabular is converted to table and non-tabular use of brk0 is converted to brk. See also the handling of array, which is different even though the source markup, as in LaTeX, is similar.
24. * Alternate forms “\aos;”, “\aoq;”, “\aoe;” of sentence ending punctuation are provided that may be used following inline mathematical markup at the end of a sentence. Similarly, “\aoc;” is an alternate form for a comma.
25. * Or one could use:
\Section{\sectiontitle{Some...}
\para A para...  . . .}
instead of using the environment-like begin/end construction.
26. * In fact, classical SGML document types are often even more elaborate than this.
27. * Caveats:
28. * It is recommended that the characters in label key strings be restricted to lower case ASCII letters, the digits 0–9, and possibly the character ‘-’ or the character ‘.’ for maximal inter-operability with current and future formattings. For example, the ‘_’ is problematical in this context.
29. * Although one could provide SGML modeling for LaTeX's counters, it would not be very much along the lines of main track SGML or XML document types.
30. * The value strings may contain simple markup such as, for example, “\tld;” to provide robust multiple output processing of ‘~’ whereas an attribute option may not contain markup.
31. * A klabel is a “visible key” label.
32. * It may be left empty, but it must be present.
33. * The presence of sopt does not cause a table of contents to be produced automatically. For that one uses “\tableofcontents”. Moreover, the presence of sopt should have no effect upon a manually constructed “\TableOfContents”.
34. * The didactic production system offers a way to furnish a formally empty string, which is an empty element called empty in the document type for use in places such as the table of contents option of a sectional unit, where it is not otherwise possible to distinguish after parsing whether deliberately empty content was specified by the author. That is, the markup “\sopt{}” furnishes an empty string which, in turn, signals “no sopt”, while “\sopt{\empty;}” indicates that empty content was specified for sopt.
35. * When editing for HTML one may, of course, use HTML's pre, which stands for “pre-formatted text”.
36. * With a sufficiently long list of output format candidates each of the 33 non-alphanumeric printable ASCII characters is unsafe. However, one might use an external character-to-string conversion program to prepare a large amount of verbatim material for inclusion inside the simplistic verbatim command in GELLMU source.
37. * To export this procedure for general advanced mode usage, all of the names used need to be made user-configurable.
38. * Familiar short forms may be introduced by a user using the macro facility.
39. * For the third method one's GELLMU driver script must provide appropriately for the text encoding of the source file.
40. * This meaning of scale differs from the meaning of scale with includegraphics under LaTeX's graphicx package. New controls of this type may be introduced in a future version.
41. * Note that the use of lbalbr in this instance is insufficiently semantic for translation to content MathML while it is meaningful for translation to presentation MathML. Adequate enhancement might be had by providing mml="cases" (using a name from amsmath) as an attribute for lbalbr with this example.
42. * There is also a default counter that is used when no label series name is present. That counter simply is the position of the underlying assertion in the list of all assertions.
43. * However, a declared math symbol may be invoked in a newcommand that takes arguments.
44. * Such unattached braces in GELLMU markup lead to an lg0 tag in the output of the syntactic translator that is translated to an lgg tag in the XML version of the didactic document type.
45. * Version 20 or later should be adequate. Although the author began this project using version 19, he is no longer able to run tests with that version.
46. * It is an embarrassment of the business world in the years since 1985 that many business computing installations do not provide general purpose cross-platform programming systems despite the widespread availability of excellent free robust systems such as Emacs (for Lisp), gcc (for C), and Perl. This new phenomenon apparently arises not so much from lack of organizational interest but from the fact that the responsibility for maintenance cannot be passed beyond the local system manager to a vendor.
47. * None of the enclosed scripts either for linux or for win32 should be used without prior examination and verification for suitability.
48. * The main reasons that this version is not the default with a call to gellmu-trans are:
1. It breaks the paradigm under which a GELLMU command name is the name of an SGML element.

2. It breaks backward compatibility with earlier versions of the syntactic translator, i.e., breaks older documents.

3. It is felt that the user invoking verblist this way should be aware of what is being done.

Note that direct invocation of verblist requires escaping special characters. Thus, using the function call gellmu-verblist converts the name verbatim from a command name to a meta-command name.
49. * Specifically, this mention of “input” refers to what is called “standard input” in a command line situation. There may be a challenge here on platforms that do not provide a command line.
50. * Formally, two of these auxiliary files are considered part of the elaborated XML document.
51. * At the time of this release there was discussion in the UseNet newsgroup news:comp.text.sgml about a proposed revision of SGMLSPM by another party. The code for “SGMLS.pm” included in this GELLMU release contains a very small modification of Megginson's 1995 release.
52. * The direct internal subset is the content of the optional argument of the documenttype metacommand that follows its required argument. It should be noted in the didactic production system that the direct internal subset cannot be propagated to the XML form of article because it is digested by any standard SGML parser and, hence, by any translation based on a standard parsing. Thus, any pieces are merged in the XML form of an article although the translator xmlgart might be modified to construct an internal declaration subset there and provide partitioning of the XML version among filesystem pieces based on document structure.
53. * Please observe the rules of the LaTeX project regarding filenames as well as the license rules of the GNU General Public License if you wish to distribute a modified version of the Elisp source. Alternatively, the author is always interested in learning of suggestions for change.
54. * The corresponding usage in LaTeX would be “\'e”; this could be brought into GELLMU source using \macro, but it must be resolved to a name in the output of the GELLMU syntactic translator where everything that is markup needs a name. Rather than using a general container acute, the document type could have provided a name for the specific character.
55. * Alternatively LaTeX source can be submitted to the program lambda which is the LaTeX format for the program omega that is now under development as a substitute for Knuth's TeX, the program, with internationalization as a stated goal.
56. * There is no formal document type definition for a “.zml” file because such a file is endowed via XML attributes with information about tree structure for mathematical zones.
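
The recommendation in footnote 28 on label key strings can be checked mechanically. The Python sketch below tests whether a key uses only lower case ASCII letters, the digits 0–9, ‘-’, and ‘.’; the function name is_portable_label_key is hypothetical, and rejecting the empty string is an assumption rather than something the footnote states.

```python
import re

# Lower case ASCII letters, digits, '-' and '.', at least one character.
LABEL_KEY = re.compile(r"[a-z0-9.-]+")


def is_portable_label_key(key):
    """Return True if `key` follows the recommended character set."""
    return LABEL_KEY.fullmatch(key) is not None


for key in ("sec-5.5.1", "my_label", "Intro"):
    print(key, is_portable_label_key(key))
```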