Introductory User's Guide to Regular GELLMU

William F. Hammond

last corrections: July 18, 2012

1  Introduction
2  First Steps
2.1  Acquiring and Installing
2.2  Writing an Article
3  Basic Markup Within Sentences
4  Sectioning
5  Lists
6  Tabular Environment Emulation
7  Labels, Cross References, and Anchors
8  Mathematics
9  Beyond this Introductory Guide

1.  Introduction

Regular GELLMU is the part of the GELLMU Project under which source language markup most closely resembles actual LaTeX source markup. Much of the markup vocabulary is the same as that of actual LaTeX. The differences, and indeed the whole reason for the GELLMU Project, arise from the author's conclusion in 1998 that it would not be possible to write rules for actual LaTeX source markup robust enough that the collection of documents in the article document class prepared under such rules would admit fully reliable automatic translation to other formats.[1]

The version 0.8 release of the GELLMU Project was the first release providing an output stream for the XML form of HTML extended by the World Wide Web Consortium's Mathematical Markup Language (MathML). The driver script mmkg provides translations of regular GELLMU articles to (1) actual LaTeX (and from there to DVI and PDF), (2) classic HTML, and (3) XHTML+MathML, while the driver script mkg omits XHTML+MathML. Though the PDF output may be read on a screen, it is intended primarily for printing by end consumers.

While one could provide a landscape version of the PDF output with live anchors for screen consumption, the XHTML+MathML version would still be superior for screen reading since, as with ordinary HTML, the text is re-formatted to conform naturally both to magnification and window-resizing actions by the end user. The quality of the rendering of mathematics in the XHTML+MathML version is very nearly as good as that in the PDF version.

Indeed, the author, as an author, finds that the best venue for proofreading what he writes is XHTML+MathML when displayed in a suitable web browser[2] with larger than normal font size and narrower than normal line length.

Moreover, the XHTML+MathML version is fully compliant with the Accessibility Guidelines of the World Wide Web Consortium.

This document is an introductory user guide. The relation of regular GELLMU to the GELLMU Project overall is explained in the Manual, and advanced user information on regular GELLMU is found there as well. This document covers fundamentals not mentioned in the Manual. It is available in PDF, XHTML+MathML, and classical HTML versions. Online copies of this Introductory Guide and of the Manual may be found at CTAN and the GELLMU web site.

To use this document as an example for learning, refer to its source markup; to use the Manual as an example, refer to the Manual's source markup.

2.  First Steps

2.1.  Acquiring and Installing

GELLMU is implemented as cross-platform free software licensed under the GNU General Public License. Its package is available from the Comprehensive TeX Archive Network (CTAN) and from the GELLMU web site.

The package requires a number of other cross-platform free software packages including GNU Emacs, Perl, and some standard items of SGML/XML software, chiefly Open SP. Locations may be found in the Manual, key “requiredsoftware”.

Linux: The required packages are generally part of a full GNU/Linux distribution. The package should be installed in /usr/local/gellmu and symlinks to driver scripts should be made from a suitable place in one's command path.

MacOS-X and other Unix variants: The only difference from GNU/Linux is that some of the supporting packages may need to be acquired and installed.

MS Windows: The best strategy is to acquire and install a full Cygwin distribution. Then proceed as with Linux.

Additional information may be found in the Manual.

2.2.  Writing an Article

The minimal form of a regular GELLMU article is this:

\documenttype{article}
\title{Some title}

\begin{document}

First Paragraph.

Second Paragraph.

. . .

\end{document}

An article must begin with \documenttype{article}. The initial part of an article up to the required \begin{document} is the article's preamble, and the rest of the article is the article's body.

Almost everything in the body of an article must be part of a paragraph. Most typically a paragraph is begun with a blank line. Paragraphs themselves may be parts of sectional units.

Beware special characters. For example, ‘%’ converts the rest of its line to comment status. To use ‘%’ in text write “\%”. The special characters are, with small exceptions, the same as those in LaTeX. Note, however, that in dealing with multiple downstream formats any character that is not alphanumeric becomes potentially special. Therefore, there is a named form for every non-alphanumeric ASCII character. For example, “\tld;” is the name of ASCII ‘~’, which is sometimes handy when forming a URL.[3]
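As a brief illustration of the two escapes just mentioned, the following sketch uses “\%” for a literal percent sign and “\tld;” for a tilde in a URL. The URL is a made-up placeholder, and the use of “\tld;” inside the argument of urlanch (a command introduced in the next section) is an assumption based on the remark above about forming URLs rather than something confirmed here.

```latex
Prices rose by 5\% last year.  % the rest of this line is a comment
See \urlanch{http://www.example.org/\tld;user/notes.html} for details.
```

In the first line the escaped “\%” survives as text, while the bare ‘%’ later in the same line begins a comment.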

The names of special characters may be found by browsing the SGML document type definition in the file gellmu.dtd at the top of the distribution tree.[4]

3.  Basic Markup Within Sentences

Although there is no markup for a sentence as a unit, there is markup for the end of a sentence or a “full stop”. The normal way to indicate the end of a sentence is to use an ASCII period ‘.’ followed by either a newline or two or more blank spaces. An end of sentence is marked up as the SGML empty element eos. This is automatically generated under the condition indicated above. For most purposes it will not matter if an ASCII period is followed by only a single blank space, but in that case no eos will be generated.

The foregoing paragraph provides several instances of markup within sentences other than eos. There are (1) a quoted phrase, (2) two abbreviations, (3) a quoted character, and (4) two instances of emphasized text. The source for that paragraph is:

Although there is no markup for a sentence as a unit, there is markup
for the end of a sentence or a ``full stop''.  The normal way to
indicate the end of a sentence is to use an \abbr{ASCII} period
\quochar{.}  followed by either a newline or two or more blank spaces.
An end of sentence is marked up as the \sgml empty element \emph{eos}.
This is automatically generated under the condition indicated above.
For most purposes it will not matter if an \ascii period is followed
by only a single blank space, but in that case no \emph{eos} will be
generated.

The ``full stop'' is a marked up quoted phrase. The named form of a quoted phrase is quophrase. The markup \quophrase{full stop} is the equivalent expanded form. An abbreviation is indicated with the command abbr. There is, of course, no requirement that an abbreviation be marked up; it is merely good practice.

The name of markup for emphasized text is emph. A common alternate form of emphasis is called bold. One way the two differ is that emph has an effect of order two, i.e., emph within emph removes emphasis, while bold is not permitted within itself. Note that bold may be emphasized.
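The order-two behavior of emph can be seen in a small sketch such as the following. The sentence is invented for illustration, and how the inner text is actually rendered depends on the formatter in use.

```latex
\emph{Every word here is emphasized except \emph{these three words},
which revert to the surrounding unemphasized style.}
```

By contrast, writing bold inside bold in the same way would not be permitted, as noted above.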

In the cited marked up text above the instances \sgml and \ascii are user-defined newcommands occurring in the preamble (or at least before the invocations):

\newcommand{\ascii}{\abbr{ASCII}}
\newcommand{\sgml}{\abbr{SGML}}


The markup \quochar{x} produces a quoted character: in this case ‘x’. There is no compelling reason why quophrase could not have been used instead of quochar, but it is customary in technical documentation to use a different presentation style for quoted characters than for quoted phrases. Beyond that, for non-traditional forms of processing it can be very useful to provide for semantic distinctions such as the distinction between a quoted phrase in normal discourse and a quoted string in technical discourse. More particularly, it is significant that in some languages for computer programming there is an important distinction between a character and a string of unit length. Some of the markup commands useful with technical documentation are:

command   example in source               example rendered
quochar   use \quochar{\%} to comment     use ‘%’ to comment
quostr    \quostr{unsigned long num;}     unsigned long num;
qquostr   type \qquostr{done} to exit     type “done” to exit
path      \path{C:\bsl;TEX\bsl;TEXMF}     C:\TEX\TEXMF
urlanch   see \urlanch{userdoc.glm}       see userdoc.glm

With regular LaTeX, standard practice is to use texttt for the examples just cited involving quochar, quostr, path, and, apart from web-related considerations, urlanch. The name url has been used variously in regular LaTeX; therefore, url has not been introduced as a name in regular GELLMU, so that an author may safely use it, if desired, as a macro name.
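Returning to the end-of-sentence rule stated at the start of this section, a small sketch may help. It assumes only the rule as described there: a period followed by a newline or by two or more blank spaces generates eos, while a period followed by a single blank space does not. The sentences themselves are invented.

```latex
The first sentence ends here.  Two spaces follow its period, so eos is generated.
The period in Dr. Smith is followed by one space, so no eos is generated there.
```

The final period on each line is followed by a newline and therefore also generates eos. An abbreviation period should for the same reason not be left at the end of a source line.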

4.  Sectioning

Unless one wants fancy non-default behavior, one may provide sections, subsections, and subsubsections as in LaTeX. Thus, for example,

\documenttype{article}
\title{Some title}

\begin{document}

\section{Title of first section}
First Paragraph.

Second Paragraph.

. . .

\section{Title of second section}
First Paragraph.

Second Paragraph.

. . .

\end{document}

With this style of markup one does not indicate the end of a section, although there is a way, described in the Manual, key “secusage”, to gain finer control over sectional units.
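The example above shows only sections. A sketch of the deeper levels, assuming that subsection and subsubsection take a title argument exactly as section does, might look like this fragment of an article body (the titles are placeholders):

```latex
\section{Title of first section}
Introductory paragraph of the section.

\subsection{Title of first subsection}
A paragraph in the first subsection.

\subsubsection{Title of a subsubsection}
A paragraph in the subsubsection.

\subsection{Title of second subsection}
A paragraph in the second subsection.
```

As with sections, no markup closes a subsection; the next sectioning command at the same or a higher level ends it.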

5.  Lists

As with LaTeX, the GELLMU article document type provides lists with the names description, enumerate, and itemize. For example, the markup

\begin{enumerate}
\item bird
\item cat
\item dog
\end{enumerate}

yields:

1. bird
2. cat
3. dog

while the markup

\begin{description}
\item[Parrot] a type of bird
\item[Persian] a type of house cat
\item[Pointer] a type of dog
\end{description}

yields:

Parrot   a type of bird
Persian  a type of house cat
Pointer  a type of dog

A list is normally part of a paragraph, but a list may occur outside of a paragraph in a sectional unit or even at the top level of an article's body.

An item in a list may contain paragraphs but, as the foregoing examples show, may also contain loose text.
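The third list type named above, itemize, is not shown in the examples so far. A minimal sketch, assuming it follows the same pattern as enumerate with unlabeled items, is:

```latex
\begin{itemize}
\item bird
\item cat
\item dog
\end{itemize}
```

As in LaTeX, one would expect the items to be rendered with bullets rather than numbers, though the exact presentation is left to each output format.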

While LaTeX provides an excellent construction facility for customized lists through its list environment, which may be used in conjunction with its newenvironment command, there is no facility for the provision of on-the-fly markup in SGML. Therefore, the article document type provides several other kinds of list.

The lists called menu and Menu consist of items. These lists have no item labels. They differ only in intention for spacing in presentation formats.

It often makes sense for menu to be used inside an item of another list. Moreover, a single item menu will render in either the LaTeX or HTML formattings as “hanging indentation”.
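As an illustration of menu used inside an item of another list, consider the following sketch. It assumes that menu items are introduced with \item in the same way as items of the other lists; the text above says only that these lists consist of items, so the gellmu.dtd file remains the authority on the exact form.

```latex
\begin{enumerate}
\item bird
\begin{menu}
\item parrot
\item finch
\end{menu}
\item cat
\end{enumerate}
```

Here the two menu items appear without labels beneath the first numbered item.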

A defnlist is intended to be more like the list called dl in HTML than like description. It must consist of one or more term, desc pairs with term mandatory and desc optional.

There are also other lists that will not be mentioned here except to point out that when the name verbatim is enabled as a metacommand, rather than as a command, by calling the syntactic translator with a non-default function[5] such as verblist or latex-faq, the verbatim material is organized as a verblist.

For more information about lists, browse the document type definitions gellmu.dtd and xml/xgellmu.dtd, being mindful that default processing in the didactic production system sometimes does a small amount of list manipulation between the SGML and XML versions of an article.

6.  Tabular Environment Emulation

Most users having a LaTeX background will find that tabular with column arguments limited to ‘l’, ‘r’, ‘c’, and ‘p’ is reasonably functional. Improvements during 2006 were made possible by the increased availability to users of web browsers supporting version 2 of the cascading style sheets (CSS) specification.

For example, the following markup underlies a small lcc table in the section on mathematics.

\display{\begin{tabular}{lcc}
\bold{Name} & \bold{Opener}       & \bold{Closer}   \\
math        & \quostr{\bsl;(}     & \quostr{\bsl;)} \\
tmath       & \quostr{\$} & \quostr{\$}     \\
displaymath & \quostr{\bsl\lsb;}  & \quostr{\bsl\rsb;}
\end{tabular}}

(In this markup bsl is a name for the backslash character and lsb, rsb the names of the square brackets.)
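A second sketch, modeled on the markup above but using an ‘r’ column, may help to show the general pattern. The table content is invented, and wrapping the tabular in display simply follows the example above rather than any stated requirement.

```latex
\display{\begin{tabular}{lr}
\bold{Animal} & \bold{Count} \\
bird          & 3            \\
cat           & 12           \\
dog           & 7
\end{tabular}}
```

As in the example above, cells are separated by ‘&’, rows end with a double backslash, and the last row has no row terminator.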

There is presently no concept of “float”.

7.  Labels, Cross References, and Anchors

The basic idea is very simple. Use

 \label{mylabelkey}

to mark a location that is associated with the named label key, and use

 \ref{mylabelkey}

or one of several other methods to reference that location. The command ref may be regarded as the default method. In an online version, however, it has not been designed to create a selectable anchor but rather simply to indicate the section number.

For example, in this document there is a label with the key “math” at the beginning of the section (8) on mathematics. The previous sentence contains a reference with the markup \ref{math} to that section. This markup results in the number of that section appearing in the first sentence. For clarity the markup for the entire first sentence of this paragraph is:

For example, in this document there is a label with the key
\qquostr{math} at the beginning of the section (\ref{math}) on
mathematics.

If, on the other hand, one wants an online version to contain a selectable anchor to section 8, the section on mathematics, then in the preamble of the document one might place the newcommand definition

 \newcommand{\iref}[2]{\anch[iref="#1"]{#2}}

and then, as in the foregoing portion of this sentence, use the markup

 \iref{math}{selectable anchor to section \ref{math}} .

Note that this markup uses the label key “math” twice: once as the first argument to the newcommand called iref and once as the sole argument of the command ref that itself appears in the second argument of iref.
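Putting these pieces together, a complete small article using label, ref, and the iref newcommand might be sketched as follows. The label key “lists” and the section titles are hypothetical, and placing \label on the line directly after \section is an assumption carried over from LaTeX practice. The text above says only that the label sits at the beginning of the section.

```latex
\documenttype{article}
\title{Some title}
\newcommand{\iref}[2]{\anch[iref="#1"]{#2}}

\begin{document}

\section{Lists}
\label{lists}
Lists are described here.

\section{Tables}
Tables build on the material in \iref{lists}{section \ref{lists}}.

\end{document}
```

In a typeset version the reference would appear simply as “section 1”, while in an online version the same words would form a selectable anchor.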

On the question of whether to use ref or one of several other commands to make a reference, there are several considerations:

1. How should the reference be indicated in an online format that will be viewed in an ebook or a web browser?

2. How should the reference be indicated in a typeset format where there is neither a cursor nor the possibility of automatic scrolling?

More details on this subject may be found in the Manual, key “labelref”.

8.  Mathematics

GELLMU article provides five containers for mathematical content: math, tmath, displaymath, equation, and eqnarray. All may be referenced by name. In addition, LaTeX-like opening and closing markup is supported for the first three:

Name         Opener  Closer
math         \(      \)
tmath        $       $
displaymath  \[      \]

The containers math and tmath are essentially equivalent non-displayed containers. The current LaTeX formatter provides a tiny amount of extra horizontal space around math but not around tmath.
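For comparison, the same formula can be placed in each of the three containers that have LaTeX-like delimiters. This is a minimal sketch with an arbitrary formula; it assumes that simple superscripts are written as in LaTeX, as the markup later in this section suggests.

```latex
Inline with math: \(a^2 + b^2 = c^2\).
Inline with tmath: $a^2 + b^2 = c^2$.
Displayed with displaymath:
\[ a^2 + b^2 = c^2 \]
```

The first two lines should differ in LaTeX output only by the small amount of horizontal space that the formatter adds around math.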

${\mscript{E}{}{}{2}{\vect[]{p}{q}} = H^p\aF(B, H^q\aF(F))} \abuts H^{*}\aF(X) \ \eos$

In this markup the command \aF is not strictly necessary. It provides a deliberate way to indicate that, in the second instance above, the symbol $X$ is argument-like in relation to the symbol ${H}^{*}$. In MathML rendering aF gives rise to presentation-level MathML's invisible operator ApplyFunction.

9.  Beyond this Introductory Guide

More information about GELLMU is available in the Manual. Some examples of topics to be found there include:

The author tries to keep fresh information on the GELLMU web site.

Footnotes

1. This should not be construed to mean, however, that standard LaTeX processing could not incorporate this type of source markup at some time in the future.
2. The author prefers the freely available cross-platform browser Firefox (http://www.mozilla.com/) from the Mozilla Project.
3. In GELLMU source, as in LaTeX source, ‘~’ is markup for “non-breaking space”. Note further that ASCII tilde “\tld;” should not be confused with either the accent command “\tilde” or the math command “\sim” (rendered as $\sim$) that is typically used for mathematical operators.
4. The SGML document type definition is more elaborately commented than its XML mirror found in the file xml/xgellmu.dtd.
5. The most convenient way to call the syntactic translator with a non-default function is to make the first line of the source file point to the name of the special function. For example, the function “gellmu-verblist” will be called if the first line of the source file is “%!verblist”.
6. Similar intellectual property issues pertain more widely to fonts. In particular, for this reason by default in the XHTML+MathML output character coloring is used to cover possible font failures for the blackboard bold, calligraphic, and fraktur math fonts; the corresponding colors are, respectively, red, blue, and green.
7. Beyond that, in certain situations some web browsers appear not to deal well with empty table cells, and placing an HTML non-breaking space in such a cell is often a good thing for that reason. Of course, the special character ‘~’ for non-breaking space in GELLMU source gives rise in HTML output to the SGML non-breaking space character denoted with the entity reference “&#xA0;”.