% This looks like LaTeX; but it is really SGML.
% The "commands" are SGML elements, not TeX macros.
% A semi-colon at the end of a "command" forces command name termination.
\documenttype{article}
\baseloc{http://www.albany.edu/\tld;hammond/gellmu/mui/}
\date{October 22, 1999\\ Last re-processing \today}
\author{William F. Hammond}
\title{The Idea of a Markup Interface for \abbr{XML}}
\surtitle{Univ at Albany: W. F. Hammond: The \abbr{GELLMU} Archive}
% Now let's squeeze the DVI into two U.S. pages
\latexcommand{\bsl;setlength\{\bsl;topmargin\}\{-30bp\}}
\latexcommand{\bsl;setlength\{\bsl;textheight\}\{696bp\}}
\nobanner     % Suppress the "translated from GELLMU" advertisement
\nobaseprint  % Since the document network location is spelled out
              % explicitly below, don't provide it routinely for print.
\begin{document}

\section{The Value of a \latex;-Like Markup User Interface for \abbr{XML}}

A markup user interface (\abbr{MUI}) for an \abbr{XML} language
(formally \quophrase{\abbr{XML} application}) is a markup language
that admits a well-defined translation to the \abbr{XML} language.

Recent discussions in \urlanch{news:comp.text.tex} show that at least
two of us are thinking about \abbr{MUI}s for \latex;-like \abbr{XML}
languages. Jonathan Fine has been working on a \emph{roff}-like
\abbr{MUI} that he calls \quophrase{Active TeX}, and I have been
working on a \latex;-like \abbr{MUI} that I call \abbr{GELLMU}.

As I use the acronym \abbr{MUI}, I do intend it to be reminiscent of
the acronym \abbr{GUI} for \quophrase{graphical user interface}.
Both \abbr{GUI}s and \abbr{MUI}s have the intention of making life
easier, or perhaps at least more familiar, for some authors.
An \abbr{MUI}, unlike typical \abbr{GUI}s, has the possibility of
giving the author full and rigorous control over content.
Furthermore, an \abbr{MUI} in the style of a pre-existing
non-\abbr{XML} markup offers a convenient avenue for prototyping a
new \abbr{XML} language to model the markup practice in the
pre-existing markup. Beyond that, it offers a route for conversion
of legacy archives in the pre-existing markup to \abbr{XML} languages
with minimal human intervention.

Please allow me to say a bit more about what I have in mind for
\abbr{GELLMU}.

\section{The Basic \abbr{GELLMU} Processing Design}

The things that I have on hand, aside from \latex;, are:
\begin{enumerate}
\item \abbr{GNU} Emacs, version 20. (I believe that version 19 is OK.)
\item James Clark's \quostr{nsgmls}, a part of his \quostr{SP}.
\end{enumerate}

My processing set-up is the following:
\begin{defnlist}
\term{Syntactic translation from \latex;-like markup to \abbr{SGML}}
\desc My Elisp processor, which can be run interactively in \abbr{GNU}
Emacs or in batch mode, performs syntactic translation to convert
\latex;-like markup to an \abbr{SGML} language (formally, application).
The syntactic translator is largely ignorant of command names.
Whatever command names are used become the names of \abbr{SGML}
elements. There is a standard way to convert multiple argument/option
sequences. This processing stage traps syntax errors. (It will fail
to detect an even number of missing \quochar{\dol} characters; but
this error will show in the next stage.)
\term{Validating parse of the \abbr{SGML} language}
\desc At the validation stage the difference between \abbr{SGML} and
\abbr{XML} is significant if one wants to have \quophrase{math mode}
be a global toggle, since that may be modeled robustly only using
\abbr{SGML} exclusions. A validating parse is made using
\quostr{nsgmls}. Of course, the language definition, i.e.,
\abbr{SGML} application definition, is crucial. It is contained in an
\abbr{SGML} declaration and in an \abbr{SGML} document type definition
(\abbr{DTD}).
The language definition that I am using is the heart of my personal
production system. But I regard it only as didactic in the larger
scheme of things. Others will certainly want things that I do not
want. As far as it goes, it models \latex; rather closely in some
ways and rather loosely in other ways. Where it departs from close
modeling, the reason is usually related to having a document structure
that is not print-centric. The validating parse traps errors in
language use.
\term{Down-translation to \abbr{XML}}
\desc This is done with the program called \quostr{sx} in the family
of \quostr{SP} processors. While it is possible to recover an
equivalent \abbr{SGML} document from the down-translated \abbr{XML},
it is not possible under the \abbr{XML} umbrella to have as precise a
language definition as under the wider \abbr{SGML} umbrella. That
said, either form may be run through a processor that serves to
enforce a tighter language definition.
\term{\abbr{SGML} processing}
\desc This can go anywhere that is sane for the language definition.
One can use any programming language, but it helps to have a basic
\abbr{SGML} library on hand. I am using David Megginson's
\quostr{SGMLS.pm}, a Perl 5 library, and its interface
\quostr{sgmlspl}.

In my personal production system I routinely format \abbr{GELLMU}
\quophrase{articles} for both \latex; and \abbr{HTML}. Invalid
\abbr{HTML} and error messages from \latex; represent bugs in my
processors, which I always repair as soon as possible. There may be
box size complaints from \tex;; they represent authoring content
errors. These two formatters work either on the \abbr{SGML} or the
\abbr{XML} version of an article.

At present my personal production language definition is not fully
up to speed for journal articles or for translation to
XHTML-with-MathML, but it is serving me well for my classes. It gives
me a sane, bullet-proof way to produce course handouts and web
offerings from a single source.
At this time I simply cannot assume that more than about a third of my
students have easy access to \abbr{PDF} readers. So I feel
constrained to give them simple \abbr{HTML} with very limited
pseudo-TeX for math.

Other formattings for \abbr{GELLMU} articles that should be possible
include translations to (1) DocBook, (2) TEI, and (3) Texinfo, though
suboptimally so long as math is not available. These are not small
jobs, and I may never undertake any of them. (Any project that
undertakes to format an \abbr{XML} language in Texinfo should give
serious consideration first to modeling Texinfo as an \abbr{XML}
language. It would also be desirable to minimize Info/\tex;
bifurcation in XTexinfo and to provide math for XTexinfo.)

If a time arrives when I can assume that three-fourths of my students
have out-of-the-box browsers for XHTML-with-MathML, then I should be
able to format the original \abbr{GELLMU} source directly for that,
if by that time I am not able to squeeze it out of carefully-set
LaTeX-4. (I exaggerate somewhat. In principle, I need to provide
some math symbol declarations in the sources, and math symbol
declaration handling is not yet present in my set-up.)
\end{defnlist}

\section{About this document}

This document was prepared as a \abbr{GELLMU} \quophrase{article}.
A copy of the \abbr{HTML} version of this document is available on the
web at
\display{\urlanch{http://www.albany.edu/\tld;hammond/gellmu/mui/}}
along with a full list of anchored versions as follows:
\begin{menu}
\item \abbr{GELLMU} source: \urlanch{mui.glm}.
\item \abbr{SGML}, \quophrase{article} document type: \urlanch{mui.sgml}.
\item \abbr{XML}, \quophrase{article} document type: \urlanch{mui.xml}.
\item formatting in \latex;: \urlanch{mui.ltx}.
\item \abbr{DVI} made from \latex;: \urlanch{mui.dvi}.
\item \abbr{PDF} made from \latex;: \urlanch{mui.pdf}.
\item formatting in \abbr{HTML}: \urlanch{mui.html}.
\end{menu}
\end{document}