# The Challenge of Translating LaTeX to HTML

#### March 28, 2014

The idea that LaTeX documents, as they have been found in
circulation during the period 1985–2014, could be
translated to a formally structured SGML document type is,
in a certain sense, folly, just as it has been folly to
imagine that more than 5% of the HTML documents in
circulation during the period 1995–2014 are formally
correct. Beyond that, the problem of translating LaTeX is
compounded by the fact that the principal LaTeX engine
implements LaTeX, the language, as a macro package under
TeX, which is a Turing-complete programming language. So
far in the development of LaTeX there has been no reliably
enforced boundary between LaTeX and TeX. In view of all
this it is remarkable, even astounding, that any translation
project has had a substantial degree of success.

That said, because there is such a large legacy of
documents written in LaTeX source, a number of valiant
projects have been mounted since the late 1990s to
translate LaTeX to HTML. Because
math is an important part of LaTeX, one wants an automated
translation of LaTeX to HTML to include provision for math,
i.e., to generate MathML for math. Even where MathJax will
be used to facilitate web browser rendering of math, and even
though MathJax will accept LaTeX-like input (not actual
LaTeX but close), the providers of MathJax have clearly
stated that automated translations should use MathML.
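
To make the target concrete, here is a minimal sketch of the MathML a translator would be expected to emit for the LaTeX fragment `$x^2$`. The function name `superscript_to_mathml` is hypothetical and handles only this one construct; it is an illustration of the output format, not a description of how any actual translator works.

```python
import xml.etree.ElementTree as ET

# Namespace URI defined by the MathML specification.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def superscript_to_mathml(base: str, exponent: str) -> str:
    """Build presentation MathML for base^exponent (hypothetical helper).

    Assumes the base is an identifier (<mi>) and the exponent a number (<mn>).
    """
    math = ET.Element("math", {"xmlns": MATHML_NS})
    msup = ET.SubElement(math, "msup")
    ET.SubElement(msup, "mi").text = base
    ET.SubElement(msup, "mn").text = exponent
    return ET.tostring(math, encoding="unicode")


# MathML counterpart of the LaTeX input $x^2$
mathml = superscript_to_mathml("x", "2")
print(mathml)
```

Even this trivial case shows why the translation is not a simple textual substitution: the translator must decide that `x` is an identifier and `2` a number before it can choose the element names.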

I have found two of these translators useful when faced
with the task of translating legacy LaTeX to HTML:

- LaTeXML
- TeX4ht

In most cases such documents come
from *arXiv*, and I have
gathered some examples from *arXiv* that were handled
by both of these translators.

No translator that I know of has a success rate that
is sufficient for fully automated translation of arbitrary
LaTeX documents to HTML.

A high degree of reliability may be had, however,
by using profiled LaTeX, that is, LaTeX restricted to a
defined set of markup, and configuring the translator
to accommodate the profile.
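
As a rough illustration of what working to a profile might involve, the following sketch checks a LaTeX source against a whitelist of control sequences before it is handed to a translator. Everything here is an assumption made for the example: the names `ALLOWED_COMMANDS`, `FORBIDDEN_PRIMITIVES`, and `check_profile`, and the particular commands listed, are hypothetical and do not come from any published profile. The scan is deliberately naive (it ignores verbatim environments, for instance) and is not a TeX parser.

```python
import re

# Hypothetical profile: the only control sequences a document may use.
ALLOWED_COMMANDS = {
    "documentclass", "begin", "end", "section", "emph", "frac", "label", "ref",
}

# A few TeX-level primitives that reach beneath LaTeX; reported separately.
FORBIDDEN_PRIMITIVES = {"def", "let", "catcode", "expandafter", "csname"}

# Matches a control word (\name) or a control symbol (\\, \%, ...).
# Group 1 is set only for control words.
CONTROL_SEQUENCE = re.compile(r"\\(?:([A-Za-z@]+)|.)")

# A '%' not preceded by a backslash starts a comment (rough approximation).
COMMENT = re.compile(r"(?<!\\)%.*")


def check_profile(source: str):
    """Return (line_number, command, reason) for each profile violation."""
    violations = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        line = COMMENT.sub("", line)
        for match in CONTROL_SEQUENCE.finditer(line):
            name = match.group(1)
            if name is None:
                continue  # control symbol such as \\ or \%; ignored here
            if name in FORBIDDEN_PRIMITIVES:
                violations.append((lineno, name, "TeX primitive"))
            elif name not in ALLOWED_COMMANDS:
                violations.append((lineno, name, "not in profile"))
    return violations


sample = r"""\documentclass{article}
\def\foo{bar}  % a raw TeX definition
\begin{document}
\section{Intro} \emph{Hello} \foo
\end{document}
"""

violations = check_profile(sample)
for lineno, name, reason in violations:
    print(f"line {lineno}: \\{name} ({reason})")
```

A document that passes such a check uses only markup the translator has been configured to handle; one that fails is either rewritten or the profile and translator configuration are extended together.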

For my documents since the fall of 1998 I have been using
my own project, “Generalized Extensible LaTeX-Like Markup (GELLMU)”,
which, however, does not attempt to provide translation for LaTeX
documents but rather provides a didactic formalized LaTeX
profile. For more on formally profiled LaTeX see my talk at TUG 2010 on
LaTeX Profiles.