I hate all “lightweight” markup languages
As a part-time tech writer for projects such as LTTng, Babeltrace, and barectf, I've had the chance (or misfortune) to work several hundred hours with different “lightweight” markup languages, in particular:
- AsciiDoc: the original and Asciidoctor
- DokuWiki
- Doxygen
- Markdown: GFM, Markdown Extra, and kramdown
- MediaWiki
- reStructuredText
- Textile
This post shows the state of the art and provides my informed opinion on the markup languages I've worked with the most.
tl;dr: I hate all of them.
Or rather, I don't fully appreciate any of them: when one of them doesn't look like a tarmac tragedy, it's simply incomplete.
That being said, I use many of them almost daily.
In my opinion, only DocBook and HTML provide all the semantic elements that I need to properly document software. But those are somewhat heavy to write manually: I much prefer writing AsciiDoc or Markdown.
Keep in mind that my output targets are HTML and sometimes groff manual pages (which must have their HTML equivalents): I don't really care about PDF, for example.
Here's a DocBook example:
<screen>
<prompt>$</prompt> <command>cd</command> <filename>/usr/lib/pkgconfig</filename>
<prompt>$</prompt> <command>ls</command> *zlib* -la
<computeroutput>-rw-r--r-- 1 root root 292 Jan 2 06:17 zlib-ng.pc
-rw-r--r-- 1 root root 252 May 2 2024 zlib.pc
</computeroutput>
<prompt>$</prompt> <command>sudo</command> -i
<prompt>#</prompt> <command>rm</command> <filename>zlib-ng.pc</filename>
</screen>
Imagine the rich output that's possible thanks to so much semantics at the source! In HTML/CSS, we can, for example, make the prompts nonselectable so that you only copy the command lines.
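To make the nonselectable prompt idea concrete, here's a minimal sketch of how part of the DocBook above might end up as HTML/CSS. The class names (screen, prompt, command, filename) are my own assumptions for illustration, not the output of any particular DocBook toolchain:

```html
<!-- Hypothetical HTML rendering of part of the <screen> above;
     the class names are assumptions, not real toolchain output. -->
<style>
  /* Prompts stay visible, but browsers that honour user-select
     skip them when you select and copy the text. */
  .prompt {
    -webkit-user-select: none;
    user-select: none;
  }
</style>
<pre class="screen"><span class="prompt">$ </span><kbd class="command">cd</kbd> <span class="filename">/usr/lib/pkgconfig</span>
<span class="prompt"># </span><kbd class="command">rm</kbd> <span class="filename">zlib-ng.pc</span></pre>
```

This only works because the source says which characters are a prompt: with a plain-text code block, a stylesheet has nothing to hook onto.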
Unfortunately, I don't know of any markup language that can produce this DocBook with a lighter syntax. We have to rely on hacks to deduce semantics from hints placed within the plain text itself.
Throughout this article, you'll notice that I'm extremely uncompromising when it comes to features that are almost never useful, such as adding a code snippet to a definition list item.
On the other hand, why not? What irritates me the most is that most of the languages discussed here could be much more flexible and powerful, but somehow they choose not to be.
As a tech writer, the most important features for me are:
- Lists
I want to be able to put anything within a list item, including other lists, tables, images, code blocks, and the rest.
A very important type of list which I couldn't live without is the definition list. This list associates one or more “terms” to a “definition”. I use quotes here because you can really use a definition list for anything matching multiple associations of multiple keys to a value. In a way, it's halfway between a two-column table and an unordered list with the terms in bold (for example). This is <dl> in HTML and <variablelist> in DocBook.
- Code blocks
A critical feature for software documentation.
It must be possible to specify the programming language (if any) to enable syntax highlighting where available.
A nice-to-have here is a callout feature, with which you can place little numbers within the code block and then refer to them in subsequent prose.
- Tables
Tech writing usually involves a lot of tables. Like with list items, I want to be able to put anything within a table cell: lists, images, code blocks (rare but why not), and more.
Furthermore, I need to be able to specify a horizontal (top) header, and ideally a vertical (left) header too when needed.
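As a reference point, here's roughly what those requirements look like in plain HTML: a definition list with two terms sharing one definition that contains a language-tagged code block, followed by a table with both a top header and a left header. It's only a sketch: the terms, the C line, and the table content are made up for illustration, and the language-c class is just a common convention for hinting the language to a syntax highlighter.

```html
<!-- Illustrative only: the terms, code, and table content are made up. -->
<dl>
  <dt><code>--verbose</code></dt>
  <dt><code>-v</code></dt>
  <dd>
    <p>Two terms sharing one definition, which itself holds a code block:</p>
    <pre><code class="language-c">int verbose = 1; /* hypothetical flag */</code></pre>
  </dd>
</dl>

<table>
  <thead>
    <tr>
      <td></td>
      <th scope="col">Lists in cells</th>
      <th scope="col">Row headers</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <!-- scope="row" marks the left (vertical) header cell -->
      <th scope="row">HTML</th>
      <td><ul><li>Yes</li></ul></td>
      <td>Yes</td>
    </tr>
  </tbody>
</table>
```

Nothing here is exotic, which is the point: the question for each “lightweight” language is how much of this it can express without falling back to raw HTML.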
Let's explore what the open-source community has to offer as of this writing.
AsciiDoc
Incontestably my favorite, AsciiDoc is the most featureful “lightweight” markup language out of the box: you can write a complete book in AsciiDoc.
AsciiDoc was born in 2002 with its official Python 2 processor. Although the original version is now pretty much abandoned, some people decided to port it to Python 3, and it seems to be alive again (the latest release was in July 2024 as of this writing). It also seems to aim to be a processor implementation which honours the somewhat new AsciiDoc standard (more about this below).
The Python AsciiDoc legacy remains, however: many projects still depend on it to document their various parts. Amongst other things, it's really easy to write beautiful, real manual pages with the Python AsciiDoc flavor. Another important advantage is that it supports macros with its own (“interesting”) language.
The “new” flavor is the Ruby-written Asciidoctor processor. Its team, most notably Dan Allen, is fully dedicated to the AsciiDoc cause. They even created an Eclipse working group (WG) to try to standardize the language. The WG has famous members such as Red Hat and VMware. As a power user of the language, I was actually excited about this opportunity to fix the major pain points of AsciiDoc, but I was too early in the WG creation process and I kind of forgot about it afterwards. Looking at the AsciiDoc Language Documentation of Asciidoctor now, which seems to act as the official AsciiDoc documentation, I can see that nothing really changed 😢.
The following sections show the power of AsciiDoc as well as its occasional unsightliness.
Macros 😀
Macros are actually what made me switch from Markdown to AsciiDoc.
I'm specifically talking about the macro system of the Python processor here. The one of Asciidoctor seems just as powerful, but it requires writing extensions in Ruby, which, unfortunately, is a language I don't speak.
Attributes 😀
Meow mix.
List continuation 🙁
Meow mix.
Where's the lamb sauce? 😠
Meow mix.
A note about GitHub rendering
Meow mix.