
Markdown is a terrible language
AI Summary
The speaker, a long-time advocate for Markdown, expresses growing disillusionment with its widespread use and inherent limitations, arguing that it has become a "Frankenstein's monster" that fails at both its intended purpose and the programming-like tasks it's increasingly forced to handle.
Initially, the speaker was a staunch supporter of Markdown, even to the point of submitting Markdown resumes for job applications. They admired its simplicity and reliability, crediting John Gruber and Aaron Swartz for creating a "magical" and simple way to write content that renders well. However, the article that sparked this discussion, "Why the Heck Are We Still Using Markdown?", challenged this view, and the speaker now agrees that Markdown has been overextended, especially with the introduction of new file extensions for embedding content and its current role in communication between LLMs and agents.
The core of the critique lies in Markdown's fundamental design, which the speaker contrasts with its current complex usage. Markdown's strength, according to the speaker, was its minimalism and legibility, designed for the simple task of converting plain text into HTML. Its learning curve was virtually nonexistent, and like C, the output was predictable from the input.
However, the speaker highlights significant issues:
**1. Ambiguous and Inconsistent Syntax:** Markdown suffers from unclear specifications, leading to feature creep and multiple ways to achieve the same output. For instance, bold text alone can be written at least three ways (`**bold**`, `__bold__`, or inline `<b>bold</b>`), italics have the same duplication, and the inconsistency extends to other elements like headings and lists. This ambiguity makes parsing difficult and error-prone.
**2. The "Bad" of Markdown:** The speaker argues that the language itself is flawed. The existence of multiple syntaxes for the same element, such as headers (ATX-style `#` prefixes vs. Setext-style `===`/`---` underlines) and bold/italics, demonstrates a lack of clear direction. Different inputs can produce identical output, and conversely, slight variations in input can be interpreted in wildly different ways by different parsers.
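The "different inputs, identical output" point can be illustrated with a toy renderer. This is only a sketch of a tiny, hypothetical Markdown subset (the names `toy_render` and `inline` are invented here and come from no real library); it exists to show two visibly different source documents collapsing to the same HTML.

```python
import re


def inline(text: str) -> str:
    """Convert both bold spellings to the same <strong> element."""
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"__(.+?)__", r"<strong>\1</strong>", text)
    return text


def toy_render(src: str) -> str:
    """Render a tiny, hypothetical subset of Markdown to HTML."""
    lines = src.strip().split("\n")
    out = []
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        nxt = lines[i + 1].strip() if i + 1 < len(lines) else ""
        atx = re.match(r"(#{1,6})\s+(.+)", line)
        if not line:
            i += 1  # skip blank lines
        elif atx:
            # ATX heading: the number of leading '#' gives the level
            level = len(atx.group(1))
            out.append(f"<h{level}>{inline(atx.group(2))}</h{level}>")
            i += 1
        elif re.fullmatch(r"=+", nxt) or re.fullmatch(r"-+", nxt):
            # Setext heading: the *next* line decides what this line means
            level = 1 if nxt.startswith("=") else 2
            out.append(f"<h{level}>{inline(line)}</h{level}>")
            i += 2  # consume the underline as well
        else:
            out.append(f"<p>{inline(line)}</p>")
            i += 1
    return "\n".join(out)


atx_style = "# Title\n\nThis is **bold**."
setext_style = "Title\n=====\n\nThis is __bold__."

# Two different source texts, one rendered result.
print(toy_render(atx_style) == toy_render(setext_style))
```

Even in this toy, the Setext branch has to look one line ahead before it knows whether the current line is a heading or a paragraph, which hints at why real parsers disagree on edge cases.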
**3. Security Vulnerabilities:** The flexibility of Markdown, particularly its allowance of inline HTML and other embedded content, creates significant security risks. The speaker points to ReDoS (regular expression denial of service) vulnerabilities, where specially crafted inputs trigger catastrophic backtracking in a parser's regular expressions and stall it, and cross-site scripting (XSS) vulnerabilities, which are recurring problems across major Markdown implementations. The speaker attributes the ease of exploiting these vulnerabilities to the complexity introduced by extending Markdown beyond its original scope.
**4. Inline HTML as a Symptom:** The integration of inline HTML into Markdown is seen as a critical flaw. The speaker argues that if one needs to embed complex HTML, they might as well start with HTML directly. This practice, while allowing for more elaborate formatting and functionality, defeats Markdown's purpose of simplicity and introduces the need for robust HTML parsing alongside Markdown parsing, significantly increasing complexity and the attack surface.
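To make the inline-HTML risk concrete, here is a minimal sketch using only Python's standard library. Both renderer functions are hypothetical and handle nothing but `**bold**`; the escaping variant is the bluntest possible mitigation, shown for contrast, and is not how any particular Markdown implementation works.

```python
import html
import re


def render_passthrough(src: str) -> str:
    """Toy renderer that lets inline HTML through untouched."""
    body = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", src)
    return f"<p>{body}</p>"


def render_escaped(src: str) -> str:
    """Same toy renderer, but all raw HTML is escaped before rendering."""
    return render_passthrough(html.escape(src))


# Untrusted input mixing ordinary Markdown with an XSS payload.
untrusted = 'Hello **world** <img src=x onerror="alert(1)">'

print(render_passthrough(untrusted))  # the <img> tag survives intact
print(render_escaped(untrusted))      # the tag is neutralised to &lt;img ...&gt;
```

A renderer that wants to keep some inline HTML cannot escape everything like this; it needs an HTML parser plus an allow-list sanitizer, which is exactly the added complexity and attack surface described above.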
**5. The "Cursed" Nature of Modern Markdown:** The speaker uses the term "cursed" to describe the convoluted and often insecure ways developers are forced to use Markdown for tasks it wasn't designed for. This includes embedding images within links, complex HTML structures, and even attempts to incorporate asynchronous JavaScript execution within the markup. The speaker's own blog is cited as an example of this "cursed" usage.
**6. Legacy and Obscure Syntax:** Markdown's origins in plain text conventions of email and Usenet have left it with legacy syntax that can be obscure and problematic. The speaker points to the historical use of colons for quoting and pipes for vertical separation as influences that, while historically relevant, don't translate well into modern, robust markup.
**7. Context-Sensitive Grammar:** The speaker delves into the technical aspect of parsing, explaining how features like footnotes transform Markdown from a context-free grammar (CFG) into a context-sensitive grammar (CSG). This means the meaning of a token can depend on other parts of the document, making parsing significantly more complex and moving it away from simple, predictable conversion.
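The footnote problem can be sketched as a two-pass resolver. This is a simplified illustration, not any real parser's algorithm: the function name, the regular expressions, and the output HTML are assumptions, as is the rule that a reference with no matching definition is left as literal text.

```python
import re

# A definition line such as "[^1]: Some text", anywhere in the document.
DEF_RE = re.compile(r"^\[\^([^\]\s]+)\]:[ \t]*(.*)$", re.MULTILINE)
# A reference such as "[^1]" inside running text.
REF_RE = re.compile(r"\[\^([^\]\s]+)\]")


def resolve_footnotes(src: str) -> str:
    """Resolve footnote references in two passes over the source."""
    # Pass 1: collect every definition, wherever it appears.
    defs = {m.group(1): m.group(2) for m in DEF_RE.finditer(src)}
    body = DEF_RE.sub("", src).strip()

    # Pass 2: a reference becomes a link only if its label was defined.
    def replace(match: re.Match) -> str:
        label = match.group(1)
        if label in defs:
            return f'<sup><a href="#fn-{label}">{label}</a></sup>'
        return match.group(0)  # undefined label: keep the literal text

    return REF_RE.sub(replace, body)


with_definition = "See the note[^1].\n\n[^1]: Defined at the end."
without_definition = "See the note[^1]."

print(resolve_footnotes(with_definition))
print(resolve_footnotes(without_definition))
```

The same token, `[^1]`, renders as a link in the first document and as plain text in the second. The renderer cannot decide what it means until it has seen the rest of the file, which is the dependence on distant context that the speaker describes.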
**8. The Escalation of Complexity:** The speaker illustrates how the desire for more features transforms Markdown from a simple transliterator to a full-blown compiler. What starts with a need for footnotes escalates to requiring custom callouts, math typesetting, dependency graphs, and complex CSS management, turning a simple tool into a system requiring sophisticated build processes. This is contrasted with the simplicity of Notion's use of Markdown as hotkeys, which is deemed a more appropriate application of familiar syntax.
**9. Alternative Solutions and Their Flaws:** The speaker briefly touches upon potential alternatives like plain text, reStructuredText, and MDX, but finds them lacking. Plain text is not universally understandable, reStructuredText is difficult to write, and MDX is seen as too focused on being HTML. A significant missing element in all these solutions is a proper build system.
**10. The Need for a Better Solution:** The speaker concludes that the current state of Markdown is unsustainable. The ideal solution would involve a custom-built tool with a sane, unambiguous, and legible syntax, purpose-built for its intended use. This tool should avoid inline HTML, allow for well-defined shortcodes and functions, and support compile-time hooks with appropriate constraints. The speaker advocates for letting go of Markdown and seeking answers elsewhere, emphasizing the need for a trivially parsable grammar and a robust build system. The speaker also references Jeff Atwood's 2012 plea to John Gruber to standardize or evolve Markdown, noting that this did not happen.
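As a rough illustration of what "no inline HTML, well-defined shortcodes" could look like, the sketch below invents a `{{name argument}}` syntax and a small registry. Everything here (the syntax, the shortcode names, the `render` function) is hypothetical and is not the design the speaker or the article proposes; it only shows that such a grammar can be matched with a single pattern and can fail loudly at build time.

```python
import html
import re

# Hypothetical shortcode syntax: {{name argument}}
SHORTCODE_RE = re.compile(r"\{\{(\w+)\s+([^{}]*)\}\}")

# Registry of allowed shortcodes; anything else is a build error.
SHORTCODES = {
    "kbd": lambda arg: f"<kbd>{arg}</kbd>",
    "note": lambda arg: f'<aside class="note">{arg}</aside>',
}


def render(src: str) -> str:
    """Escape all raw HTML, then expand only registered shortcodes."""
    escaped = html.escape(src, quote=False)  # no inline HTML gets through

    def expand(match: re.Match) -> str:
        name, arg = match.group(1), match.group(2).strip()
        if name not in SHORTCODES:
            raise ValueError(f"unknown shortcode: {name}")
        return SHORTCODES[name](arg)

    return SHORTCODE_RE.sub(expand, escaped)


print(render("Press {{kbd Ctrl+C}} <b>now</b>"))
```

Because the only way to produce markup is through the registry, the set of possible outputs is known in advance, and an unknown shortcode raises an error instead of being silently passed through.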
Ultimately, the speaker expresses a newfound appreciation for how deeply flawed Markdown is, despite using it daily. The article has pushed them "over the edge" to recognize its pervasive issues.