
Stop letting your agents write Markdown.
The discussion revolves around the limitations of Markdown and the increasing effectiveness of HTML, particularly when working with AI agents. While Markdown has been a popular choice for nearly two decades due to its simplicity, portability, and rich text capabilities, there's a growing sentiment that it's being overused and becoming a restrictive format as AI agents become more powerful.
Thariq from the Claude Code team published articles titled "The Unreasonable Effectiveness of HTML" and "HTML is the New Markdown," advocating a shift from Markdown to HTML for almost all agent-generated content. He notes that he has stopped writing Markdown files and now uses Claude Code to generate HTML. This perspective is echoed by others, including Karpathy, who suggests asking AI models to structure responses as HTML.
The main argument for HTML's effectiveness lies in its ability to convey much richer information compared to Markdown. HTML can support complex structures like tables, design elements with CSS, illustrations with SVGs, code snippets with script tags, interactive components with JavaScript, workflows, spatial data using absolute positions and canvases, and images. This makes HTML a highly efficient way for models to communicate in-depth information and for users to interact with it.
One significant advantage of HTML is its superior information density. A single HTML file can integrate various types of information that are difficult or impossible to include effectively in Markdown. For instance, models often hallucinate Base64 encodings or image URLs when writing Markdown, which makes image integration unreliable. HTML, by contrast, supports images natively through image tags inside a single self-contained file.
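One way to avoid hallucinated encodings is to have the agent run a small script that encodes the real image bytes, rather than typing the Base64 text itself. The following is a minimal sketch under that assumption; the function name `image_to_img_tag` and the stand-in image data are hypothetical and not taken from the source.

```python
import base64
import html


def image_to_img_tag(image_bytes: bytes, mime_type: str, alt_text: str) -> str:
    """Return an <img> tag whose src is a data URI built from the real bytes."""
    # The encoding is computed here, so it cannot be invented by the model.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    safe_alt = html.escape(alt_text, quote=True)
    return f'<img src="data:{mime_type};base64,{encoded}" alt="{safe_alt}">'


# The 8-byte PNG file signature, used only as stand-in data for the sketch.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
tag = image_to_img_tag(PNG_SIGNATURE, "image/png", "architecture diagram")
```

In practice the bytes would come from a file on disk, for example `Path("diagram.png").read_bytes()`, and the resulting tag would be pasted into the generated HTML document.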
Visual clarity and ease of reading are also key benefits. As AI agents produce more complex and lengthy outputs, such as specs and plans, Markdown files exceeding 100 lines become difficult to read and share. HTML documents, by contrast, can be visually organized with tabs, illustrations, and links, making them easier to navigate and digest. They can even be mobile-responsive, adapting to different form factors.
Sharing HTML files is also more straightforward. Unlike Markdown files, which often need to be attached to emails or messages, HTML files can be uploaded (e.g., to S3) and shared via a simple link, allowing colleagues to access and reference them easily from any browser.
The interactive potential of HTML is another strong point. Users can ask agents to include interactive elements like sliders or knobs to adjust designs or tweak algorithm options, allowing for real-time experimentation. These changes can then be copied back into a prompt for further iteration with the agent. This two-way interaction creates a more dynamic and engaging user experience.
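As an illustration of that loop, here is a hedged sketch of a generator for a one-file page with a single slider and a "Copy as prompt" button. Everything in it (the function name `build_tuning_page`, the parameter names, the wording of the copied sentence) is an assumption made for the example, not something the source specifies.

```python
import html


def build_tuning_page(title: str, param: str, low: int, high: int, start: int) -> str:
    """Return a single-file HTML page with one slider and a copy-as-prompt button."""
    safe_title = html.escape(title)
    safe_param = html.escape(param, quote=True)
    # Braces belonging to JavaScript are doubled because this is an f-string.
    return f"""<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>{safe_title}</title></head>
<body>
  <h1>{safe_title}</h1>
  <label for="knob">{safe_param}</label>
  <input id="knob" type="range" min="{low}" max="{high}" value="{start}"
         data-param="{safe_param}">
  <output id="readout">{start}</output>
  <button id="copy">Copy as prompt</button>
  <script>
    const knob = document.getElementById("knob");
    const readout = document.getElementById("readout");
    // Mirror the slider position so the reader sees the live value.
    knob.addEventListener("input", () => {{ readout.textContent = knob.value; }});
    // Put a sentence describing the chosen value on the clipboard.
    // Clipboard access generally requires a secure context in the browser.
    document.getElementById("copy").addEventListener("click", () => {{
      const text = "Set " + knob.dataset.param + " to " + knob.value + ".";
      navigator.clipboard.writeText(text);
    }});
  </script>
</body>
</html>"""


page = build_tuning_page("Blur radius tuning", "blur_radius", 0, 20, 4)
```

Writing `page` to a file and opening it in a browser gives a slider whose chosen value can be pasted straight back into the next prompt.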
Claude Code is highlighted as a powerful tool for generating HTML files due to its ability to access and synthesize information from various contexts. It can read through code folders, categorize files, and generate diagrams, as seen in the examples provided. Beyond the file system, Claude Code can leverage context from communication platforms like Slack, project management tools like Linear, web browsers, and Git history. This extensive context allows agents to produce highly relevant and informative HTML outputs.
The "joyful" aspect of creating HTML documents with Claude is also mentioned. Being more engaged and invested in the creation process leads to better results and more effective steering of the agent, which is a significant, albeit subjective, benefit. This aligns with the idea that tools that excite developers and encourage involvement can lead to better outcomes.
For those starting with HTML generation, it's recommended to begin by simply asking the model to create an HTML file or artifact. Over time, as users understand what works and what doesn't, they can develop more sophisticated "skills" or prompts. The initial focus should be on understanding how the model responds to different scenarios and what information needs to be included.
HTML is particularly useful for specs, planning, and exploration. Instead of generating a simple Markdown plan, an agent can create a "web" of HTML files, brainstorming different options, expanding on selected ideas with mockups or code snippets, and finally producing a detailed implementation plan. These HTML files can then be passed to a new agent session for implementation, providing comprehensive context.
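The "web" of files described above can be as simple as one page per option plus an index that links them. The sketch below shows that layout under stated assumptions: the function name `write_plan_web`, the `option-N.html` naming scheme, and the sample options are all hypothetical.

```python
import html
import tempfile
from pathlib import Path


def write_plan_web(out_dir: Path, options: dict[str, str]) -> Path:
    """Write one page per option plus an index.html that links to each of them."""
    out_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for number, (name, summary) in enumerate(options.items(), start=1):
        filename = f"option-{number}.html"
        page = (
            '<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">'
            f"<title>{html.escape(name)}</title></head><body>"
            f"<h1>{html.escape(name)}</h1><p>{html.escape(summary)}</p>"
            '<p><a href="index.html">Back to all options</a></p></body></html>'
        )
        (out_dir / filename).write_text(page, encoding="utf-8")
        links.append(f'<li><a href="{filename}">{html.escape(name)}</a></li>')
    index = (
        '<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">'
        "<title>Options</title></head><body><h1>Options</h1>"
        f"<ul>{''.join(links)}</ul></body></html>"
    )
    index_path = out_dir / "index.html"
    index_path.write_text(index, encoding="utf-8")
    return index_path


# Example run in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    created = write_plan_web(
        Path(tmp),
        {"Queue-based": "Push jobs to a queue.", "Cron-based": "Poll on a schedule."},
    )
    index_html = created.read_text(encoding="utf-8")
```

A folder laid out this way can be handed to a new agent session as a unit, since the index gives it a single entry point.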
HTML also enhances verification processes. A verification agent can read these rich HTML files, gaining a much broader context for what needs to be verified, leading to more thorough checks.
For code review and understanding, HTML offers advantages over Markdown, especially for rendering diffs, annotations, flowcharts, and modules. While Markdown can display syntax-highlighted code blocks, HTML can provide a more visually organized and interactive view of code changes, potentially surpassing the default GitHub diff view. The concept of "Devin Review," which reorganizes code review based on importance and related changes, aligns with the potential of HTML to create a better hierarchy for code understanding.
Designs and prototypes benefit greatly from HTML's expressiveness. Claude can sketch designs in HTML, even if the final product is in another framework like React or Swift. It can also prototype interactions like animations and actions, allowing users to fine-tune designs with interactive elements.
Reports, research, and learning are another strong use case. Claude Code can synthesize information from multiple data sources (Slack, codebase, Git history, internet) to generate highly readable reports for various audiences, from team members to leadership. This is presented as a "crazy hack" for career growth, allowing individuals to present polished, informative HTML reports that make leaders feel more informed.
Custom editing interfaces can also be created using HTML. For complex data or scenarios where a simple text box is insufficient, an agent can build a "throwaway editor" in a single HTML file, purpose-built for a specific piece of data. This challenges the traditional notion that code should only be written for reusable components, emphasizing that cheap, one-off code can be highly valuable for making better decisions. These custom UIs should ideally include export functions (e.g., "copy as JSON" or "copy as prompt") to easily transfer data back to the agent.
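A throwaway editor with an export button can fit in a few dozen lines. The sketch below is one possible shape, assuming the data is JSON-serializable; the function name `build_json_editor` and the element ids are hypothetical choices for the example.

```python
import html
import json


def build_json_editor(data: object, title: str = "Throwaway editor") -> str:
    """Return a one-file HTML editor seeded with `data` and a Copy as JSON button."""
    # Escape "</" so a string value cannot close the <script> tag early.
    seed = json.dumps(data, indent=2).replace("</", "<\\/")
    safe_title = html.escape(title)
    # Braces belonging to JavaScript are doubled because this is an f-string.
    return f"""<!DOCTYPE html>
<html lang="en">
<head><meta charset="utf-8"><title>{safe_title}</title></head>
<body>
  <h1>{safe_title}</h1>
  <textarea id="editor" rows="20" cols="80"></textarea>
  <button id="copy">Copy as JSON</button>
  <p id="status-line"></p>
  <script id="seed" type="application/json">{seed}</script>
  <script>
    const editorBox = document.getElementById("editor");
    const statusLine = document.getElementById("status-line");
    const seedData = JSON.parse(document.getElementById("seed").textContent);
    editorBox.value = JSON.stringify(seedData, null, 2);
    document.getElementById("copy").addEventListener("click", async () => {{
      try {{
        // Re-parse first so only valid JSON reaches the clipboard.
        const cleaned = JSON.stringify(JSON.parse(editorBox.value), null, 2);
        await navigator.clipboard.writeText(cleaned);
        statusLine.textContent = "Copied.";
      }} catch (err) {{
        statusLine.textContent = "Not copied: " + err.message;
      }}
    }});
  </script>
</body>
</html>"""


editor_page = build_json_editor({"retries": 3, "note": "</script>"})
```

A purpose-built editor would replace the plain textarea with controls suited to the data, but the export path (validate, then copy) stays the same.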
Regarding concerns about token efficiency, while HTML might use more tokens than Markdown, the increased expressiveness and higher likelihood of user engagement often lead to better overall outputs. With large context windows, the increased token usage becomes less noticeable.
The question of when to use Markdown is addressed, with some, like Thariq, claiming to have almost entirely moved away from it. However, this is seen as an extreme "HTML maximalist" position, with many still finding value in Markdown for certain simple tasks.
Version control is identified as a significant downside of HTML: HTML diffs are typically noisy and harder to review than Markdown diffs.
A separate concern is visual consistency across outputs. To address it, users can point Claude at their codebase to create a design system HTML file, which later HTML outputs can then reference.
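For the version-control downside noted above, one possible mitigation (an assumption of this example, not something the source proposes) is to review a diff of the visible text instead of the raw markup. The sketch below uses only the Python standard library; the names `VisibleText` and `text_diff` are hypothetical.

```python
import difflib
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.lines.append(text)


def text_diff(old_html: str, new_html: str) -> list[str]:
    """Return a unified diff of the visible text of two HTML documents."""

    def extract(document: str) -> list[str]:
        parser = VisibleText()
        parser.feed(document)
        parser.close()
        return parser.lines

    return list(
        difflib.unified_diff(
            extract(old_html), extract(new_html), "old", "new", lineterm=""
        )
    )


old_page = '<div class="a"><p>Use Redis</p><style>p{color:red}</style></div>'
new_page = '<section id="x"><p>Use Postgres</p><style>p{color:blue}</style></section>'
changes = text_diff(old_page, new_page)
```

This deliberately discards markup and styling changes, so it complements a normal diff rather than replacing it.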
Karpathy further emphasizes the shift, stating that asking an LLM to structure responses as HTML and viewing them in a browser works "really well." He also notes success with slideshows. His broader view is that while audio is the human-preferred input for AIs, vision (images, animations, video) is the preferred output. He hypothesizes a progression from raw text to Markdown, then to HTML as a "forming new good default," eventually leading to interactive neural videos and simulations where the UI is streamed live directly from a model, without traditional HTML or layout engines.
However, Karpathy also points out that input methods need to improve, suggesting a need for the ability to point and gesture on screen, similar to interacting with a person. This highlights the current limitations of text-only or audio-only inputs.
The overall sentiment is that the "input and output mind meld between humans and AIs is ongoing," and there's significant progress to be made in output formats and interfaces. HTML is presented as a crucial starting point in this evolution, enabling more effective software development through better output formats, UIs, and greater control and customizability. While interactive videos and simulations are a long-term vision, immediate steps involve leveraging HTML's capabilities.