Markdown is for reading, but HTML is for doing; this shift marks the transition of AI from a mere chatbot to a functional interface layer. It is a necessary evolution that prioritizes utility and interactivity over the nostalgic simplicity of plain text.
Deep Dive
HTML vs. Markdown: Pushing AI Agent Responses to the Next Level
So, some of you may have seen this article on X the other day. This is from Thariq from Anthropic, and he talked about the effectiveness of HTML versus Markdown, which I'm sure we're all used to. Agents tend to use Markdown a lot for reports and for configuration files. Markdown has been the go-to default for agents because it's so easy to read and write, but he's arguing in this X article that HTML is actually a better format. It's a pretty interesting article, and it sparked a bit of a debate in the AI community.
I thought it was interesting, especially because if you watch my videos, you see a lot of my slides, and they're all HTML files. So I'm pretty familiar with using HTML in Claude Code and how effective that can be, and I wanted to do a video about it. In this video, I'm going to break down what Thariq was talking about in his article, a tweet from Karpathy today backing it up, and some of the counterarguments against using HTML over Markdown. Then we're going to take a real look at it. I'm going to give Claude Code a somewhat complicated task, writing a report for me. We'll see how it looks in Markdown and then how it looks in HTML, so you can see the difference visually.
And if you like this video, please subscribe to my free weekly newsletter where I give my honest thoughts about the week's AI news that I can't share here, as well as interesting research and papers that I found and what projects I've been building behind the scenes. There's a link in the description or go to onchain.com.
Every word is written by me, mistakes and all. So, if you're sick of AI slop articles or all hype with no substance, subscribe and give AI Garage Weekly a shot. Now, back to the video. The main thesis from Thariq is pretty simple.
He's saying that as models get more capable, the format we ask them to render in is becoming the bottleneck.
But first, let me give you a breakdown of what Markdown is. It won the default output slot for genuinely good reasons. The syntax is plain text, so it round-trips through any tool, any pipe, any terminal. You can use it in any kind of editor. It supports headers, bullets, code blocks, tables, and links.
And it has enough rich text capability to handle 80% of agent output. It's spell-checkable, diff-friendly in Git, and trivial for humans to hand-edit. I'm sure most of us have edited some Markdown files in VS Code or elsewhere.
So there's nothing here that's actually wrong. Markdown is perfectly fine. The argument he's making isn't that Markdown is bad. It's just that we've outgrown it and there's a better option.
So the framing of the whole piece was that Markdown started as an enabling format and has now become a restricting format. Here you can see the quote he leads with: "I find it difficult to read a Markdown file of more than 100 lines. I want richer visualizations, colors and diagrams. I want to be able to share them easily." So this is a practical problem. The models we have now are happy to generate 500-line specs very quickly.
But as humans, it's not so easy to read that much text. So argument one is just information density. HTML's vocabulary is enormously wider than Markdown's.
You get tables, design, illustrations, code, interactions, spatial layouts, images, video, embedded scripts. Thariq's strong claim is that almost no information Claude can read is something it can't represent in HTML. Markdown caps out somewhere around bullet lists and fenced code, and when you constrain a capable model to a thin format, it improvises around the constraint. So Claude has gotten genuinely good at ASCII diagrams, at building little tree structures with box-drawing characters, even at approximating colors with shaded Unicode block characters.
That's an impressive workaround for a problem we don't need to have. And when you're using HTML, you can use these different elements here: inputs, CSS, image tags, positioning. There are a lot of different ways to format and display the same data.
Argument two that he made is visual clarity, and this is the readability argument. As the model writes longer plans, specs, and analyses, a wall of Markdown stops being scannable. Thariq is candid that he himself doesn't read past 100 lines. I'll admit that I don't usually read the spec files that are in Markdown beyond skimming them. And he says he definitely can't get his teammates to do it either. The artifact you can't get anyone to open is worth less than the shorter one they will.
And HTML lets Claude organize structure visually. It can even be mobile responsive, so the same document reads differently on a phone than on a desk monitor. The fix is structure that Markdown just can't express. Tabs to compartmentalize sections. Sidebars and tables of contents. Collapsible details.
Illustrations next to the prose they describe. Hyperlinks that go somewhere useful. Mobile-responsive layout.
The same 500 lines of content that you get in a spec file, which you otherwise just scroll through mindlessly, become a scrollable, navigable format that the reader can actually finish.
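To make the structural point concrete, here is a minimal sketch of the kind of page being described: a table of contents that links to collapsible sections. It is not taken from the article. The function names `slugify` and `render_report` and the sample section titles are hypothetical, and only the Python standard library is used.

```python
from html import escape


def slugify(title: str) -> str:
    """Turn a section title into a simple anchor id (hypothetical helper)."""
    cleaned = "".join(c if c.isalnum() else " " for c in title.lower())
    return "-".join(cleaned.split())


def render_report(title: str, sections: list[tuple[str, str]]) -> str:
    """Render (heading, body) pairs as one HTML page with a table of
    contents and one collapsible <details> block per section."""
    toc = "\n".join(
        f'      <li><a href="#{slugify(h)}">{escape(h)}</a></li>'
        for h, _ in sections
    )
    body = "\n".join(
        f'  <details id="{slugify(h)}" open>\n'
        f"    <summary>{escape(h)}</summary>\n"
        f"    <p>{escape(text)}</p>\n"
        f"  </details>"
        for h, text in sections
    )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n<head>\n'
        '  <meta charset="utf-8">\n'
        # The viewport tag is what lets the page adapt to phone screens.
        '  <meta name="viewport" content="width=device-width, initial-scale=1">\n'
        f"  <title>{escape(title)}</title>\n"
        "</head>\n<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        f"  <nav>\n    <ul>\n{toc}\n    </ul>\n  </nav>\n"
        f"{body}\n"
        "</body>\n</html>\n"
    )
```

Calling `render_report("Spec", [("Goals & Scope", "..."), ("Open Questions", "...")])` returns a single string that can be saved as an `.html` file. Tabs and sidebars would need CSS or a little JavaScript on top of this, which the sketch leaves out.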
Argument three he makes is that it's just a lot easier to share. This one is kind of mundane, but it's very real.
Browsers don't render Markdown natively.
If you want to share a Markdown file with a non-technical colleague, you're either copy-pasting it into a Google Doc, taking a screenshot of it, or asking them to install something, which nobody actually wants to do. With HTML, you can just drop the file on any kind of static host, send a link, and you're done. Anyone can open it in a browser, and it looks exactly the way you wrote it. So the chance of someone actually reading your spec is much higher if it's in HTML.
Argument four he made was two-way interaction, and this is the one that's hard to do in any other format. You can ask Claude to render a design with sliders, dropdowns, toggles, actual controls that you can manipulate. Then you could add a "copy as prompt" button down here that captures your tweaks and feeds them back to Claude Code as the next instruction. So the artifact actually becomes the interface.
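The "copy as prompt" idea boils down to comparing the current control values with their defaults and turning the differences into an instruction. In a real page this would run as JavaScript behind the button. The sketch below shows only the logic, in Python, with a hypothetical function name and hypothetical control names that do not come from the article.

```python
def copy_as_prompt(component: str, defaults: dict, current: dict) -> str:
    """Describe only the controls the reader changed, phrased as an
    instruction that can be pasted back to the agent."""
    changes = [
        f"- {name}: {defaults.get(name, 'unset')} -> {value}"
        for name, value in current.items()
        if defaults.get(name) != value
    ]
    if not changes:
        return f"Keep the {component} as it is."
    return f"Update the {component} with these settings:\n" + "\n".join(changes)


# Hypothetical control values for a button preview.
defaults = {"shadow_blur_px": 8, "corner_radius_px": 4, "style": "solid"}
current = {"shadow_blur_px": 16, "corner_radius_px": 4, "style": "outline"}
print(copy_as_prompt("button", defaults, current))
```

Sending only the changed values keeps the follow-up prompt short and makes it obvious which tweaks the reader actually made.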
So, say you're adjusting this button here. You can tweak it.
Change the shadow as well.
Make it less, make it more.
You can change the different styles, then just hit "copy as prompt" and paste that back into Claude Code to tell it the exact type of button that you want.
Argument five is data ingestion, and specifically why Claude Code is the right tool for this. Obviously Thariq is a little bit biased because he works on Claude Code, but he says that the agent in your terminal can read your file system, hit MCP servers like Slack or Linear, walk Git history, or even drive a browser. That context turns "make an HTML page" from a generic task into one that's grounded in your actual data, your actual code, and your actual conversations.
So now the counterarguments, and Thariq engages these directly. The real objections are that HTML costs more tokens to generate and takes meaningfully longer to produce. It's sometimes a hassle to view: you have to open it somewhere, whereas Markdown files open in any editor. On version control, it makes Git diffs noisy because tags and attributes can change a lot. These are certainly legitimate arguments. Editability is another one.
With Markdown, you can just quickly open the file and edit it however you like. With HTML, you have to get the agent to change the file unless you want to go in and edit the code directly yourself.
But he does have rebuttals for these arguments. On tokens: with million-token context windows now standard for frontier models, a few thousand extra tokens for tags is just noise. On time: yes, it is slower. He says it's around two to four times slower to generate, but the resulting artifact is dramatically more useful, and usually you only need to generate it once.
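A quick back-of-the-envelope check of that rebuttal. The one-million-token window and the two-to-four-times slowdown are the figures quoted above. The 3,000 extra tokens and the 30-second Markdown baseline are assumed numbers chosen purely for illustration.

```python
def markup_overhead(extra_tokens: int, context_window: int = 1_000_000) -> float:
    """Fraction of the context window taken up by the extra HTML markup."""
    return extra_tokens / context_window


def generation_time_range(markdown_seconds: float,
                          slowdown: tuple[float, float] = (2.0, 4.0)) -> tuple[float, float]:
    """Estimated HTML generation time, given a Markdown baseline and a
    (low, high) slowdown factor."""
    low, high = slowdown
    return markdown_seconds * low, markdown_seconds * high


# Assumed inputs: 3,000 extra tokens of tags, 30 s to write the Markdown version.
print(f"{markup_overhead(3_000):.1%} of the context window")
print(generation_time_range(30.0))
```

Under those assumptions the markup uses a fraction of a percent of the window, while the wait grows from half a minute to somewhere between one and two minutes. That matches the shape of the argument: the token cost is negligible and the time cost is the real one.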
In terms of viewing, he says you can open the file locally or push it to S3 for a shareable URL, so neither is especially hard to do. The diff noise is the real cost, and he concedes this one. HTML diffs are noisy, and that is a real cost.
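Opening a generated page locally needs nothing beyond the standard library. The helper below is a hypothetical sketch, not something from the article. It writes the HTML to disk and returns a `file://` URL. Uploading to S3 is left out because it depends on your own bucket and credentials.

```python
from pathlib import Path


def save_report(html: str, path: str) -> str:
    """Write the page to disk and return a file:// URL a browser can open."""
    target = Path(path).resolve()
    target.write_text(html, encoding="utf-8")
    return target.as_uri()
```

Passing the returned URL to `webbrowser.open` from the standard library opens the page in the default browser.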
And then today, Andrej Karpathy had a tweet agreeing with this.
You can see the whole thing here. He agreed that structuring responses as HTML gets a better result because it's much more visual. He goes into a lot more depth about what the future holds for AI responses.
He said here, "This works really well, by the way. At the end of your query, ask your LLM to structure responses in HTML. Then view the generated file in your browser. Also had some success asking the LLM to present its output as slides, etc." Like I said, he then makes the argument a lot bigger, and he sketches it as a progression of output formats in increasing order of human friendliness.
Step one was raw text, which obviously takes effort to parse. Step two is Markdown, which is the current default: much easier on the eyes, much more formatted and clean.
Step three is HTML, and this is what we're discussing here: more flexibility in terms of graphics, layout, and interactivity. It's the format he thinks is becoming the new default as we speak. Then he talks about further steps, four, five, and six, eventually ending up at interactive generated video. He calls it interactive neural video, and it's diffusion generated. It's an interesting futuristic topic, and he spends the end of the tweet on it.
So why does vision win as an output channel? Karpathy points out that roughly a third of the human brain is a massively parallel processor dedicated specifically to visual input. It is, in his words, the highest-bandwidth pipe into the brain we have. If you're optimizing for humans getting information, you should be optimizing for a channel that can actually absorb it at speed. And walls of text just don't do that.
The practical takeaway from all this is that you don't need a special skill, a plugin, or a new tool to do any of this today. Just ask Claude, or whatever agent you're using, to render the output as an HTML file instead of Markdown. The interesting work isn't just asking for that. It's being specific about what you want to do with the artifact: what it should show, what should be interactive, who's going to read it. Treat the output format as a design decision, not as a default. The big payoff is the feeling that you're back in the loop with Claude again.
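One way to make "being specific" routine is to build the request from the three things named above: what the page should show, what should be interactive, and who will read it. The helper below is a hypothetical sketch. The wording of the prompt is an assumption, not a recommended template from the article or from Karpathy's tweet.

```python
from typing import Sequence


def build_html_request(task: str, audience: str, must_show: Sequence[str],
                       interactive: Sequence[str] = ()) -> str:
    """Compose a prompt that asks for HTML output and states what the
    artifact is for, instead of leaving the format as a default."""
    lines = [
        task.strip(),
        "",
        "Present the findings as a single self-contained HTML file.",
        f"Audience: {audience}.",
        "The page should show:",
    ]
    lines += [f"- {item}" for item in must_show]
    if interactive:
        lines.append("Make these parts interactive:")
        lines += [f"- {item}" for item in interactive]
    return "\n".join(lines)
```

The returned string can be pasted into any agent. Leaving `interactive` empty produces a request for a static page.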
Instead of just skimming a plan you'd never have read, you're actually looking at it and understanding a lot more.
And I think that should be one of the major goals for a lot of these AI tools: to enhance your own understanding.
Not just getting to the end of whatever project or task you're working on, but actually enhancing your understanding as you do it. So lastly, I'm going to show you this by giving Claude the same task twice, having it produce one report as a Markdown file and one as HTML, so you can quickly see the difference visually and why Thariq is suggesting you use HTML this way.
You can see I have two instances of Claude Code here, and I'm going to give each the prompt: "Research the difference between MTP and DFlash. Write a comprehensive report about these differences and present me the findings in a Markdown file." That's the same prompt I'll give both, apart from the output format.
We're going to do one as a Markdown file and one as an HTML file.
Okay, so one Claude is done, the Markdown one, and it was faster than the HTML one. We'll see how long the HTML one takes.
This is the Markdown file, MTP versus DFlash.
You may have seen a previous version of this video that I posted using just this as the comparison against HTML. I got a lot of comments saying that it wasn't quite fair because I wasn't using a proper viewer. I thought that was a fair point, so I'm going to show you both. This is the raw code.
This is the raw file in VS Code. Now I'll show you what it looks like in an actual proper viewer.
There you go. So, it does look better.
Certainly. The reason I didn't do this in the previous video is the overhead of needing something like Obsidian. This is in VS Code, which is fine for most of us, but a lot of non-technical people are not going to have VS Code or necessarily know to view these files in a certain tool or extension, whereas an HTML file just opens in a browser that everyone has. But to give you a fair comparison:
This is the Markdown file in a proper viewer in VS Code. You can see it did fix the chart issues it was having. If you look at the raw code, it looks like that. In the viewer, it does look better.
It's still a lot of text, but you do have some basic formatting, like bold text.
It's better than just raw text, certainly.
If you look down here.
Yeah, the other chart is formatted properly.
But it's still just plain text. Even the conceptual diagram is kind of broken in this preview. I'll get a lot of comments saying you can actually do this right, but this is just the previewer in VS Code, and you can see there are still some issues here. It still looks a bit off. Same issue here, with these lines not matching up correctly.
So there you go, that is the Markdown file in a proper viewer, and there's nothing wrong with it if you're willing to sit and read through all this text. You can certainly do that. I think the point that Thariq, and then Karpathy after him, were making is that it's still quite a bit of plain text you're reading, whereas HTML gives you a lot more options in how to format it. Now, if you're in Obsidian or something, you can add diagrams and things like that, but like I said, that's extra steps. Not everyone wants to download Obsidian, get it set up, and figure out how to add Mermaid charts or whatever.
But let's look at the HTML version. This is the same prompt, just telling it to write the report in HTML, and I just opened it in the browser. You can see right from the start that it has things the Markdown file doesn't have, even in a viewer. The Markdown doesn't have this kind of color. And here you see a table of contents, so you can pick one of these and jump down automatically.
You can see it has charts like this that are formatted a little more nicely.
It has everything that Markdown has, plus a couple of extra things that make it really nice to view. It did take longer, probably twice as long to produce, but if you're producing this report for people to read on a more difficult topic, being able to view it visually is, I think, really important.
You can see the chart has this color difference. The MTP part is blue, and DFlash is this orangish color.
That makes it a lot easier to read as well, even compared to this chart.
And you can see down here that it's formatted in separate cards side by side, with the color differences, bullet points, and everything. That's certainly an advantage. And this is just plain HTML. It's not a slideshow or anything.
So there you go. You've got a little code snippet, nicely formatted, with nothing broken in it. Especially if you're going to share this with other people, I think it's pretty clear which one is the more visually appealing option, even compared to the viewer.
And it's not to say that HTML is perfect for everything. A lot of the configuration files, files that you just need the agents to read, you don't need any humans to read, those are fine for markdown. Those are really easy for agents to read without any issue. So, those are still perfectly fine for markdown. And markdown is not a bad option. I mean, this looks okay.
Uh, but we're just talking about bringing it to the next level. A little bit more visual, a little bit cleaner.
The nice thing about this is that we can build on it, reformat it, and enhance it even more with a simple prompt to your agent. And I think that's what we're going to do right now. I'm in my agent, which has a lot of skills, and I'm going to convert what you just saw into a slideshow and then use image generation. I'm using GPT-5.5 here to add images that explain the concepts even further. And while it's not yet a neural interactive video like Karpathy imagines we might have in the distant future, you can actually convert these HTML files pretty easily into videos of different formats.
I'm going to be using Hyperframes, which is an agent skill in my Hermes agent, but there's also stuff like Manom video, and I'm sure there are other video generation skills and tools out there.
So this is just one example of taking it to the next level, and of what we might see, if it becomes easy enough to do, after we evolve beyond plain HTML.
You can see what we got here. It did take a while, about 15 minutes, but this is the slideshow we got. We got some nice images here that were made with GPT Images 2.0.
Some of this you'd have to fix, but it's much, much better and a lot clearer. These are just ways you can upgrade the report if you want to present it, especially if you're providing it to other people and you want them to actually read through it and understand it. You can make it look really nice like this. This is the exact same data, the exact same report. I just told it to make a slideshow and use images, and that's what we get. So the last step to push this to the next level is to make it a video.
Okay. So this is the video I made using Hyperframes, and it's in a vertical short style, to give you a slightly different look from the slides we had earlier. You can see it's the same concept, the MTP versus DFlash image, and this is just a video. Pretty good visualization, though.
I know a lot of people are used to these short vertical videos.
If you wanted to explain something to people on your team, or to a larger audience that you might have, you can do something like this. I think what Karpathy is talking about is something a little more interactive, where you could click on certain terms and have definitions pop up, that sort of thing. But just to give you a picture of what's out there today in terms of visual tools for your model's responses, you could do something like this and make it a video.
It's certainly a lot more attractive, and it's going to keep your attention more than a Markdown file. It's not necessarily practical right now for everyday use, but it's a picture of the direction we're headed in.
There you go. So that's going to be it.
I thought this was an interesting topic.
There's been some conversation about it on X. A lot of people have been talking about this. I know there are a lot of other smaller formats that people have been recommending, but I think the benefit of HTML is that everyone knows HTML. It's accessible anywhere.
You don't need to download any special tools or editors or anything like that.
Everyone has a browser you can open HTML in.
And hey, maybe down the road we get this interactive neural video, which would be pretty cool. But this is where we are right now with HTML and Markdown.
But that's going to be it for this video. I hope you enjoyed it. Please leave a comment. Let me know your thoughts on this topic. If you like this video, please subscribe, leave a like, and I'll see you in the next video.