Casas brilliantly argues that we are still in the "radio with cameras" phase of generative UI, mistakenly treating fluid agent capabilities as static components. This talk provides a vital roadmap for moving beyond sandboxed widgets toward truly personalized, collaborative human-agent experiences.
Deep Dive
Beyond Components: Designing Generative UI for MCP Apps (Ruben Casas, Postman)
[music] >> Hello everybody. So, I know I am the person standing between you and your lunch, but this is going to be a very interesting talk that combines the previous two talks and looks into the future.
And that's what I want to talk about today.
So, back in November 2022, what we used to do was go to ChatGPT and ask it to create a component. And we would just copy and paste. You had to ask it to reply in code blocks. Then, you know, fix it, repeat. And this is what I call the poor man's vibe coding.
And we have come a long way.
It kind of worked. It was very exciting.
You could get models to actually build some UI for you.
And I was sure that it was not going to write better code than me, right?
And then things improved very, very rapidly.
What happened last year, if you are aware of what happened in the last months of 2025, was this acceleration, an incredible inflection point where things changed. And it will go down in the history books as the moment things changed very fast, all at once.
And this is in part because of the release of two very important models, which were GPT 5.2 and Opus 4.5.
And they were not just very good at most tasks, including long-horizon tasks.
They were also very good at high fidelity UI generation.
And they were producing very good working UI.
Sometimes thoughtful, sometimes really really good. And also very fast.
Now, I experienced this when I tried one of these models to rewrite my blog. I know people have used them in more creative ways, but I just tried, you know, a single prompt: rewrite my blog. And then it did this, which I didn't ask for.
It created a nice search box with a blur animation, with accessibility out of the box.
And that's when I realized that in the space of 3 years, from when ChatGPT was released to today, we went from, you know, "a few lines of code is great, it runs" to "oh, now it can write better front-end code than me."
And you know, I don't mind. No ego.
It's just reality. So, here's the question. If these models are so good at writing UI code, why are we still stuck in this old paradigm of mostly static UI?
And where is that Jarvis moment that we were talking about earlier?
Where are my floating UI windows that appear and disappear? And why are we not there yet?
So, my name is Ruben Casas. I am a staff engineer at Postman. And I've been looking at UI and generative UI for the past year, and I've been working with MCP apps as well.
And today I want to show you what we're doing today and where we're going in the future.
So, the news is we have a new computer.
And as Andrej Karpathy put it, interacting with this new computer is like talking to the terminal. You have direct access to this operating system.
And the GUI has not been invented yet.
It's like we are in the '70s, where everything was just text.
And we have a superintelligence, but we don't have a mature interface language.
And today we are still trying to figure out what the new interface for this computer is.
And people ask, is it chat?
I'll show you what we're doing today. Actually, there was a very recent tweet last week where people were complaining that most SaaS companies have been adding chat to their homepages, and everybody's just putting chat everywhere.
And that's fine. I don't have a problem with chat. It's not the final UI, but it's okay for now.
But the question is, if it's not chat, then what is the interface for this computer?
On the other hand, as we have seen with MCP apps, there is [clears throat] another thought, which is that we will have one app, a super app, to rule them all. And this is where MCP apps come in: instead of putting all of these chat windows into your homepages and into every single app that you use, we will have a super app like ChatGPT or Claude or Gemini, where you will be interfacing with most of the UI and the websites that we have today.
And this is good. This is the way we are using MCP apps today: to render third-party UI inside one agent environment.
Now, these two options could both be valid. I believe they are part of the evolution towards finding out what the new interface for that computer is. And to be honest, I don't know which one is going to be the final one. Consumers will tell us.
But one thing is that these are two different questions.
One question is where the UI runs. Is it third-party UI inside a super app, or is it chat everywhere?
But the more interesting one is what the model is generating. And this is what I want to talk about today: how are we generating this UI? We have seen three ways: mostly static, declarative, and generative UI.
And I'm going to describe each one briefly. So, to start with, we have the static components way of rendering UI, which is what most agents do today.
The agent is just an orchestrator. The agent makes a tool call via MCP apps or a direct agent tool call. Then some parameters and data are passed to predefined static components that have been created by developers, and the client renders the component. This is very similar to what we have been doing for the past 20 years with UI.
And if you see here, it's very similar to just getting a server to send some data, and then the UI is rendered by the client. But in this case, the agent is generating that data and the props. One example that we have today is the AG-UI protocol. They have an SDK where you can register a client tool that maps to a React component. The tool call receives some props, and those props are mapped to a static component that is then rendered to the user.
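The static pattern described above can be sketched in a few lines. This is a minimal illustration, not the AG-UI SDK or any real client: the registry `COMPONENTS`, the `weather_card` function, and the tool-call shape (`name` plus `arguments`) are all assumptions made for the example. The point is that the agent only produces data, and a developer-written component produces the markup.

```python
import json
from html import escape
from typing import Callable, Dict


def weather_card(props: dict) -> str:
    """A predefined, developer-written component (hypothetical)."""
    city = escape(str(props["city"]))
    temp = escape(str(props["temp_c"]))
    return f'<div class="weather-card"><h2>{city}</h2><p>{temp} &deg;C</p></div>'


# Hypothetical registry: tool name -> static component.
COMPONENTS: Dict[str, Callable[[dict], str]] = {"show_weather": weather_card}


def render_tool_call(tool_call_json: str) -> str:
    """Map an agent tool call onto a registered component and render it."""
    call = json.loads(tool_call_json)
    component = COMPONENTS.get(call["name"])
    if component is None:
        raise KeyError(f"no component registered for tool {call['name']!r}")
    return component(call["arguments"])


# The agent only generates the name and the props, never the markup.
agent_output = '{"name": "show_weather", "arguments": {"city": "London", "temp_c": 14}}'
html = render_tool_call(agent_output)
print(html)
```

Because the component escapes its props, the agent cannot inject markup; it can only choose which registered component to fill in.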
Another example is Goose.
Goose is an MCP client where you can try most of the MCP features. And Goose has this really interesting feature called Goose Auto Visualizer, where you can just pass any type of data to Goose, and Goose will try to match that data, organize it, and then pass it to a set of predefined components that the Goose team has created. In this case, we have a few interesting components that you can use to visualize your data.
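The idea of matching arbitrary data to a predefined component can be illustrated with a simple shape-based heuristic. This is not how Goose implements its Auto Visualizer; the function `pick_component` and the component names (`table`, `bar_chart`, `tree`, `json_viewer`) are hypothetical and only show the general technique.

```python
from typing import Any


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def pick_component(data: Any) -> str:
    """Return the name of a predefined component suited to the data's shape."""
    # A non-empty list of records reads naturally as a table.
    if isinstance(data, list) and data and all(isinstance(row, dict) for row in data):
        return "table"
    if isinstance(data, dict) and data:
        # A mapping of label -> number can be charted.
        if all(_is_number(value) for value in data.values()):
            return "bar_chart"
        # A node with children looks like a hierarchy.
        if "children" in data:
            return "tree"
    # Fallback: show the raw data.
    return "json_viewer"


print(pick_component([{"city": "London", "temp_c": 14}]))
print(pick_component({"q1": 10, "q2": 12.5}))
```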
So, that's the static way. That's the most common way of generating UI today.
But I have seen an evolution recently, which we now call declarative UI.
And declarative UI takes it to the next level. So, we still have some predefined static components that developers build, which contain your design system and all the components that you have.
But instead of the agent just passing the props and the data, the agent uses a descriptor that could be either JSON or YAML. I've seen Python as well, with FastMCP, where they have a descriptor in Python that maps to these predefined static components.
And then you have this translation rendering engine that takes those descriptors and converts them into the final UI.
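A translation and rendering engine of this kind can be sketched as a recursive walk over a JSON descriptor. The descriptor shape (`type`, `props`, `children`) and the `REGISTRY` of components below are assumptions for illustration; they do not reproduce the format of JSON Render, FastMCP, or any other tool mentioned in the talk.

```python
import json
from html import escape
from typing import Callable, Dict, List


def render_text(props: dict, children: List[str]) -> str:
    value = escape(str(props.get("value", "")))
    return f"<p>{value}</p>"


def render_card(props: dict, children: List[str]) -> str:
    title = escape(str(props.get("title", "")))
    body = "".join(children)
    return f"<section><h2>{title}</h2>{body}</section>"


# Hypothetical design-system registry: only these components can appear.
REGISTRY: Dict[str, Callable[[dict, List[str]], str]] = {
    "Card": render_card,
    "Text": render_text,
}


def render(node: dict) -> str:
    """Translate a descriptor tree into markup using registered components only."""
    kind = node.get("type")
    if kind not in REGISTRY:
        raise ValueError(f"unknown component: {kind!r}")
    children = [render(child) for child in node.get("children", [])]
    return REGISTRY[kind](node.get("props", {}), children)


# What the model would emit: layout and content, but no component code.
descriptor = json.loads("""
{"type": "Card", "props": {"title": "Today"},
 "children": [{"type": "Text", "props": {"value": "Sunny, 14 degrees"}}]}
""")
page = render(descriptor)
print(page)
```

The model decides which components to compose and in what order, which is where the personalization comes from, while anything outside the registry is rejected.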
How is this different? Well, in this case, it is more dynamic. They are still static components, but it's more personalized.
And if you look at this and think that it looks familiar, it's because it is not new.
Netflix has been doing this for a long time, since the personalization and server-driven UI era: when you go to the Netflix homepage, you get a UI that is completely personalized to you, but it is still mapped to the Netflix components and UI elements.
Another very good tool that I've seen recently is JSON Render. JSON Render is being built by Vercel, and it is a way to map your components using JSON and also YAML.
They released the YAML support recently.
And it lets you create all of these very dynamic, very good UI interactions that you can use today.
But JSON Render is still, as they say, constrained to your static components.
And yes, they're still static components. The LLM is not generating these components. The LLM is generating the JSON.
However, I think at this point in time, declarative generative UI is probably the perfect balance in terms of flexibility and consistency.
Because you still want your design system. You still want predictability about what UI is going to be generated. It is also faster and potentially cheaper at this point, so you don't use a lot of tokens to create the UI.
But as I mentioned at the beginning, why are we still stuck here?
And what's the next level? I think the next level will be generative components.
And generative components go back to the premise I put at the beginning, where the models are good at writing front-end code. They are good at writing React. They're good at writing JavaScript and CSS.
And the question is why we don't just let them write that on demand at runtime.
What could possibly go wrong with that, right?
This model of generating the UI uses the agent's capabilities. In this case, you can also use a tool call, but instead of calling this layout rendering engine, you can call the same model with reverse sampling, or you can call another model that will generate the HTML, CSS, and JavaScript on demand, and then it is passed to the client.
I did this experiment at Postman, where I created this weather agent that goes to the weather API. It creates a joke. It creates the HTML, CSS, and JavaScript, all in one tool call.
And you get presented with this random but very imaginative UI where everything is created by the agent.
There is no component. There is no translation.
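The shape of such a tool can be sketched as follows. This is not the Postman experiment's code: `fake_weather_api` and `fake_model` are stand-ins for a real weather API and a real model call, and the returned dictionary (`mimeType`, `text`) is an assumed result shape chosen for illustration. What it shows is that the tool's output is the finished document itself, with no component registry or descriptor in between.

```python
from typing import Callable, Dict


def fake_weather_api(city: str) -> Dict[str, object]:
    """Stand-in for a real weather API call (hypothetical data)."""
    return {"city": city, "temp_c": 14, "conditions": "drizzle"}


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; a real model would write this HTML itself."""
    return (
        "<!doctype html><html><body>"
        "<h1>14 degrees and drizzly</h1><p>Joke goes here.</p>"
        "</body></html>"
    )


def weather_ui_tool(
    city: str,
    fetch_weather: Callable[[str], Dict[str, object]],
    generate: Callable[[str], str],
) -> Dict[str, str]:
    """One tool call: fetch data, then ask a model for a complete HTML document."""
    weather = fetch_weather(city)
    prompt = (
        "Write a single self-contained HTML document (inline CSS and JavaScript) "
        f"that presents this weather with a short joke: {weather}"
    )
    html = generate(prompt)
    # Assumed result shape: the client receives raw, untrusted HTML.
    return {"mimeType": "text/html", "text": html}


result = weather_ui_tool("London", fake_weather_api, fake_model)
print(result["mimeType"], len(result["text"]))
```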
So there is, of course, a problem with this approach. If we don't trust third-party code, well, we should not trust code that has been generated by LLMs and just present it to the user.
Generative UI, at this level, needs a distribution model.
And this distribution model requires a boundary, requires containment and requires a sandbox, which is what we were talking about earlier.
This is why I think MCP apps matter a lot, because MCP apps are the best delivery mechanism for generative UI.
We have the features provided by MCP, including authentication, tool calling, and message passing between the UI and the agent. It's sandboxed by default with that double iframe. It is the default for third-party UI delivery today. Does this become the standard?
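The containment idea can be illustrated with the standard HTML `sandbox` and `srcdoc` iframe attributes. This sketch uses a single iframe and is not the double-iframe arrangement that MCP apps use; the function name `sandboxed_frame` is hypothetical. Granting only `allow-scripts`, without `allow-same-origin`, means the generated code runs in an opaque origin and cannot read the host page's cookies or storage.

```python
from html import escape


def sandboxed_frame(untrusted_html: str) -> str:
    """Wrap generated HTML in an iframe that isolates it from the host page."""
    # Escape for use inside a double-quoted attribute; the browser decodes it
    # back into the original document when it loads srcdoc.
    srcdoc = escape(untrusted_html, quote=True)
    # allow-scripts lets the generated JavaScript run; leaving out
    # allow-same-origin keeps it in an opaque origin.
    return f'<iframe sandbox="allow-scripts" srcdoc="{srcdoc}"></iframe>'


frame = sandboxed_frame('<script>alert("hi")</script>')
print(frame)
```

Any communication between the frame and the host would then have to go through explicit message passing, which is the role the talk assigns to MCP apps.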
And one interesting thing is it's not just for third-party UI. It can also be used for first-party UI.
And this is why I think what Anthropic is doing with the visualizer feature is very interesting, strategically speaking, because they could have just created their own rendering architecture and mechanism for delivering this interaction in Claude, but they decided to go with MCP apps because MCP apps provide most of those features that I mentioned earlier out of the box.
So if Anthropic decided to use MCP apps for their first-party UI, you can ask yourselves: why can't we do the same? It is a very, very strong protocol. And especially when the UI is being generated on the fly by the agents, by the coding models, it is the best mechanism for delivery.
Now, today is probably not the final form.
And people keep asking: is chat the final form? Are MCP apps the final form?
We're still trying to figure this out.
And the obvious future is probably too obvious.
And we asked: where is my Jarvis?
Where are my floating windows?
And if you think about it, that's the obvious thing that people would think of when imagining what a new kind of user interaction would look like.
But what I think is we don't have enough imagination yet.
And this analogy I heard recently is very interesting. When TV came out, the first TV shows were radio shows with cameras, because people could not imagine what you could do with this new technology. The technology that we have today is in a very similar position to when television came out: we are still in the radio era, where we don't know all the amazing things that we will do in the future with this new medium, with this new power that we have with the new computer.
And we cannot even imagine what it's going to look like.
Of course, this is speculative, but what I think is actually going to happen is that we are going to be moving beyond components and towards human-agent collaboration.
If you haven't heard about the Excalidraw MCP app, definitely check it out, because the Excalidraw MCP app is not just for output and visualization of diagrams.
The Excalidraw MCP app does something very interesting, which is that it creates a shared artifact.
It creates a canvas where a human and an agent can collaborate in a shared space, where you can go back and forth with the agent and ask, you know, "change this," but you can also click around and modify the UI the way that you are used to. And that becomes the new way of interacting, the new way of experiencing the agent's powers.
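The notion of a shared artifact can be sketched as a single piece of state that both the human and the agent edit through the same operation. This is not the Excalidraw MCP app's implementation; `SharedCanvas`, its `upsert` method, and the shape attributes are hypothetical and only illustrate the idea that neither side owns the canvas.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SharedCanvas:
    """One artifact that both the human and the agent can modify."""

    shapes: Dict[str, dict] = field(default_factory=dict)
    # Each entry records (actor, action, shape_id).
    history: List[Tuple[str, str, str]] = field(default_factory=list)

    def upsert(self, actor: str, shape_id: str, **attrs: object) -> None:
        """Create a shape or merge new attributes into an existing one."""
        action = "update" if shape_id in self.shapes else "create"
        self.shapes.setdefault(shape_id, {}).update(attrs)
        self.history.append((actor, action, shape_id))


canvas = SharedCanvas()
# The agent draws a first version from a prompt.
canvas.upsert("agent", "box-1", kind="rectangle", label="API", x=0, y=0)
# The human drags the box and adds a note by direct manipulation.
canvas.upsert("human", "box-1", x=120)
canvas.upsert("human", "note-1", kind="text", label="add cache here", x=120, y=80)

print(canvas.shapes["box-1"])
print(canvas.history)
```

Since the agent can read the same `shapes` and `history` the human edits, a follow-up request like "change this" can be grounded in the current state of the artifact rather than in the chat transcript alone.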
And at the moment, again, we are very constrained by our imagination. I believe these agents are too powerful to just use them as an orchestrator and a delivery mechanism to show me some visualizations.
So I believe that, beyond components, the future of generative UI will be a more collaborative experience where, yes, we will have some generative UI, but that UI is going to be super personalized and it's going to be collaborative.
So we are still early.
We don't have the answer. People ask, you know, what is the future of user interaction, of user interfaces? We don't know yet.
But we can shape that future and create this new computer.
And that's me. Thank you so much for listening.
>> [applause] >> You can find me and ask any questions.
Thank you.