Google's I/O 2026 keynote revealed massive infrastructure scale-up with 3.2 quadrillion tokens processed (7x growth from previous year), new Gemini models including the world model Gemini Omni and the faster Gemini 3.5 Flash, along with agentic products like Antigravity 2.0 and Gemini Spark, demonstrating how large-scale AI infrastructure enables practical applications from generative UI in search to scientific research tools.
Deep Dive
Google I/O Keynote Breakdown: Everything new
The Google I/O 2026 keynote just finished, including a demo that used 93 sub-agents, 15K model requests, and 2.6 billion tokens. Wow, that must have cost a fortune, right?
Well, we'll get to that, but I think this wasn't even the most impressive thing they shared. I'm Stefan. I'm a developer relations engineer, so my job is to dabble around with SDKs, AI, and different models, and I watched the whole thing, and I have some thoughts.
Some of the things that have been shown are really impressive, but others left me wondering, "Okay, but why?"
So, let's take a look. Before we get into models and products, let's start with the numbers, or what I call the flex that only Google can do, because some of these numbers are just incomprehensibly large. Last May, they were processing 9.7 trillion tokens, which sounds impressive already, but today, they're actually processing 3.2 quadrillion tokens.
That's roughly a 7x jump, and that's quadrillion with a Q, right? Which means they're now doing 19 billion tokens per minute. That's just crazy.
Well, they also have 8.5 million developers building on their models.
They had 375 customers each pushing over a trillion tokens in the past year, so they have some heavy users.
And they've got 13 products with a billion users, and five of them even with over 3 billion. I mean, every other company can only dream of numbers like this. It's really incredibly large.
Also, their CapEx went from 31 billion in 2022 to 180 to 190 billion this year.
Those CapEx figures are big numbers too, but I think this matters, because the infrastructure story there is real. They announced the new TPU 8T, where I think the T stands for training, which has three times the raw compute of the previous generation, and they're claiming that they can now distribute training across roughly a million chips using JAX and Pathways. They also introduced the TPU 8I, where the I stands for inference, which should be hitting around 1,500 tokens per second on a model that I will talk about shortly.
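To put the CapEx jump quoted above into perspective, here is a minimal Python sketch of the arithmetic. It uses only the figures from this recap (31 billion in 2022, 180 to 190 billion this year). The four-year span is an assumption that "this year" means 2026, and the function names are purely illustrative.

```python
# Figures as quoted in the recap, in billions.
CAPEX_2022_BN = 31
CAPEX_THIS_YEAR_BN = (180, 190)  # low and high end of the quoted range

# Assumption: "this year" is 2026, so the span from 2022 is four years.
YEARS = 4


def growth_multiple(start: float, end: float) -> float:
    """How many times larger `end` is than `start`."""
    return end / start


def annual_growth_rate(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by going from start to end."""
    return (end / start) ** (1 / years) - 1


multiples = [growth_multiple(CAPEX_2022_BN, end) for end in CAPEX_THIS_YEAR_BN]
rates = [annual_growth_rate(CAPEX_2022_BN, end, YEARS) for end in CAPEX_THIS_YEAR_BN]

for end, multiple, rate in zip(CAPEX_THIS_YEAR_BN, multiples, rates):
    print(f"{end} billion: {multiple:.2f}x overall, {rate:.1%} per year")
```

Under that assumption, the quoted range works out to roughly a 5.8x to 6.1x increase, or around 55 to 57 percent compound growth per year.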
Overall, the scale that Google operates at is just incredibly hard to grasp, but they must be doing something right. Now, let's get to the new models. For that, Demis Hassabis walked out, and he opened with, and I really want to quote the energy here, "AGI is just a few years away."
That's just a low-key promise, you know, very chill. To pause here for a moment, Demis Hassabis is one of the most interesting people, I think, in the entire AI space.
This is not a hot take. There are documentaries. There's a Nobel Prize awarded to him. But having listened to quite a few podcasts with him and seen interviews, I think he is one of the people who are genuinely impressive to listen to, to follow, and to see the vision of where AI can actually take humanity. But overall, they announced two big new models. The first one is Gemini Omni.
And they're framing this as a world model, meaning it can take just any input and create any output, really.
They started with video, so text to video and conversational iteration. We can refine clips by just talking back and forth, and, this is the fun one, take our own videos and ask it to change reality in them while maintaining certain aspects of the video. Now, the creativity ceiling of it is impressive.
And the demos looked phenomenal. But, just to be honest, as a consumer, my reaction was, "I mean, that's nice, but why would I really need this?" I'm sure you can think of use cases that might make sense. But, overall, this just feels a little bit gimmicky and like AI slop to me for now. The promise of world models, for sure, is incredible. Anyway, the Omni model is shipping across the apps today, with the Omni Pro version coming soon.
The other model that they introduced was Gemini 3.5 Flash, and they did a lot of trust-me-bro benchmarks with it. So, I can confidently say this is a great model.
But, in all seriousness, it seems to be better than their previous Pro model in many tasks, including coding, reasoning, and what they call real-world tasks.
They also claim it's four times faster, but this is a claim that they need to back up with real usage. We'll see more of Gemini 3.5 Flash in the rest of the video, and they also announced that the Gemini 3.5 Pro model will be coming next month.
Quick aside for something that I think deserves more attention, which is SynthID. It has now watermarked over 100 billion videos and images. They also added C2PA content credential verification, which is live in the Gemini app today and is rolling out to Search and Chrome. What this means is that we'll be able to circle content, for example in Chrome, and it will tell us whether it was AI manipulated in any form. What I think is interesting there is that the partner ecosystem is growing. They announced that Nvidia, OpenAI, Kakao, and ElevenLabs are also partnering with them. I think this is a very good sign, because deepfakes and similar things are becoming more of a problem, and I love to see a group of companies actually tackling that.
Next, they got to a product that had me actually leaning in, since I'm a developer, and that's Antigravity. They announced that Gemini 3.5 Flash is now available in Antigravity. They also shipped two new things, which are the Antigravity CLI and the Antigravity SDK.
All of that comes with native voice support integrated, and all of it is globally available today. There are also integrations with Android, Firebase, and Google AI Studio. But I think the biggest story here is Antigravity 2.0, which they're calling agent-first.
The agent harness got a massive overhaul, and this is important because it powers quite a few products in their lineup. They even said they have optimized the Gemini 3.5 Flash model to run 12 times faster in Antigravity than elsewhere.
I'm not sure how that holds up, and also, if it's optimized to run in Antigravity, I wonder how well it works in other tools like Cursor. But then they showed it in action, which is the demo that I talked about earlier, where they asked it to build an operating system. As I mentioned, they had 93 sub-agents spawned, 15,000 model requests, and 2.6 billion tokens. But the interesting part there is that all of this took under a thousand dollars in API costs to build.
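As a rough sanity check on those demo numbers, here is a small Python sketch that turns them into averages. It relies only on the figures quoted above and treats "under a thousand dollars" as an upper bound. The result is a blended figure, since the recap does not say how the tokens split between input, output, and cached tokens, and the function and key names are my own.

```python
# Figures as quoted for the operating-system demo.
SUB_AGENTS = 93
MODEL_REQUESTS = 15_000
TOTAL_TOKENS = 2_600_000_000
COST_CEILING_USD = 1_000  # "under a thousand dollars", so an upper bound


def demo_averages(sub_agents: int, requests: int, tokens: int, cost_usd: float) -> dict:
    """Derive simple per-unit averages from the headline demo figures."""
    return {
        "tokens_per_request": tokens / requests,
        "requests_per_sub_agent": requests / sub_agents,
        # Blended rate across all token types; an upper bound because
        # the quoted cost is itself an upper bound.
        "max_usd_per_million_tokens": cost_usd / (tokens / 1_000_000),
        "max_usd_per_request": cost_usd / requests,
    }


averages = demo_averages(SUB_AGENTS, MODEL_REQUESTS, TOTAL_TOKENS, COST_CEILING_USD)

for name, value in averages.items():
    print(f"{name}: {value:,.4f}")
```

So on average each request moved about 173,000 tokens, each sub-agent made about 161 requests, and the blended price came in at no more than roughly 38 cents per million tokens.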
The operating system apparently wasn't fully functional, but they live coded Doom into it, which is cool, I think. The overall UI update shows that all of these tools are somewhat converging, because it looks quite similar to, I think, what Codex pioneered, and other tools such as Cursor are looking quite similar.
If you look at the left sidebar here, we have the agents on the left and the artifacts that are generated on the right. In the case of Antigravity, those are things like graphs, images, and other things. So my honest take is, I think it's fine. I think the overall UI works quite well at the moment, but I'm very curious to see if this is the final form of these coding tools. I think there might be more to come here.
Next up is a new agentic Gemini feature called Gemini Spark, which is claimed to be a personal AI agent that runs on dedicated virtual machines and is able to work 24/7. So even if you close your laptop or lock your phone, it will still continue to work.
It is also powered by Gemini 3.5 Flash and the Antigravity harness, so the same underlying infrastructure that Antigravity uses. It plugs into Google's ecosystem natively, but the interesting part is that it can also integrate other tools with MCP, which is only coming later, unfortunately. The demos that they showed were standard agent demos. So they draft an email, they pull context from Docs, Mail, and Chat, and you can plan a party, and Spark generates things like a live RSVP tracker in Sheets, sends follow-up emails, and builds a slide deck. I'm going over this quickly, but I think it shows that it's very well integrated into the Google ecosystem, so it can pull information from anywhere, and it can even export and create new artifacts such as Google Docs, Slides, and other things. So, I think it's genuinely impressive and really useful if you are in the Google ecosystem.
What I actually like here is the architectural design of it. We have dedicated VMs per agent, we have syncing across devices, we have voice interaction, and we have the ability to spawn multiple agents from a single request. So, I think they have a really good shape for all of this, and if it actually works as promised, it can be a really helpful assistant for you.
Next up was Search, and I think this is seeing probably the most interesting shift in the product. They talked about AI Mode, which is now powered by the Gemini 3.5 Flash model. It already has a billion monthly users, and its queries have doubled every single month since launch. They redesigned the search box, which is now called the intelligent search box, which is creative naming, I guess. What it gives us is that it expands once you input something, it gives you AI suggestions to complete the query, and we have things like voice interaction and all kinds of multimodality baked right in.
Another addition is information agents, which are dropping this summer, and the pitch here is that it's the era of search agents.
This means that these are agents that run in parallel alongside Spark, and they're breaking down tasks, determining the urgency for them, and they're even setting up triggers. One example could be that you ask them to keep you updated about big biotech stocks.
Then this is no longer a single query, but actually a standing job that reminds you and sends you notifications once interesting things happen. But I think one part that might actually be quite disruptive is called agentic coding inside Search. What this means is that Search can now build UI for your answer using Gemini 3.5 Flash with the Antigravity harness again. So, if you ask, for example, an astrophysics question, it gives you a dynamic visualization with adjustable parameters, which is generated on the fly.
And I think this is incredible. It is generative UI in search. It's free for everyone, and it's coming this summer.
And just to be clear, this includes generating mini apps, such as meal planners, fitness trackers, and wedding planners.
The thing is that this happens proactively based on the context, and you can even share these mini apps with other people. As a developer, I think this sounds incredibly cool, and it's very interesting how they approached and built that. I'm just wondering what this means for surfacing content from actual content creators, and what this means for their impressions, clicks, and interactions with their audience. So, I'm very curious to see how the new way of surfacing this information will actually play out in the end.
Then very quickly on commerce, they introduced a few new protocols. The first one is called the Universal Commerce Protocol, or UCP, which is basically HTTP for agentic commerce and is now an open standard.
They also talked about AP2, the Agent Payments Protocol, which is coming to Spark soon. And they even talked about some guardrails and accountability that they have built in there.
There's also universal card, which means we have one card across merchants, and it can even pull in context and give you information once discounts happen and similar. So if you're building anything commerce adjacent, then you should definitely bookmark these protocols, because they might become very important in the future.
Next up was the Gemini app, and it's already serving 900 million monthly users across 230 countries and 70 languages, which is quite a lot. Just today, they are shipping a complete redesign, and they're branding this Neural Expressive, which means they have fluid animations and regional dialects, which is quite cool, I think. Most importantly, they don't want to show you a wall of text, so responses are now laid out in real time with images, timelines, tables, embedded videos, and much, much more.
Gemini Omni is also available in the Gemini app today for paid subscribers.
And the macOS app that they shipped last month, which actually is quite good, will see some updates this summer that embed it even tighter into the ecosystem there. So for example, you can select files and just speak to Gemini, and it will pull in context from these files. You can do that by just using your voice, which I think is quite cool.
They also introduced updates to three other products. One is Google Pix, which is an image creation and editing tool that has SynthID integrated, and it's coming this summer.
Then they announced updates to Stitch, which gives you UI design at scale, and you can have real-time collaboration, including voice, and it exports to code and other things. So, if you like building interfaces, you can definitely have a look at this one.
And the last one is Google Flow, which is their creative suite, and Omni plus agents plus tools plus music remixing is their claim to give you the most creative freedom there.
Finally, they talked about XR, and they announced audio glasses coming this fall. The way they work is that they have a camera, so they can look at your surroundings, and then they interact with you using spoken language into your ear that only you can hear. They also have access to all the context from the other products, such as Mail, Maps, YouTube, and others. They're partnering with Gentle Monster and Warby Parker, and working together with Samsung to build these.
One interesting thing is that they pair with both Android and iOS, which is notable given the rumors that Apple is also working on smart glasses. So, I think launching with support for iOS is actually quite a smart move from Google here.
Lastly, Demis Hassabis came back on stage, and did I mention that I'm a fan of him?
Probably yes, right? But what he talked about was AI for science, and they reminded us that AI is really helping science evolve, and I think this is a very cool and impressive thing, and I'm glad that they mentioned it. So, for example, there was CodeMender, which is used for cybersecurity.
There's Gemini for science, which helps with literature insights, computational discovery, and hypothesis generation.
Then they talked about AlphaEarth Foundations, which is basically a digital twin of our planet. One very impressive thing was WeatherNext, which is used for things like hurricane prediction. Amongst all the products and models that they introduced, I think it's quite cool to see that there's actually value for humans, and that AI is helping to save human lives. I think this is something that's definitely worth mentioning from them. They also mentioned AlphaFold and AlphaGenome, which are continuing, and Isomorphic Labs, which is using AI to reimagine drug discovery. And yes, this is marketing, I know, but these are also the use cases which, to me, make the whole AI scale-up story actually defensible. I think that is the part of the keynote that I want to believe in the most.
So, let's get to my personal TLDR or summary here.
I think the infrastructure scale-up is real, and I think it's quite impressive to see that all the things they're shipping, they're actually shipping at this scale. Antigravity is, I think, the announcement that is probably most impressive, because it powers a lot of the products, but in a more underlying and hidden way. So, I'm really curious to see how well their agent harness, which they have praised quite a bit, actually works in the end. I think the Gemini 3.5 Flash model can be a workhorse for many tasks if it holds up to the promises that they made regarding performance and speed. And to me, generative UI in search is actually a very interesting strategic shift to observe, because I think this might have quite an impact on the overall content creation community. Gemini Omni is dazzling and cool, but I'm still looking for the killer use case for this specifically.
But I'm sure it will come in the future.
So, what I now want to know from you is which announcement got you most excited, and which of the products you're actually going to be jumping on this week because it's so exciting and you just want to try it out. Please drop it in the comments. I will read every single one of them. If this video was useful, consider giving it a like and hitting the subscribe button on the channel. I will do more videos where I will dive into things more deeply and also react to other events happening in the near future. But for now, thanks for watching and I'll see you in the next one. Bye-bye.