The agentic data cloud represents a fundamental shift from traditional data platforms that provide insights to systems that enable AI agents to take autonomous action on enterprise data. This transformation requires combining structured and unstructured data with contextual understanding, where AI agents can reason over data, access real-time information through knowledge catalogs, and execute actions across operational systems. Key enabling technologies include the Knowledge Catalog for semantic data enrichment, the Data Agent Kit for agent capabilities, and cross-cloud lakehouse architectures using open standards like Apache Iceberg. The transition is driven by the need to move from human-scale data processes to agent-scale operations, where swarms of agents can complete complex tasks in minutes that previously required weeks of human effort, while Google Cloud's vertically integrated infrastructure optimizes performance and cost through innovations like TPU v8 hardware and BigQuery's 35% speed improvement with 40% cost reduction.
Deep Dive
From systems of intelligence to systems of action: Yasmeen Ahmad on the agentic data cloud
Hey everyone, welcome back. I'm Stephanie Wong, and I'm super excited because right now I have Yasmeen Ahmad, who is the managing director of Data Cloud here at Google Cloud. Thank you so much for joining us.
>> I'm excited to be here, Stephanie.
>> Amazing. Okay, all things data cloud. There's been a lot coming out now, so I want to talk about what's changing. We keep hearing that the system of intelligence is evolving into a system of action. Can you explain how the agentic data cloud is fundamentally changing the way our customers think about their own data strategies?
>> 100%, we are seeing a rapid shift. If I think about the last decade of building data platforms, we built them as systems of intelligence, whether it was a dashboard or report giving you an insight on a KPI, or a sophisticated data science model giving you a great predictive score. But in the real world, a lot of those data insights got left on the shelf. Maybe 10 to 20% of them made it into production, where they would drive action in the business. That action step, that productionization, was always very hard. What we see with generative AI, and in particular now with these agentic systems, is that driving action is much easier. So we're fundamentally building up this agentic data cloud that supports going beyond gen AI intelligence through to true action. As we see customers adopting a system of action, they are looking not just to store data. They want data to be active in the reasoning loop, live and in real time, and then, through MCP tools and skills, actually driving action into ledgers, into operational systems, into marketing systems. That's where true ROI comes to life.
>> Yeah, agreed. And it's such an exciting time right now. I just gave a talk where we were talking about the same thing. AI systems can now actually take action on your behalf, but it's still fundamental to have a strong data strategy, right? Machine-readable, structured, real-time data that your AI agents can act on. So now it seems like we're at this pivotal point where they can, right?
>> Yes, and that data strategy is absolutely critical. If I reflect on our journey over the last two or three years at Google building data platforms and data agents, we've learned a lot. In fact, I would say this whole industry's approach to AI-ready data was quite naive three years ago. It was all focused on: do we have clean data? Do we have lineage over our data? Is there good data quality? But just focusing on the data layer only got you to 50% accuracy with agents. The other 50% comes from great context. I reflect on the data science teams I used to lead in EMEA. My best data scientists weren't the ones who could write the maths or the algorithms the best. They were the ones who would speak to the business users. They would get into the supply chain and really understand how the business works. What does the data actually mean? When we're looking at a PDF file, what's that hidden code on page 10? That context was never built into data platforms. It was what I call invisible work, outside the data platform, in the human mind. So today, the data strategy has to combine really good, solid data, both structured and unstructured, with its context. It's the hidden meaning, the business intuition, that needs to be encoded so an agent can read that context and infer and reason over data accurately.
>> Would you say that one of the things enabling this transition is that there is now a semantic understanding of data, so AI agents can actually use contextual understanding? You don't have to fill in every gap, because there is a certain level of inferred understanding of your data.
>> And that's critical, the inferred understanding. If I take the traditional world of governance, whether it was column names, descriptions, row descriptions, or business glossaries, they were all human coded.
It was a human spending tedious amounts of time filling in descriptions, which frankly weren't necessarily that great, because as humans we don't like doing those jobs. Now, if you take the power of gen AI and you give it a sample table, a sample data set, it will infer a lot of that descriptive information much more accurately than a human did.
But we're also taking it a step further when we think about the aggregated data and enrichment in the Knowledge Catalog. The enrichment isn't just column names, table names, and descriptions.
Take unstructured data, such as PDF files. Unstructured data de facto doesn't have a schema. If you give a gen AI model one PDF document, it'll reason fairly well. Two PDFs, sure. But in an enterprise, you have thousands of these documents. You physically can't fit a thousand documents into the context window of a model, and even if you could, it would be exponentially expensive. What you need to do is create that inferred schema, that inferred meaning, across that unstructured data.
And that's what the Knowledge Catalog does. It creates those inferred descriptions, inferred meaning, and inferred schema relationships. And to your point, an agent can now access that context, learn how to use that data, and understand exactly which data it needs to leverage. So not only is it higher trust, it's also lower cost and more efficient.
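To make the context-window argument concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it (document size, catalog entry size, context limit, number of documents the agent finally reads) is an illustrative assumption, not a figure from the interview or from any Google product.

```python
# All figures are illustrative assumptions, not measurements.
TOKENS_PER_DOCUMENT = 8_000       # assumed average size of one PDF
TOKENS_PER_CATALOG_ENTRY = 150    # assumed size of one inferred description/schema entry
CONTEXT_WINDOW = 1_000_000        # assumed model context limit


def tokens_needed(num_items: int, tokens_each: int) -> int:
    """Total tokens required to place num_items of a given size in the prompt."""
    return num_items * tokens_each


def fits(total_tokens: int, window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt fits inside the assumed context window."""
    return total_tokens <= window


NUM_DOCS = 1_000

# Option A: push every raw document into the prompt.
raw_tokens = tokens_needed(NUM_DOCS, TOKENS_PER_DOCUMENT)

# Option B: read the inferred catalog entries first, then only the few
# documents the agent decides it needs (5 here, another assumption).
catalog_first_tokens = (
    tokens_needed(NUM_DOCS, TOKENS_PER_CATALOG_ENTRY)
    + tokens_needed(5, TOKENS_PER_DOCUMENT)
)

print(raw_tokens, fits(raw_tokens))
print(catalog_first_tokens, fits(catalog_first_tokens))
```

Under these assumptions the raw approach overshoots the window several times over, while the catalog-first approach stays well inside it. The exact ratio depends entirely on the assumed sizes.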
>> Right. Exactly. And you just touched on some of the capabilities from Google Cloud. With Gemini Enterprise, which we just heard about in the keynote, it's acting as this new front door for data, right? So how are we enabling organizations to turn their existing, let's say, BigQuery and Looker assets into active, more helpful assistants for their employees?
>> Great question. When I think about Gemini Enterprise as the front door, 100%. It's that single entry point where a business user doesn't have to think about the complexities of data pipelines or data platforms under the covers. Here at Next we're introducing even more integration across the data cloud and Gemini Enterprise, because a business user shouldn't have to worry about what the data platform specifics are. Gemini Enterprise is that front door. They want to chat with their business data? Well, yes, you can. We are enabling organizations to create conversational agents in BigQuery, in AlloyDB, and in Looker, and publish them into Gemini Enterprise. So a business user who comes to Gemini Enterprise just chats with their business agent. They don't worry about which data system it's in. Another really exciting integration that I think is awesome is the deep research agent integration. We've had the deep research agent for a while. It does phenomenally well at deeply investigating web data and document data and giving you deeply researched answers. What we've done is connect that deep research agent to the Knowledge Catalog.
The Knowledge Catalog knows about all the enterprise data, so now a deep research agent can reason over enterprise data alongside web data and documents. You can get these really deep, rich answers that are very precise and much more holistic. An organization can be looking at web patterns, weather, and traffic, connecting that with the shipping information in their data platforms, and getting real-time, proactive strategies to optimize their shipping. Through Gemini and the deep research agent, getting to that level of insight is now a matter of seconds or minutes. Traditionally, a business user would have had to spend weeks with an IT team who would have stitched together all of this information, and it definitely wouldn't have been real time. That's all available now.
>> Yeah. And I think this is a key unlock, because for the past several years we've been talking a lot about the foundation model and needing to fine-tune it on your own data sets, right?
But this is a layer of the stack that I think is really powerful: the AI agent doing things like deep research against your own data and the web. It's a layer of the stack that you have more flexibility and more control over, and you can change it on the fly. So it seems like we're reaching more capabilities now with AI agents coming into play to do things like function calling, RAG, all these other things.
>> Absolutely. And we see with customers today that it's not just one monolithic agent. It's actually swarms of agents that activate to complete an intent. So as a whole at Google, we see this shift towards intent-driven engineering. Even for us at Google, two years ago when we started building our first agents, they were persona-based agents. It was a data science agent to help the data scientist, or a data engineering agent to help the data engineer. Well, frankly, the models are just so good now that you don't have to tell them to be one fixed persona. In the human world, humans typically become experts in one domain, because we get very good at that one domain and we struggle to access multiple domains at the same time. These models are amazing if you give them the right tools and skills. They can actually work end to end: get the data, wrangle the data, find the right model, build a visualization, even build an application and deploy it. That shift in the maturity of the models opens doors. As we think about intent-driven engineering, the future we see is that data practitioners can focus on the objectives and outcomes instead of the tasks that have to be done, and we provide the agents with the right tools and skills. That's why we launched the Data Agent Kit here at Next. For us, the Data Agent Kit is the plugins, the extensions, the tools, and the skills so that an agent can natively understand Google's data cloud, build and optimize a BigQuery pipeline, or build and fine-tune a Spark pipeline. These agents can be super powerful against Google's agentic data cloud.
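The idea of giving one agent a set of tools rather than a fixed persona can be sketched with a small tool registry. This is a toy illustration only: the tool names, the `run_intent` helper, and the hard-coded plan are all hypothetical and are not the Data Agent Kit API. In a real agent, a model would choose the plan from the stated intent.

```python
from typing import Callable, Dict, List

# Hypothetical tool registry; names are illustrative, not a real product API.
Tool = Callable[[dict], dict]
TOOLS: Dict[str, Tool] = {}


def tool(name: str) -> Callable[[Tool], Tool]:
    """Decorator that registers a function as a named tool."""
    def register(fn: Tool) -> Tool:
        TOOLS[name] = fn
        return fn
    return register


@tool("get_data")
def get_data(state: dict) -> dict:
    # Stand-in for a query against a data platform.
    return {**state, "rows": [3, 1, 2]}


@tool("wrangle")
def wrangle(state: dict) -> dict:
    # Stand-in for cleaning and ordering the data.
    return {**state, "rows": sorted(state["rows"])}


@tool("summarize")
def summarize(state: dict) -> dict:
    rows = state["rows"]
    return {**state, "summary": {"count": len(rows), "max": max(rows)}}


def run_intent(intent: str, plan: List[str]) -> dict:
    """Run each tool in the plan in order, threading state between steps.

    The plan is passed in by hand here; a model would normally produce it.
    """
    state: dict = {"intent": intent}
    for step in plan:
        state = TOOLS[step](state)
    return state


result = run_intent("report the largest order", ["get_data", "wrangle", "summarize"])
print(result["summary"])
```

The practitioner states the outcome ("report the largest order") and the individual tasks are handled by whichever tools the plan strings together, which is the shift from task focus to intent focus described above.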
>> Yeah, these tools and skills are like the action-based intelligence that you talked about us moving towards.
So it's awesome that we're coming out with these pre-built abilities for the agent to just take action on your existing data sets. I think the challenge, though, is that data can still be scattered across many places and environments. So how does our cross-cloud lakehouse support teams in using open standards like Apache Iceberg, to ensure that customers aren't leaving any of their data clouds behind in this new agentic era?
>> This is a great question, because I feel like every customer I talk to is multicloud, whether they chose multicloud or they're multicloud because they're running a SaaS application in AWS and their data gravity is in Google. So for us it's about embracing multicloud. One of the challenges has been that many vendors have spoken about multicloud, but what they mean is that they run on multiple clouds; you still have to choose a cloud and move wholesale to it. We really wanted to turn that around. We believe customers should be able to connect the data no matter where it lives. So cross-cloud is fundamentally about reaching across clouds to AWS and Azure through our Cross-Cloud Interconnect and intelligent buffering and caching, so customers can leave that data where it is and just see it universally. But it's not just other clouds, it's also other data platforms.
So we can reach into Databricks with Unity Catalog, Snowflake with Polaris, and AWS S3 with Glue. A critical piece of that is Iceberg. Why couldn't we do this last year? The challenge was that every single system had its own proprietary data format. Any time you wanted to do federation across clouds or across data systems, you were building custom pipelines, and those custom pipelines had to understand that vendor's data format and how to get data in and out. Iceberg and open standards blew the door open. Now we have this open, universal standard. We have the Iceberg REST catalog. So if your data is in Iceberg in AWS S3, in Databricks, or in BigQuery, you can now connect and see all of that data. That's why we brought the cross-cloud lakehouse to bear: we wanted one single universal plane where users can see all of their data and don't have to worry about where it's sitting. The other big unlock here is Cross-Cloud Interconnect. Historically, the big challenges in moving data across clouds were latency and egress. Frankly, with Cross-Cloud Interconnect you can get sub-second latencies, and egress is not a big issue anymore. We have customers that are able to move a petabyte of data, and it's not a big challenge. Those two things coming together, the Cross-Cloud Interconnect technology and the Iceberg open standard, allowed us to create the cross-cloud lakehouse, and we're just so excited about what customers will be able to do.
>> Yeah, it is an exciting time, and you just touched on something that I want to dive into, which is the cost, the performance, the efficiency, the egress. As organizations move from human scale to agent scale, cost and performance are going to become even more critical. So how is our AI-optimized infrastructure here at Google Cloud, and our serverless approach, helping? For example, what are we doing with BigQuery and Spark?
How is this all helping customers scale their AI ambitions efficiently?
>> You're 100% right that as these agents come online, they are hungry, and it's not just single agents, it's swarms of agents that we are seeing. In fact, I was speaking in a session earlier about a stat we're seeing in the industry: web API gateways are seeing massive spikes in incoming traffic, and that traffic is not because a human has learned to click the mouse faster. It's because these swarms of agents are waking up and making more calls than a human would. Typically, for one click of a human, you're seeing 10 to 20 API calls from agents, because an agent will go into a multi-step reasoning loop, and that loop might have multiple iterations as it hits the web for information. As we see that scale-up happening, what's critical is that you address the performance and cost complexity. Your cost can't go up 10x because agents are now running 10x more inferencing or queries. So at the individual engine level, we are super focused on making sure each engine is as efficient as possible. In fact, here at Next, we're talking about how over the last year BigQuery's query processing speeds have improved 35% while we have reduced cost 40%. Amazing things that our engineering teams are doing there. In our Apache Spark world, our managed service for Apache Spark, now with the Lightning Engine, is five times faster than plain vanilla Apache Spark, with two times better price performance than the market's proprietary alternative. So each engine is getting a boost. But beyond the engines getting a boost, I think you mentioned something really critical. We see it as an entire stack, because when an agent comes in with a request, that request has to go through multiple levels of the stack, including the data layer and the model layer, right down to the infrastructure layer. So for us, what's important is that we optimize all parts of the stack. Today in BigQuery you will see a 230x reduction in token usage when running AI inferencing over BigQuery data, because of how we are integrating and making the stack super efficient. In addition, just this morning in the keynote, we announced our next-generation TPU, and right at the infrastructure layer we're also doing things like separating training and inferencing, because on a single chip, on the silicon, we can't have a traffic jam where gen AI is trying to read and write information and train and inference at the same time. So we're driving innovation at every layer of the stack to ensure that as a request comes in from an agent, it moves up and down the stack seamlessly and at high speed, and each layer of the stack actually works with the next layer. I think that's the magic of Google. Only Google is working on the infrastructure, the model innovation, and the data innovation all together.
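The scaling argument above comes down to simple arithmetic, sketched here in Python. The 10 to 20 calls per click and the 40% cost reduction are the figures quoted in the conversation; the baseline of 1,000 clicks and the assumption that the 40% reduction applies uniformly to every call are simplifications added for illustration.

```python
def agent_calls(human_clicks: int, calls_per_click: int) -> int:
    """API calls generated when each human click fans out into agent calls."""
    return human_clicks * calls_per_click


def relative_cost(call_multiplier: float, unit_cost_reduction: float) -> float:
    """Total cost relative to a human-scale baseline of 1.0.

    Assumes cost scales linearly with call volume and that the per-call
    cost reduction applies uniformly (a simplifying assumption).
    """
    return call_multiplier * (1.0 - unit_cost_reduction)


BASELINE_CLICKS = 1_000  # hypothetical workload

low = agent_calls(BASELINE_CLICKS, 10)
high = agent_calls(BASELINE_CLICKS, 20)

# 10x more calls with a 40% lower unit cost still costs more than baseline,
# which is why efficiency has to improve at every layer, not just one.
cost_at_10x = relative_cost(10, 0.40)

print(low, high, round(cost_at_10x, 2))
```

Under these assumptions a 40% unit cost reduction alone turns a 10x bill into roughly a 6x bill, so engine-level gains help but do not by themselves cancel the growth in call volume.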
>> Yeah, truly the vertical integration.
>> Exactly. You're going to get optimizations that you can't get anywhere else.
>> Absolutely. So just going back to the understanding that an agent is only as good as the data it's grounded in, right? How does the agentic data cloud ensure that these agents are using the most accurate real-time context from across a company's entire data estate?
>> Agents having access to the right data and the right context at the right time is the critical piece. In fact, we see that if agents have too much context or too much data, they also get lost. So one of the key pieces of innovation for us has been not just building a universal context engine with the Knowledge Catalog, but also working on the search and serving layer.
As agents come in and make requests, we're serving up the right context and the right data. Part of that has been taking the hybrid search stack that was built for Google Search and bringing innovations from it into the semantic stack that we're building with the Knowledge Catalog. That search stack can not just search for the right semantic information; it also has complex re-ranking algorithms that ensure the right context is ranked, prioritized, and served back to agents. So search and retrieval are just as important to us as the context. And being at Google, we're lucky that we can use all of the innovation from across Google and bring it together now for an agent serving stack.
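As a toy illustration of the general idea of hybrid retrieval followed by ranking, the Python sketch below blends a keyword-overlap score with a vector-similarity score and serves only the top results. It is not Google's search stack: the catalog entries, the hand-written three-dimensional "embeddings", the `alpha` weight, and the simple score blending are all assumptions, and production re-rankers are far more sophisticated.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    name: str
    description: str
    embedding: List[float]  # toy, hand-written vector (assumption)


def keyword_score(query: str, text: str) -> float:
    """Fraction of query words that also appear in the text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query: str, query_embedding: List[float],
                  entries: List[Entry], k: int = 2, alpha: float = 0.5) -> List[str]:
    """Blend keyword and semantic scores, then return the top-k entry names."""
    scored = [
        (alpha * keyword_score(query, e.description)
         + (1 - alpha) * cosine(query_embedding, e.embedding), e)
        for e in entries
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [e.name for _, e in scored[:k]]


CATALOG = [
    Entry("shipments", "shipping delays by port and carrier", [0.9, 0.1, 0.0]),
    Entry("weather", "daily weather forecasts by region", [0.6, 0.4, 0.0]),
    Entry("payroll", "employee payroll records", [0.0, 0.1, 0.9]),
]

top = hybrid_search("shipping delays weather", [0.8, 0.3, 0.0], CATALOG)
print(top)
```

Serving only the top-k entries mirrors the point made above that too much context can hurt: the agent receives the shipment and weather entries and never sees the unrelated payroll table.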
>> I was going to say the same thing. It's another thing that we have. It's in our blood and we can bring that innovation right over. Right.
>> Super exciting. I guess my last question is: what are you most excited about in the data cloud world in this new agentic era? I know you just touched on a lot, but just looking ahead.
>> I'm super excited to see this system of intelligence move to systems of action. In particular, here at Next, I have heard three or four different use cases from customers just this morning of how they are engaging swarms of agents that are driving true action.
They're working through getting the right data and the semantic context, but also connecting with the Data Agent Kit to take action across systems and do things that they just weren't able to do before. I'm hearing about customers taking what was a 45-minute human process down to a minute and, frankly, unlocking ROI that seemed impossible before. It's those innovative use cases that I am super excited to see here at Next. And I'm excited that we have 80 data cloud sessions here, all hosting at least one customer talking about what they're doing.
>> Amazing. Well, there's no better time to be a part of this industry and to see all the actual ROI and impact that's happening today. So I just want to thank you for taking the time to come on to our live stream, Yasmeen. Thank you.
>> Thank you. Thank you. See you everyone.