AI agents need four distinct types of memory to behave intelligently: working memory (the LLM context window), episodic memory (the complete interaction history), entity memory (structured knowledge about people, products, and relationships), and procedural memory (learned patterns and preferences). Most teams implement only working memory through prompts, which by the speaker's estimate misses about 75% of what makes agents intelligent. Traditional databases struggle to support these memory needs because teams end up managing multiple persistence layers (Postgres, Redis, vector databases, object storage) and building complex ETL pipelines between OLAP and OLTP systems, which introduces latency and consistency gaps. Lakebase addresses this by bringing managed serverless Postgres directly into the lakehouse architecture, providing instant zero-copy branching for safe experimentation, real-time data synchronization, and autoscaling that handles viral workloads without manual infrastructure management.
Deep Dive
AI Agents That Remember: Building Stateful Systems with Lakebase
The challenge is that most teams are building working memory through their LLM prompts. So, they store some basic data in a database and call it done, but without the four different types of memory working together, you're missing about 75% of what actually makes the agent intelligent. Your agent needs to pull relevant episodic memories, enrich them with entity facts, apply procedural knowledge, and synthesize working memory in milliseconds.
Hello everybody. My name is Savannah Longoria. I'm currently a senior developer advocate here at Databricks.
I joined back in June as part of the Neon acquisition.
Before that, I was a solutions engineer, and I've been working with databases for the past 10 years. So, you can say that I really love databases.
Today we're going to be talking about AI agents, and specifically about building stateful systems with Lakebase. The first part of this talk is going to be a little bit more theory, but let's start with the big picture. AI is democratizing software development in unprecedented ways, and we're seeing data scientists, analysts, and even some business users building applications that were previously the exclusive domain of engineering teams. This shift is exciting, but it's also creating new challenges: developers need to create complex agentic systems without necessarily having decades of infrastructure experience.
They need databases that just work, and that leads us to why we're all here today. So, let's trace the evolution of how we got here. We started with LLM chatbots. Then we added RAG, which is retrieval-augmented generation, where we started supplementing the LLM with non-parametric knowledge, so now the model can access your company's documentation and your data. This represents the shift that we're in today.
Architectures can access and process external, dynamic knowledge at inference time.
AI agents are the next evolution.
LLMs use tool capabilities, advanced reasoning, and planning, and they also decide in what order to use them. Today we're in the midst of a major shift into agentic systems.
We have system architectures consisting of multiple tools, AI agents, and different components. But not all AI agents are created equal, and I like to think of agentic capability as a spectrum across four levels. Level one is essentially an LLM with instructions that runs in a loop to get the desired result. You can think of it as something as simple as a chatbot trying to understand your question.
Level two is controlled flows.
This is where most production systems are today.
I see this a lot. I help run the AI agentic program at Neon, and level two and level three are where I'm seeing a lot of the customers that I talk to.
Level three is essentially routing into specialized workflows: LLMs that classify requests and route them to purpose-built workflows.
So they're making more context-aware decisions about how work gets handled.
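A minimal sketch of the level-three routing pattern described above. It assumes a keyword check as a stand-in for the LLM classifier; the route labels and handler names are hypothetical and do not come from the talk.

```python
from typing import Callable, Dict


def classify_request(text: str) -> str:
    """Stand-in for an LLM classifier: maps a request to a hypothetical route label."""
    lowered = text.lower()
    if "refund" in lowered or "charge" in lowered:
        return "billing"
    if "late" in lowered or "delivery" in lowered:
        return "delivery"
    return "general"


def handle_billing(text: str) -> str:
    return "billing workflow: " + text


def handle_delivery(text: str) -> str:
    return "delivery workflow: " + text


def handle_general(text: str) -> str:
    return "general workflow: " + text


# Each label maps to one purpose-built workflow.
ROUTES: Dict[str, Callable[[str], str]] = {
    "billing": handle_billing,
    "delivery": handle_delivery,
    "general": handle_general,
}


def route(text: str) -> str:
    """Classify the request, then dispatch it to the matching workflow."""
    return ROUTES[classify_request(text)](text)
```

In a real system the classifier would be a model call, but the dispatch table would look much the same.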
Level four is autonomous agents. I like to think of something like Waymo, which, you know, is coming to Seattle.
These are LLMs that plan their own sequence of actions. They leverage tools independently, and this is where we're at and where we're heading. Now that everyone is on the same page about the state of AI agents, let's talk about the fun stuff. At the core, there are four different components to an AI agent. We have perception, which is how an agent understands its environment. We have tools and actions, which is how the agent essentially decides what to do.
And we have planning.
And then memory is the big one that we're focusing on today. Memory is essentially what transforms a simple query-response system into something that can maintain context, learn from interactions, and provide consistent, personalized experiences. The traditional problem is that LLMs are stateless by design. When you call an LLM API, whether it's OpenAI or Anthropic or any other provider, it doesn't remember anything from previous calls. Each request is completely independent: the model processes your input, generates output, and, without that persistence layer, pretty much forgets everything. This creates a fundamental scaling problem in production. You typically run multiple replicas of your application across different servers for reliability and performance.
You can't scale stateless agents across servers in a meaningful way. And this isn't a limitation that you can vibe code your way out of. You need external persistence.
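A minimal sketch of what external persistence buys a stateless agent. It assumes Python's built-in sqlite3 as a stand-in for a Postgres database; the table and function names are hypothetical. Any replica that can reach the same database can rebuild the conversation before calling the model.

```python
import sqlite3
from typing import Dict, List


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the message store and create the (hypothetical) messages table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "conversation_id TEXT NOT NULL, "
        "role TEXT NOT NULL, "
        "content TEXT NOT NULL)"
    )
    return conn


def append_message(conn: sqlite3.Connection, conversation_id: str,
                   role: str, content: str) -> None:
    """Persist one turn so it survives the API call and a server restart."""
    conn.execute(
        "INSERT INTO messages (conversation_id, role, content) VALUES (?, ?, ?)",
        (conversation_id, role, content),
    )
    conn.commit()


def load_history(conn: sqlite3.Connection, conversation_id: str) -> List[Dict[str, str]]:
    """Rebuild the ordered message list that would be sent with the next LLM call."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE conversation_id = ? ORDER BY id",
        (conversation_id,),
    ).fetchall()
    return [{"role": role, "content": content} for role, content in rows]
```

The in-memory default is only convenient for trying the sketch out; replicas sharing state would all point at the same durable database.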
So, on the right here, we have persistence, which is durable storage that survives API calls and server restarts. We also have relationships.
These are connections between users and conversations.
And memory is the retrieval of stored, relevant context.
So, if we think about what this means in practice, we have consistency across sessions. Like I said, that requires external persistence, which is why we're talking about Lakebase. Trustworthy interactions require a memory management system: not only the persistence layer itself, but a memory management system on top of it.
You also need accumulated expertise, which requires both persistence and management. You can't just store raw conversations. You're extracting learned preferences, tracking behavioral patterns, and building knowledge of entities and relationships. This essentially translates into the user experience being a positive one. So, now that we understand the three-part formula for agent memory, let's break down the four different types of memory that production agents actually need. If I were to click into the architecture of a modern agent in 2026, this is essentially what you would see.
Memory in an AI agent breaks down into short-term and long-term. You also have perception, which is essentially the environment, tools and actions, and planning, like we showed in the previous slide. Taking the previous diagram a step further and focusing just on memory, we have four distinct types. We have working memory, which is our LLM's context window. It's fast, limited, and temporary. We have episodic memory, which is the complete interaction history with semantic search capabilities.
We also have entity memory, which is structured knowledge about people, products, relationships, and facts. This is like an LLM knowing that user 12345 is vegetarian, prefers evening deliveries, and orders extra napkins.
Then we have procedural memory, which is the learned patterns and preferences over time.
The challenge is that most teams are building working memory through their LLM prompts.
So, they store some basic data in a database and call it done. But without the four different types of memory working together, you're missing about 75% of what actually makes the agent intelligent. These memory types need to work together seamlessly. Your agent needs to pull relevant episodic memories, enrich them with entity facts, apply procedural knowledge, and synthesize working memory in milliseconds.
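A minimal sketch of how the four memory types could be combined into one prompt. It assumes in-memory Python structures and naive keyword overlap in place of semantic search; the class, the field names, and the sample rule are hypothetical, while the vegetarian user 12345 echoes the example from the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentMemory:
    episodic: List[str] = field(default_factory=list)                 # past interactions
    entity: Dict[str, Dict[str, str]] = field(default_factory=dict)   # facts per user
    procedural: List[str] = field(default_factory=list)               # learned rules

    def recall_episodes(self, query: str, limit: int = 2) -> List[str]:
        """Rank past episodes by word overlap with the query (stand-in for semantic search)."""
        words = set(query.lower().split())
        scored = [(len(words & set(e.lower().split())), e) for e in self.episodic]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [episode for _, episode in ranked[:limit]]

    def build_working_memory(self, user_id: str, query: str) -> str:
        """Synthesize the context-window text from the three long-term memories."""
        facts = self.entity.get(user_id, {})
        lines = [
            "Rules: " + "; ".join(self.procedural),
            "Facts: " + ", ".join(f"{k}={v}" for k, v in sorted(facts.items())),
            "History: " + " | ".join(self.recall_episodes(query)),
            "User: " + query,
        ]
        return "\n".join(lines)
```

The returned string plays the role of working memory: it is the only part the model sees on a given call, and it is rebuilt from the other three stores each time.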
So, if we look at this diagram, we can visualize the complexity of everything you have to manage when you're building an AI agent. You have working memory, caches, a knowledge base, and a conversation store, and they all need to work together. That's where we hit our first major problem.
AI agents have memory needs and infrastructure needs that traditional databases were not built for.
Traditional databases force you to cobble together multiple management systems, manage synchronization between them, and build custom infrastructure just to support basic agent functionality. You end up building a memory management system yourself across multiple databases. What I've seen in the field is people have Postgres for structured data, Redis for caching, a separate vector database, and object storage. Now you're managing essentially five different persistence layers, plus all the memory management logic to orchestrate them.
So, it's really hard to scale. That's why most production agents remain stateless or have very limited memory: not because teams don't understand the value, but because infrastructure complexity has become prohibitive.
So, we talked about agent memory and why it's important in the context of managing persistence, but let's be specific about why OLAP and OLTP infrastructures fall short.
Traditionally, there's a gap between your OLAP and OLTP systems: data has to cross system and network boundaries to move from one to the other, which requires custom pipelines and configuration and makes it pretty difficult.
This is when teams end up building complex ETL pipelines between systems, which itself introduces latency and consistency gaps that break agent memory. The problem is that AI agents need both simultaneously. They need OLAP and they need OLTP. And traditional architectures require moving data between these silos, often through slow ETL pipelines.
We have resources that are provisioned and have capacity limits.
You can't predict agent workload patterns. One viral interaction could totally overwhelm your database.
Your operational data is siloed in inflexible architectures, which we covered in the last slide. And data platform integration requires fragile, ungoverned ETL.
Moving data between systems is brittle and bypasses governance.
Agentic developer workflows also create safety hazards. New developers building agents can accidentally corrupt production data and expose super sensitive information.
And support for operational AI is bolted on, not built in, when a lot of folks start their AI agents.
Vector search, embeddings, and AI-specific features often feel like afterthoughts.
And that is essentially the gap that Lakebase fills. Lakebase serves as an operational OLTP database, which is necessary for operationalizing the data.
It feeds data back into the agents and displays it in applications, providing low-latency queries and writes.
This essentially solves the external persistence part of our formula. So now let's look at how it enables sophisticated memory management systems. So, what is Lakebase? With Databricks Lakebase, transactional workloads are no longer a separate concern from analytics and AI. It is a new architecture for an OLTP database.
Lakebase brings managed serverless Postgres directly into the Databricks lakehouse, making OLTP a first-class citizen alongside Delta Lake, MLflow, and Databricks Apps. If there's one thing you take away from this talk for when someone asks you what Lakebase is, I want you to take this away: it is actually Postgres. It's not a proprietary fork, Postgres-like, or Postgres-compatible. It's good old Postgres that you can download from the internet and run on your laptop.
However, there are some clever twists with Lakebase that solve the exact problems that we talked about earlier today.
Our architecture disaggregates compute and storage. What I mean by that is that the customer data set doesn't reside on the host that makes Postgres run.
We have a clear separation between the host that runs Postgres and the software where the data set resides. This separation, plus our serverless architecture, is what enables the features that make it perfect for agent workloads, which I'll touch on in the next slides and in the live demo. Oftentimes when I see folks launch, we see agent workloads burst.
Like I mentioned, there are viral moments.
But because Lakebase separates storage and compute, it's serverless by default. There are no sizing, patching, or scaling decisions.
Autoscaling handles surges and scales down when the database is idle.
Decoupled compute and storage also help with cost control.
You're not paying for what you don't use, and you can right-size to any steady state. You can set your cluster to, I think, anywhere from 0.5 up to 32 right now.
You can set that autoscaling range. You can also scale to zero, so you don't have to pay for what you don't use.
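A toy sketch of what an autoscaling range with scale-to-zero means in practice. This is not Lakebase's actual scaling policy; the function name and the idea of a single demand number are assumptions, and the 0.5 and 32 defaults simply reuse the figures the speaker recalls.

```python
def target_compute_units(demand: float, min_cu: float = 0.5,
                         max_cu: float = 32.0, scale_to_zero: bool = True) -> float:
    """Clamp a (hypothetical) demand estimate into the configured autoscaling range."""
    if demand <= 0:
        # Idle: suspend compute entirely if allowed, otherwise hold the minimum size.
        return 0.0 if scale_to_zero else min_cu
    # Active: never go below the floor or above the ceiling of the range.
    return max(min_cu, min(max_cu, demand))
```

A viral spike that asks for more than the ceiling is capped at the top of the range, and an idle period drops to zero instead of billing for the minimum size.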
And let's talk about branching. This is my favorite feature.
Branching in development allows safe experimentation for AI agents.
It gives you instant zero-copy isolation. You can create a complete copy of your production database, including agent memory, conversation history, and user data, in seconds without duplicating storage.
We also have a Git-style workflow for data, which enables proper CI/CD workflows for your database. You can create a feature branch, test your agent changes against real data, and merge it back when you're ready. If you need to test how your agent behaves with different memory states, you can always spin up a branch, experiment, and then delete it when you're done. You don't duplicate storage, there's no complex cleanup, and this solves the data safety hazard that I was mentioning earlier.
Developers can experiment freely without risking their production agent memory. This is a little sneak peek into branching; we'll show it more in the demo. We also have recovery and backups, which I'll also touch on in the demo. Snapshots create instant backups on a schedule, from which the database can be restored instantly.
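A toy illustration of the copy-on-write idea behind zero-copy branching described above. This is not how Lakebase is implemented; the Branch class and its methods are hypothetical. A branch stores only its own writes and reads everything else through its parent, so creating one copies no data.

```python
from typing import Dict, Optional


class Branch:
    """Toy copy-on-write branch: reads fall through to the parent, writes stay local."""

    def __init__(self, parent: Optional["Branch"] = None) -> None:
        self.parent = parent
        self.local: Dict[str, object] = {}  # only the keys written on this branch

    def get(self, key: str, default: object = None) -> object:
        node: Optional["Branch"] = self
        while node is not None:
            if key in node.local:
                return node.local[key]
            node = node.parent
        return default

    def put(self, key: str, value: object) -> None:
        self.local[key] = value  # never touches the parent's data

    def branch(self) -> "Branch":
        # Creating a branch copies nothing; it only records where to read from.
        # Simplification: later parent writes stay visible to the child here,
        # whereas a real system would pin the branch to a point in time.
        return Branch(parent=self)
```

Deleting an experiment is then just discarding the branch object, which matches the "no complex cleanup" point in the talk.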
We have an API for creating, deleting, and restoring snapshots, and for scheduling them as well. We're going to be doing a demo on Casper's Kitchen, which is essentially a ghost kitchen demo company that uses Databricks Lakebase and AI agents to operationalize data and build internal tools. We're in the Databricks catalog, and this is where all the Casper's Kitchen events are streamed.
So, we can see here that we have internal stakes support.
And we're going to go over here to jobs, or rather to serving endpoints.
What we're seeing here is our endpoint. It processes incoming support requests and returns a response, including the amount of the refund and credits to apply. It also generates a response to give.
And we're going to go over to jobs and pipelines.
And we're going to open up our agent stream.
So, let's hope the demo gods are with me today.
So, what we're seeing here is our jobs and our pipelines. We're going to open this up and see our notebook.
And we can see our process support request function. We have our OpenAI client, and we have our serving endpoints, which we showed in the last section.
And if we go down here, we have our process support request. This is the actual request. You can see that the raw support request is here.
And we write the agent's response to the support agent report table. All of this is in the lakehouse. So, here we have our support responses.
We can go in here.
We can see our sample data. You can see that we have our support request ID, our order ID, user ID, as well as the agent response.
So, this is actual JSON. This is in the lakehouse, and now we want to get it into Lakebase. So, we're going to set up a sync table.
Setting it up is pretty straightforward. All you do is pick your destination and your project, so it'll be that one, and then your branch. So, you can utilize different branches.
There are different types of sync modes, and then you can create it. I've actually already created it, and I'm a little too scared to do it during a live demo.
So, let's go back to our Lakebase instance and see that here.
So, we have our different database projects here. I'm going to go to Casper's Kitchen support. We can see that our compute is 1 CU, and it's active right now. We can also see that we have different branches that aren't being utilized, which are idle.
You can set up in different regions, and connecting is pretty straightforward. You can have your Postgres. We have different connection methods here.
And you can connect with different roles.
And you also get monitoring of your database as well.
But let's go back here to my tables. If we open up support, this is our support agent report sync table. And we can see that it's already in Lakebase, almost instantly. And we have our application here. This is in our Databricks app, so you can go back to your compute, then hit Apps, and we're in our support console that I actually live coded.
And here we can see that there's state in the agent. We can see that we have history. We're opening the request from Cameron.
We can see the raw support message. We can see that they needed help with a recent order, and the agent said, "We'll look into this."
This isn't really a great response, and we can use Lakebase and the lakehouse to regenerate the agent's response. We can advise it on how to improve its message, and it will take that into consideration and improve it.
We also have ways to reply, and you can type refund and all that. We can also, if we wanted to, add a branch. So, I'm going to create a new branch and show you how easy it is to do. So, let's do sub test.
We can choose when we want to automatically delete this branch, so you're not paying that extra cost. You can include the data from a parent branch up to this moment, as well as past data.
And we get our own connection string for that branch. This is basically live coding with Cursor, and we can utilize branching with Cursor. You can see that I gave it a prompt, and it calls the API and uses branching to create a new branch for new security measures.











