Agentic memory systems enable AI agents to maintain persistent context across sessions by storing and retrieving relevant information from external databases, solving the problem of context window limitations where agents lose track of important information. These systems use hybrid search combining exact-match keyword searches with vectorized semantic representations to recall relevant memories when contextually appropriate, allowing agents to remember directives, procedures, and facts without consuming token space in every conversation. The implementation involves background processes that continuously ingest and categorize memories, enabling agents to recall specific past decisions, API behaviors, and workflow patterns that would otherwise be lost between sessions.
Deep Dive
Agentic Memory and GitLab 19.0
Hello and welcome back to the developer show. I am Fatima here with my fantastic co-host Colleen.
>> Hey. Hey. I'm the aforementioned Colleen and thank you so much for joining us. If you're new here, this is our monthly live stream where we hang out with developers, talk about what's happening in the GitLab and the AI DevSec Ops world and dig into workflows, tools, and ideas that are shaping the way that we build.
Last month, we talked a lot about agents and skills. But this month, we wanted to stay on that thread, but go one layer deeper uh and cover what happens when agent workflows start to actually remember things and whether that memory is useful in practice or not.
Yeah. So, we'll cover agent memory.
We'll talk about highlights from GitLab 19.0. Colleen and I have some AI highlights from the industry that we want to share with you. And then we have a special guest, Gregory Havena, who will take a deeper look at building out memory modules with OpenCode workflows and persistent context, and at his day-to-day developer experience. So stick around for that, because he has a presentation prepared and it's going to be really cool.
>> Yeah. But first thing first, we're going to venture a little bit outside of the world of GitLab and talk about some trends that we've been seeing all over in tech lately.
>> Absolutely. So, first up, one trend that I have been seeing, and it's not fair to say that it's new. I think it's been a couple of weeks now, is that people are asking LLMs to create HTML uh artifacts instead of markdown artifacts. And I really like it because HTML really allows you to click around, test, and visualize something as opposed to like a lot of text formatted in a really pretty way.
>> Yes. Uh it's way more fun to be able to play around with it. And I feel like more people are realizing that the output format really changes what you're able to do with it. Uh because if you ask for markdown, you're getting something to read, but if you ask for HTML, you get something that you can actually interact with.
Yeah, exactly. Like you can A/B test something. So if you wanted to figure out which of two ideas was better, you can ask whatever LLM you're using, say Claude, to build you the A version and the B version, then click around and see what those different versions tangibly feel like. And so I think implementation plan.md was yesterday, and today it's implementation.html.
>> Yes. It's super fun because you get to have more of a creative dev role there.
Uh, and it's really enjoyable.
>> Yeah, it's the year of personalized dashboards. The second thing I was excited about, Colleen, is I saw a bunch of threads on X where people were saying that open models are bad at tool calling. And there are all these debates now: is it that they're bad at tool calling, or is it more of a harness issue, about how you've built the harness, as opposed to the model itself and how good or bad it is at reasoning?
>> Yeah, because a lot of the failures aren't even huge reasoning breakdowns. They're really small formatting problems: for example, sending null instead of omitting a field, stringifying an array, or passing a string where an array was expected.
>> And it's kind of funny, because in a lot of these cases the model is quite close to what it was supposed to do. But the developer might have put a very strict contract on the harness, and that's what creates these mistakes here and there.
>> Exactly. And the thread that you sent me, uh, the point that it made was basically if the same like four failures keep repeating, then maybe it's not that the model can't use tools, maybe the answer is your harness needs a repair layer and also you need to fix the error handling.
>> Yeah. Yeah. The biggest takeaway for me was that a lot of people are quick to say, ah, the model quality is really poor. But I really think it's what you're saying, where it's your system design that is quite poor. And if you design more catch-alls into your system, with your harness and your contracts, your repair logic, your defaults, your error handling, then you might find that the experience goes from being really frustrating to being really smart, with the same model, essentially.
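The repair layer discussed above can be sketched roughly as follows. This is a minimal illustration, not the approach from the thread: the schema format, the `search_issues`-style field names, and both function names are assumptions made up for the example. It covers the three near-miss failures mentioned (null instead of an omitted field, a stringified array, a bare string where an array was expected) and returns readable errors for anything it cannot repair.

```python
import json
from typing import Any

# Hypothetical schema for one tool: field name -> expected Python type.
SEARCH_ISSUES_SCHEMA = {"query": str, "labels": list, "assignee": str}


def repair_tool_args(args: dict[str, Any], schema: dict[str, type]) -> dict[str, Any]:
    """Normalize near-miss tool arguments before strict validation."""
    repaired: dict[str, Any] = {}
    for name, value in args.items():
        # Failure 1: null was sent instead of omitting the field.
        if value is None:
            continue
        if schema.get(name) is list and isinstance(value, str):
            # Failure 2: an array that arrived as a JSON string, e.g. '["bug"]'.
            try:
                parsed = json.loads(value)
            except json.JSONDecodeError:
                parsed = None
            # Failure 3: a bare string where an array was expected.
            value = parsed if isinstance(parsed, list) else [value]
        repaired[name] = value
    return repaired


def validate(args: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return readable errors that can be sent back to the model."""
    errors = []
    for name, value in args.items():
        if name not in schema:
            errors.append(f"unknown field: {name}")
        elif not isinstance(value, schema[name]):
            expected = schema[name].__name__
            errors.append(f"{name}: expected {expected}, got {type(value).__name__}")
    return errors
```

Running `repair_tool_args` before `validate` keeps the strict contract in place while absorbing the handful of formatting slips that repeat; anything still invalid comes back as a message the model can act on rather than a silent failure.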
>> Yes.
And the third highlight that I want to talk about is something that I've seen a little bit online, but really experienced a ton of last week when I was at Open Source Summit in Minnesota. And that is: what do we actually do about giant agent-authored pull requests? You get these huge, huge pull requests, and what are open-source maintainers supposed to do with them?
>> Yeah, I was just looking it up: Roselle had an article on X about this with a great title, something like "agent PRs are not the death of PRs." But I agree with you. I think maintainers are getting really slammed with these huge generated PRs, and I saw some projects saying, "we're not taking agentic PRs anymore." So it's been interesting to see where people land and what the for and against arguments are, because I do want it to be accessible for people to contribute. But I also feel a lot of empathy, as a former maintainer, that dealing with 300 new pull requests that are all sort of doing the same thing is quite tricky.
>> Right. And I don't think that the framing needs to be "AI bad, ban it." I think it's, "hey, we need to figure out a way that works for our workflow," because a lot of the maintainers are usually volunteers and they can only do so much. We want to make sure that the code is still held to the same standard and that it's still possible for people to maintain.
>> Yeah, absolutely.
>> Yeah. If contributors can ship it really really fast, but the reviewers need to have uh an extra 24 hours added to their day to be able to get anything done, then that is so painful.
>> Yeah. Yeah. So I hear you. I think a lot of the things that we've pulled together today are sort of: your system design needs work, your PR contribution flow design needs work. And I think that's part of the larger story.
>> Well, this is still so new, and that's kind of the amazing thing: everything is still evolving. So we're at the point where it can be "this thing needs work," and then you put in the work and you make it great, instead of "oh well, this has just always been annoying." No, we are at the inflection point, and that means we're at the best point to make a difference in it.
>> Yeah. Speaking of the inflection point, Jishnu has a comment in the chat that I really like. He's pointing to Karpathy's LLM wiki model of having a second brain system. And speaking of system design, this is something that I was testing out on the weekend, and it's also a highlight. My oversight for not bringing it in. Thank you, Jishnu. I think having a second brain is now a very viable thing, because you can have an agent, depending on how many tokens you have access to, just running on your Obsidian database and organizing things for you, putting things in the right place, running on your to-do list, running on your calendar and your email.
Um, and so these have become really viable uh as pieces of things that you can now manufacture. Is that the word?
>> Construct. I mean, those are synonyms.
>> Construct and interact with. And so I find that instead of organizing my second brain in a specific format and spending a lot of time on what's most productive, I just spend time on inputting things, and I let the agents decide how best to organize and surface those things, which is just the surface level of Karpathy's LLM brain. But we're getting there.
>> See, I know what you're talking about with second brain, but I cannot get the mental picture out of my head from Young Frankenstein, with just the brain on the table. So every time you say second brain, I'm just thinking of a mad scientist with a brain on the table next to him.
>> Yeah, we had another comment come in uh from Stfan on LinkedIn. Uh, misunderstanding uses for markdown and markup is the trend then. Yeah, absolutely. I think a lot of people rely on markdown right now and um they're starting to learn that perhaps there are other ways to visualize information that are a little bit easier uh if you're working with a lot of you know like multiple things at multiple times and so agreed.
>> Uh so that's all we have for our highlights. I think one thing that I'm looking at that I think you know if our audience has comments on and and we can talk about next month is like how does the internet start to change as we start to formulate and construct things around agent UX and this is something that's top of mind for me. So like you know are we going to change the ways that we have websites? Are we going to allow agents to crawl our websites in certain ways?
Um and so yeah.
>> Yeah. How does the internet or how will the internet look in a few years? And if anyone watching has any questions or topics that they really really want us to discuss next time, drop them in the chat or you can also find us on LinkedIn and send us some ideas. We always love to hear from you.
>> But moving on, before we go deeper on memory and workflows, let's zoom on out and look at a few GitLab 19.0 highlights.
>> That's right. So the first one that I want to talk about is custom review instructions for GitLab Duo. Originally we had the instructions at the project level, but now you can have shared, consistent review instructions across a group and its subgroups. So a lot of our customers will now be able to standardize these review workflows across the entire group or subgroup, which is really exciting.
>> Oh, I should maybe pull up I should pull up the release notes, shouldn't I?
I'll pull that up while you give us your highlight.
>> Yes, that one does feel very, very practical. And on the planning side, configurable work item types (I know how to talk) are really nice too. With those, teams aren't limited to just issues and tasks anymore. You can create or rename types like user story, bug, or maintenance, and pair those with custom fields and lifecycle support. So it's a lot more flexible if you want GitLab to reflect how your team actually works.
>> Yeah, absolutely. We also have some changes with Duo in 19.0. Agentic Core is now moving to usage-based billing, finally. And Duo Core users, instead of having the old chat, now get access to the agentic chat that runs on Duo Agent Platform. So you'll also get to experience the cool ways that it'll open issues for you and create MRs. Very exciting.
>> Yes. And for teams thinking about secure delivery workflows, GitLab Secrets Manager is now available in open beta for Premium and Ultimate customers on GitLab.com and GitLab Self-Managed. That means project and group owners can store, retrieve, and reference CI/CD secrets directly in GitLab, with access scoped to jobs that explicitly request them.
>> I feel like we take feedback really well from our security engineer users.
Next up, some model updates. Claude Opus 4.7 is now available on DAP. This is really great for those long, complex, multi-step tasks: for example, preparing scripts for multiple demo features that are still in beta. I find Opus is really great for managing things like that, and for managing multiple issues in an epic and tracking those things. So I'm excited for everyone to get to use that with Duo Agent Platform.
>> Yes. And if you're self-hosting, GitLab Duo Agent Platform now supports some extra open source models as well, including Devstral 2 123B, GLM 5.1 FP8, and others.
>> Yeah, I left that one for you just to make it difficult for you.
>> You I feel like they wrote those specifically to trip me up because that is so fun to read aloud. Oh, so fun.
>> And then for the nicest part of it, uh, we have our notable contributor of the month.
>> Drum roll, please.
>> I still need sound effects. Norman Dval, a level three contributor, with more than 40 merged improvements across GitLab since joining in 2022.
>> Huge thank you to Norman. His contributions span CI/CD, security, Duo, service desk, and core user experience, and uh they clearly come from real world experience. He's making GitLab better for everyone.
>> Thank you so much, Norman. So, overall 19.0, a nice mix of workflow improvements, planning, flexibility, some platform changes, and model updates. Uh, and so that was really exciting. Thanks for that deep dive.
Okay, I'm gonna stop sharing. I know that Jishnu is now on X as well, sharing links in the chat to all of the things that we just discussed. So if you're looking for any of that 19.0 feature release information, you can find it on both LinkedIn and X if you scroll through the chats. Thank you so much, Jishnu. MVP.
>> As always, we love you. You're the best.
>> The dev show MVP.
Okay, now for our featured presentation.
We have a guest from GitLab who's been working on some really cool agentic workflows: Gregory Havena, staff backend engineer in security infrastructure, who previously also led the vulnerabilities across context project. Gregory, thanks so much for agreeing to be here after our donut chat in the "I want to talk to your agent" channel.
>> It's an absolute pleasure. I mean, it's what a what a delight to have a conversation about absolutely everything except the thing you were supposed to be talking about.
>> Story of my life.
>> Yes.
>> Um, by the way, >> is also part of our job.
>> Maybe not yours.
>> True.
>> Just a correction. I'm still working on that vulnerability across.
>> You're still working on the vulnerabilities across context.
>> Correction. That's the big work at the moment: trying to enable vulnerability tracking for all branches in a project instead of just the default one, a long-desired feature. So that's pretty exciting stuff. We just went into closed beta a week or two ago.
>> Exciting.
>> Testing out the scalability of that. So, um yeah, exciting times.
>> I hear you've also been working on some daily workflow automation with OpenCode. Is that anything you'd want to talk about today?
>> I mean, pro probably. I think that's >> I heard he has a slide deck.
>> Oh, I a slide deck.
>> We act like we don't prepare for these things.
>> Yeah, I'm acting as if this is not in my notes. So, is there any chance you want to talk about this thing we talked about you talking about?
>> Yeah. So that was kind of the interesting part of Fatima's and my donut chat. These days, with the AI revolution, one minute I'm working on making vulnerabilities referenceable across all branches, and then I'm moonlighting half the time trying to make all my AI sessions more efficient, so that I can spend less time doing my job and more time making my AI more efficient. I'm sure my manager is busy freaking out somewhere right now.
>> I don't watch the live stream.
>> Manager of Gregory, if you are watching, hit the heart icon repeatedly in this video.
>> Yeah. Yeah. Uh manager of Gregory, if you're watching, uh this is wonderful.
And if you think otherwise, um you might want to like disappear for the next 10 minutes. Thank you, Gregory's manager.
>> Well, I'm sure he's kind of using these. Well, some of it anyway. So maybe it helps him too. Who knows? It's interesting, actually, that link to the Singaporean diplomat using that sort of second brain feature, because it's very similar to what I've been working on, which is a memory module for, in this case, OpenCode. But I've actually been working to make it more generically available; it doesn't need to be strictly linked to OpenCode as an agent tool. In theory, GitLab's Duo Chat can be hooked onto it, since it uses the MCP protocol. But for some of the features I need to do some more work. Big work in progress. But if you want...
>> As all agentic workflows are right now. I think if you're using more than one agent in any sort of way, or connecting more than two tools at this point, it is horizon-level work. So don't feel the need to put that disclaimer on anything. I think our audience is always very curious about these types of works in progress.
>> Everything is out of date tomorrow.
Yeah, >> we live in the future.
>> Is it really for you? Cuz it's more like three hours for me.
>> Oh dear.
>> Anyways, um if you want we can, you know, using my nice AI generated presentation.
>> Go ahead and share your presentation.
And I'll just remind the audience that if you have any questions as Gregory is speaking, feel free to drop them in the chat and then we'll start pulling them up as well. Like we have some questions that we've prepared, but we would really love to interact with the audience as we do this.
>> Yeah, Gregory, if you can zoom in maybe once.
>> That's a little tough.
>> Yes, that is way better.
>> That looking good. Okay, cool.
>> That's it.
>> The joys of HTML. Wasn't this what you were talking about this earlier?
But yeah, so essentially what I've built is, I called it OpenCode memory, but as mentioned, the OpenCode restriction is not really true.
To give you some context: obviously, working through all these sessions, context is kind of the key item. I went through quite an evolution of trying to use these tools and then building up a whole body of procedures and directives for how I would work on problems. Previously the strategy was using AGENTS.md, which I'm sure everyone is familiar with. But what I eventually found is that I had an AGENTS.md so massive that it was taking up something like 20% of my context window before it even started working. That's not ideal. It's kind of expensive, and if not all of that context is applicable, then that becomes a problem as well. So I started breaking that out into smaller files, and then had this idea that there needs to be a better way for this to work. That's where this OpenCode memory module came about. I wanted a more intelligent way for the AI to recall useful context when it was contextually relevant. So the nice little example here: without the module, I say "fix the auth bug we discussed" and it asks "what auth bug?", whereas with the module I can say "we discussed it" and it knows which one I'm talking about. I'm not going to rehash the exact things that I've put in these slides. But repeated mistakes are definitely something you don't want to have to deal with. You figure out how to use something correctly. Maybe there's an API in GitLab, for example, that works a little bit weird or doesn't return pagination results in the way you expect.
I had a particular case where the MCP tool for fetching system notes off of an issue wouldn't return the pagination results, because those are in the headers of the REST API. So the AI had no idea that there were additional pages, and it would just say "I fetched everything," which was not true and was really annoying. Whereas the GraphQL API would return the next-page information in context. So I just made a little directive that said: go straight to the GraphQL API and skip the tool. That solves the issue here.
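The difference between the two pagination styles can be sketched like this. It is an illustrative sketch, not the MCP tool's code: the function names and the injected `fetch_page` callable are assumptions. What it relies on is that GitLab's REST API reports the next page in the `X-Next-Page` response header (empty on the last page), while GraphQL connections report `pageInfo { hasNextPage endCursor }` in the response body. A wrapper that returns only the body of a REST response drops the header, and the caller cannot tell that more pages exist.

```python
from typing import Callable, Optional

# A page is (items, next_cursor); next_cursor is None on the last page.
Page = tuple[list, Optional[str]]


def next_cursor_from_rest_headers(headers: dict) -> Optional[str]:
    """REST style: the next page number lives in a response header."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return lowered.get("x-next-page", "") or None


def next_cursor_from_graphql(page_info: dict) -> Optional[str]:
    """GraphQL style: paging state is part of the response body."""
    return page_info.get("endCursor") if page_info.get("hasNextPage") else None


def fetch_all(fetch_page: Callable[[Optional[str]], Page], max_pages: int = 50) -> list:
    """Follow cursors until the source reports no further page.

    fetch_page is a hypothetical callable supplied by the caller; it takes
    a cursor (None for the first page) and returns (items, next_cursor).
    max_pages is a safety cap so a bad cursor cannot loop forever.
    """
    items: list = []
    cursor: Optional[str] = None
    for _ in range(max_pages):
        page_items, cursor = fetch_page(cursor)
        items.extend(page_items)
        if cursor is None:
            break
    return items
```

The design point is that whichever API is used, the cursor has to reach the loop. If the tool layer discards it, "I fetched everything" is the only conclusion the agent can draw.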
>> Um, you know, but you don't want to have to repeatedly um tell it about things.
So again, AGENTS.md solves that problem, but you don't want to have that same context loaded into your token window every single time.
>> That's a lot to keep pushing into each new chat that you're starting, especially if it's quite long, or if it's about different projects and you're polluting some of your context window by pushing that in every time.
>> Yeah.
>> It slows everything down and it's annoying. We can't discount the annoying factor.
>> And it's expensive, right? Like, tokens.
>> And expensive.
>> Especially now on Opus 4.7, you know.
>> What do you mean, expensive? I'm always token maxing. Just kidding.
>> If my manager is watching this...
>> Well, suffice to say, some days I'm glad that, using the GitLab provider for OpenCode, I can't see what my tokens cost.
>> Wait till OpenCode starts doing what Codex does and gives you a token summary at the end of the chat.
>> Oh no.
>> Um >> it's a very much moment of truth moment.
So yeah, in the interest of not going on forever, the idea that I had there was to build this background MCP server that would start storing all of these memories, categorized: things like your directives, which tell it how it must behave, and procedures, things that you commonly do. Do you have a merge request review procedure, or how do you go through your to-dos on a daily basis? I've been using OpenCode to do that, which has been quite an interesting evolution in my workflow. I don't really open the GitLab UI very much anymore, because using OpenCode and these kinds of tools, the context just comes to me, and that's been really great for overall efficiency and not having to think about what I should be working on right now. It's just, oh, this is the most important thing right now.
So yes, essentially this background server is there, and that's what your agent is able to talk to. It does two searches across two DBs. You've got the full-text search on SQLite looking for exact matches, as well as LanceDB, which holds vectorized representations of the semantic meaning of your memories. That way, if your memory doesn't contain the precise words of what you're talking about, it's able to use the proximity to find out what you were talking about. There are various configurations you can make for the proximity in that regard.
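The exact-match half of that design can be sketched with Python's built-in `sqlite3` module and SQLite's FTS5 extension. This is a minimal sketch, not the module's actual schema: the `memories` table, its two columns, and the function names are assumptions, and it presumes the local SQLite build includes FTS5. The vector half (LanceDB) is not shown here.

```python
import sqlite3


def open_memory_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create a full-text index over categorized memories (hypothetical schema)."""
    conn = sqlite3.connect(path)
    # 'category' is stored alongside each memory but excluded from matching.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS memories "
        "USING fts5(category UNINDEXED, content)"
    )
    return conn


def remember(conn: sqlite3.Connection, category: str, content: str) -> None:
    """Store one memory under a category such as 'directive' or 'procedure'."""
    conn.execute(
        "INSERT INTO memories (category, content) VALUES (?, ?)",
        (category, content),
    )
    conn.commit()


def keyword_search(conn: sqlite3.Connection, query: str, limit: int = 5) -> list:
    """Return (category, content) rows containing any query word, best first."""
    words = query.split()
    if not words:
        return []
    # Quote each word so punctuation is not parsed as FTS5 query syntax.
    match = " OR ".join('"' + word.replace('"', '""') + '"' for word in words)
    rows = conn.execute(
        "SELECT category, content FROM memories "
        "WHERE memories MATCH ? ORDER BY rank LIMIT ?",
        (match, limit),
    )
    return rows.fetchall()
```

A search like `keyword_search(conn, "pagination")` only finds memories that literally contain the word, which is exactly the gap the semantic search is there to cover.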
And then those results get merged in a sort of hybrid weighting pattern, where if it finds it in both DBs then it must be a really strong match, and that's the most relevant match to whatever you were talking about. That ultimately gets passed to the AI along with your message, so it knows what you're talking about and says, okay, let me go deal with that. The things I didn't mention here are the OpenCode coding agent over here, which I mention separately because of, explicitly, the OpenCode DB here and the daemon worker. I didn't like that I had to continually teach it: hey, you should remember this directive. You should remember that we did this thing.
You should remember this fact. I wanted it to be more implicit. So one of the evolutions was that I built a background daemon for this MCP server that reaches into the OpenCode SQLite database and is continuously, in the background, ingesting the conversations you have and asking: hey, is there a useful directive or fact that we might want to remember later?
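The shape of such a background ingester can be sketched as a polling loop with a high-water mark. Everything specific here is an assumption for illustration: the `messages(id, text)` table stands in for whatever the OpenCode database actually looks like, and `extract_memory` is a crude keyword heuristic standing in for the background AI agent described in the talk.

```python
import sqlite3
import time
from typing import Callable, Optional


def extract_memory(text: str) -> Optional[tuple[str, str]]:
    """Stand-in for the extraction agent: flag 'always'/'never' statements."""
    if text.lower().startswith(("always ", "never ")):
        return ("directive", text)
    return None


def ingest_once(
    source: sqlite3.Connection,
    last_id: int,
    store: Callable[[str, str], None],
) -> int:
    """Process messages newer than last_id; return the new high-water mark."""
    rows = source.execute(
        "SELECT id, text FROM messages WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    for message_id, text in rows:
        memory = extract_memory(text)
        if memory is not None:
            store(*memory)  # store(category, content)
        last_id = message_id
    return last_id


def run_daemon(
    source: sqlite3.Connection,
    store: Callable[[str, str], None],
    poll_seconds: float = 30.0,
    max_cycles: Optional[int] = None,
) -> None:
    """Poll the conversation database; max_cycles=None means run until stopped."""
    last_id = 0
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        last_id = ingest_once(source, last_id, store)
        cycles += 1
        time.sleep(poll_seconds)
```

Tracking the last processed id means each conversation turn is examined once, which matters when the extraction step is itself a model call that costs tokens.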
And this is where it gets interesting, because I'm kind of running another AI agent in the background, trying to extract useful information from past conversations to make future conversations more efficient. So that's a fascinating little piece of work. I haven't got that working for other clients yet, because I don't know what their DB structures look like or if I have access to them, but it works for OpenCode for now. So yes, I kind of already explained this. This is, you know, a well-practiced presentation. But these are the two sorts of searches and how it weights and scores them: exact word matches versus words that mean the same thing but aren't exactly the same. So for "token validation" and "JWT expiry," it registers that those are very similar in nature. Moving on from that, I mentioned using the vector store. This is just a very simplified introduction to the concept. As you can see, it's a simplified 2D projection of a 384-dimensional vector space. So if you've ever seen XYZ graphs, just imagine more.
>> I love the star.
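The weighting of exact and semantic matches described above can be sketched as a weighted sum over the two result sets. The talk does not give the actual formula or weights, so the 0.4 and 0.6 defaults, the memory ids, and the assumption that both searches return scores normalized to the range 0 to 1 are all illustrative.

```python
def hybrid_merge(
    keyword_hits: dict[str, float],
    semantic_hits: dict[str, float],
    keyword_weight: float = 0.4,
    semantic_weight: float = 0.6,
    limit: int = 5,
) -> list[tuple[str, float]]:
    """Merge two {memory_id: score} maps into one ranked list.

    A memory found by only one search contributes 0.0 for the other,
    so a memory found by both searches naturally ranks higher.
    """
    combined = {}
    for memory_id in keyword_hits.keys() | semantic_hits.keys():
        combined[memory_id] = (
            keyword_weight * keyword_hits.get(memory_id, 0.0)
            + semantic_weight * semantic_hits.get(memory_id, 0.0)
        )
    # Highest score first; ties are broken by id for a stable order.
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]
```

With these weights, a memory that scores 1.0 on keywords and 0.9 semantically lands at 0.94, well ahead of one that only the semantic search found at 0.8 (which lands at 0.48). That is the "found in both, so it must be a strong match" behavior in numeric form.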
>> Um >> just so you know.
>> Yeah, >> if you were wondering.
>> Yeah, if you were curious.
>> I do like that the query of token issue is so varied but that also makes a lot of sense. Yeah, it it get tells you that like there's multiple memories that are in that sort of proximity.
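The "proximity" in that picture is usually measured with cosine similarity between embedding vectors. The sketch below uses made-up three-dimensional vectors in place of real 384-dimensional embeddings, and the `min_similarity` threshold is an assumed stand-in for the proximity configuration mentioned earlier; none of the numbers come from the presentation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_memories(
    query_vec: list[float],
    memory_vecs: dict[str, list[float]],
    min_similarity: float = 0.7,
    limit: int = 3,
) -> list[tuple[str, float]]:
    """Return memories whose vectors point in nearly the same direction."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in memory_vecs.items()
    ]
    close = [(name, score) for name, score in scored if score >= min_similarity]
    return sorted(close, key=lambda item: -item[1])[:limit]
```

A query such as "token issue" can sit close to several memories at once (token refresh, session timeout), which is why one vague query pulls back more than one candidate.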
>> I am wondering, what is it that you do that so many of your queries are about things like index strategy, token refresh, session timeout, OAuth statement?
Yeah, this is some scary stuff. Sorry, please continue. Yeah, >> yeah, please.
>> I was thinking about what mine would look like. Yeah. Yeah, please go.
>> Oh, no. I don't know if I want to know that. It's This is This is off. This This is like made up stuff, by the way.
Like um >> Oh, so this is not your actual vector store. Okay. I was wondering like >> I thought you were being really transparent with us.
>> Note to self next time. Include real work. Okay.
>> Speaking of including real work, we do have a couple of questions for you. So maybe this is a good chance for us to pull some of the questions from the chat. We had one earlier: as agents become long-running and context-aware across sessions, will memory governance become a competitive moat that enterprises build internally, or will it get abstracted away by AI providers the same way cloud storage did to on-prem?
>> It's a loaded question.
>> That's a very interesting one. A competitive moat enterprises build internally, yeah.
>> Do you have an opinion on this? You could also say, you know, "I plead the fifth, I don't have an opinion on this," but I do think it's an interesting question.
>> Yeah, we it's one of those things where I don't think it's possible to know the exact answer, but we can speculate.
>> Yeah, I think they're asking for speculation.
>> I think it's interesting because I actually put forward a proposal internally at GitLab on that subject recently. All of this that I'm discussing here is open source, MIT licensed. I'm sure someone is probably looking at this thinking, thank you for using GitLab's AI to build something for free for people.
But the direction that GitLab is moving towards now is being, how was it phrased? Context as a service, right?
The idea is to provide a lot of information to the AI in a very contextually appropriate way, so that it knows what to use and gathers the data effectively.
Because the more time it has to spend re-gathering and recontextualizing information, the more you lose in token efficiency and time, and you're not doing meaningful work. It's in the public project. GitLab Orbit, for example, is that idea of indexing the entire project and being able to call down to specific functions and pieces of your code through that graph navigation aspect, and provide that context really efficiently to the AI. And so my thinking here was: I love this local memory system that I've built. It's been incredibly useful to me, and various colleagues have actually picked it up and been guinea pigs for me and told me how useful it's been for them, which is great. But I was thinking, what if we built a memory module into GitLab that becomes your memory context for a project? It knows things like how the project is written, the decisions people made, and why they made them. And if we hooked up the system to memory recall from GitLab, then it might be able to say, "Oh, well, John the other day decided not to build it that way, for such-and-such a reason." I think that might be incredibly empowering.
That would be really cool.
>> Yeah, >> that'd be really cool. And I think like you know going back to that person's question, I think what you're saying is we could build it internally because of the way we build things. You'll you'll get to see that happen in process. But as you were speaking, I remembered I did meet a few people um at an event recently where they were working at companies that are building this as an enterprise tool. Um and so I think the answer to the speculation is both. Um, but it does depend on the vision of the company. Like Gregory said, like we could build it into GitLab, but that experience would be very different if like a private startup went and built like a memory module across different tools, which is happening.
>> Yeah, >> there was >> it's not going to look the same where wherever we go.
>> Yeah.
>> There's another question from Akash Kumar. There we go. "Do you believe future cybersecurity systems will need AI agents defending against other AI agents in real time, especially when persistent memory and autonomous decision-making become mainstream?" I'm laughing because I imagine like a video game in my mind. We're just giving Gregory all the hard questions today. I really love this.
>> Yes, >> it's not always like this. And you can opt out of questions, just so you know.
>> Yeah, the audience might mock you, but you can opt out.
>> That's fine. I mock myself regularly anyways.
It's part of being a software developer.
But, you know, since we've had this great question, I need to go play Cyberpunk 2077 again. Feels cool. But no, I can entirely believe it. I'll be honest with you: one of the fascinating things I've realized in this journey of continually working on tools and making my own AI agents work for me more efficiently is that there was a moment where, you know, I'd bought another big widescreen. I've got four screens in front of me here so that I can run multiple agents side by side, doing all kinds of pieces of work. And I realized one day that the bottleneck is me,
>> Yeah.
>> Which is a fascinating statement, and that has never been the... I was asked to zoom in again there, and then it did an interesting thing. Well, I hope that works there.
>> A little bit more. Just a smidge. >> Yes. >> There we go.
>> Yeah, this page was not designed for this zoom level, sorry, but it doesn't matter anyways. The point is there's a lot. So, the bottleneck is me. It was this fascinating realization, and so I started moving in a different direction: how can I start moving things to the background?
How can I start moving in that direction with this? And so you might even see some reference to... >> This is what excites me about harness engineering, not to take us down a different rabbit hole, but it's about how you move yourself out of the bottleneck that you're currently in.
Manav calls this "how do you move above the loop," in marketing terms, and I think it's really cool, because once you discover where you are the bottleneck, that is the component you should build, and then you monitor that loop. Harness engineering, I feel, has really helped us do that in so many ways.
>> I guess this is basically my harness, right? Because there's a lot of... >> All of your MCP tools.
>> They're kind of built in as a result of the things that I'm doing. I'll describe a bit of the process as well, but you'll notice over here I've got these reminders and procedures, right? That ties into the concept of background functionality: how can I move things off so that I don't have to be watching the whole time? And there's a little bit of trust you need to allow in the AI, like, did it actually do what I asked? Because now you're not actively watching the stream, saying, "No, you're doing something silly, please go investigate that way, not what you're doing now." But essentially I've built a little workflow MCP of habits, procedures I do, like a deep review every single time I make an MR, or another person does. >> I asked for this one to be pulled up because you just said you couldn't trust whether the agent was doing what you expected, and Patrick said, "Hey, these agents can lie."
>> Trust nothing.
>> Trust nothing. Confirm everything. One of my directives in there, from the AGENTS.md days already, was: anytime you make a claim about a dysfunctional API, or about something being a known issue, I want evidence. Find me the issue. And nine times out of 10 it's like, "I made that up. I'm sorry."
>> Oh yeah. I think we've all been there. Looks like we've got another really good question from Akash, and that is: "I'm currently developing my own AI system with memory persistence and adaptive behavior. From your perspective, what architectural decisions are most critical to keep autonomous AI memory both scalable and secure against context poisoning or retrieval drift?" We promise to only give you the easy ones.
>> That's a good question.
>> These are some of the hardest questions.
>> I know. Oh, yeah. No, I don't know if my sarcasm was as apparent as it should have been. You've gotten the hardest audience questions, specifically.
>> We need to give you an award.
>> Oh, yeah. We'll just call it the Gregory.
>> Yeah. So these are... >> I don't know if your manager is still watching. Maybe a permanent slot? But sorry, go ahead, answer the question.
>> No worries. These are problems even I'm figuring out on the fly here. Half the things I'm talking about in this presentation didn't exist two weeks ago, right? So I'm flying through this as quickly as you are. Most critical in terms of scalability and security... it's a fascinating question. The engineering principles we're familiar with from software development still apply to what we're doing now. The number of times I was building the AI tools with the AI and it would build them in peculiar ways... For example, I wanted background operations, and it built a loop into the MCP. This is why I have that background daemon. It built a loop into the MCP, and then you would have multiple sessions both trying to use the memory module, one would become a blocking action, and all the other MCPs trying to use the memory module would start timing out. You're like, "No, no. That's not how we build scalable software. That doesn't work."
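The blocking-loop problem described here can be illustrated with a small sketch. It is not Gregory's implementation: `MemoryIngestor` and its methods are hypothetical names, and a thread inside one process stands in for the separate background daemon he mentions. The idea shown is only that tool handlers enqueue work and return at once, while a single worker owns the slow loop.

```python
import queue
import threading
import time


class MemoryIngestor:
    """Hypothetical ingestion service: one background thread owns the slow loop."""

    def __init__(self):
        self._pending = queue.Queue()
        self._lock = threading.Lock()
        self._stored = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, text):
        """Called by a tool handler; it only enqueues, so it returns immediately."""
        self._pending.put(text)

    def snapshot(self):
        """Return a copy of everything ingested so far."""
        with self._lock:
            return list(self._stored)

    def close(self):
        """Let the worker drain the queue, then stop it."""
        self._pending.put(None)  # sentinel, processed after all earlier items
        self._worker.join()

    def _run(self):
        while True:
            text = self._pending.get()
            if text is None:
                break
            time.sleep(0.01)  # stand-in for slow work such as embedding or categorizing
            with self._lock:
                self._stored.append(text)


if __name__ == "__main__":
    ingestor = MemoryIngestor()
    for note in ("decision: use gateway refresh", "fact: login returns 401"):
        ingestor.submit(note)  # does not wait for ingestion to finish
    ingestor.close()
    print(ingestor.snapshot())
```

With several sessions in separate processes, the same separation would need an out-of-process daemon and some form of inter-process queue, which this sketch does not attempt.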
So I think the key point is that your core underlying software development principles still apply, and thinking about the design of how you're doing this is probably what will make them scalable into the future. >> I know that's a bit of a vague answer, but I think that's the best one I've got in terms of scalability. Security-wise, gosh. I'm maybe a bit of a cowboy. I'm sure someone on the security teams inside of GitLab, if they're watching this, is going to be like, "I see a target." But they do amazing jobs. >> I got incredibly tired of permitting every single action out of the AI, so I give it a bit more free rein. I let it execute some commands, everything.
>> You know, I have to admit, I too use "always allow" sometimes.
>> It's a trade-off. It is honestly a trade-off between security and speed, right? Security is never convenient.
>> And so you end up in a situation where you need to decide how you want to put on your reins in order to make sure that something bad doesn't happen. And I think that is... >> Or think about what's the worst that can happen in this particular application, and is that worst quite painful? A lot of times, if I'm doing demo throwaway applications, then the worst is not very damaging, and so it's okay for me to give it more permissions than security would like.
But I think if I was working on something more sensitive, like production systems, that would be tricky. Yeah, I think it's interesting, because this doesn't fall purely on us in our agentic systems. It goes back to core AI development, the actual LLM development, where they're trying to make an LLM do mischievous, bad activities. The solution that I've seen a lot of them discussing is the antagonistic LLM that watches the first LLM and asks: is it doing what it's supposed to be doing? Is it following the goal alignment? Is it getting up to mischief? Is it running scripts unsafely? And I think that is probably how you would scale security ultimately.
That's probably the only way you can do it meaningfully. It's somewhat similar to something I'll explain further in the presentation, the interjection into the workflow of the AI.
You need something that can interject and ask, "Is there actually appropriate safety?" And I'm almost positive it's going to end up becoming that Cyberpunk 2077 arms race of the AI.
>> This one's trying to do better than that one.
>> So I'm sure it's out of date by tomorrow, as I joked earlier.
>> Yeah. But I think the thing you mentioned about the adversarial agent is something that even I do, even with plan mode, where one agent will make the plan and a sub-agent will do an adversarial plan check, or a refactor, but adversarially.
So I think that it's current, but yeah, like you said, it might be out of date.
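The "one agent acts, a second agent checks" pattern discussed above can be sketched as a gate that every proposed command must pass before it runs. This is an assumption-heavy illustration: in the conversation the reviewer is a second LLM, whereas here a hard-coded deny-list stands in for it, and `RISKY_PROGRAMS`, `review_command`, and `run_plan` are hypothetical names.

```python
import shlex
from dataclasses import dataclass

# Hypothetical deny-list; a real reviewer would be a second model with its own prompt.
RISKY_PROGRAMS = {"rm", "curl", "wget", "sudo", "chmod"}


@dataclass
class Verdict:
    allowed: bool
    reason: str


def review_command(command):
    """Stand-in adversarial reviewer: object to anything it cannot vouch for."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return Verdict(False, "could not parse command")
    if not tokens:
        return Verdict(False, "empty command")
    program = tokens[0]
    if program in RISKY_PROGRAMS:
        return Verdict(False, f"'{program}' needs human approval")
    return Verdict(True, "no objection")


def run_plan(plan, execute):
    """Execute only the steps the reviewer allows; return the blocked ones with reasons."""
    blocked = []
    for command in plan:
        verdict = review_command(command)
        if verdict.allowed:
            execute(command)
        else:
            blocked.append((command, verdict.reason))
    return blocked


if __name__ == "__main__":
    executed = []
    held = run_plan(["git status", "rm -rf build", "ls -la"], executed.append)
    print("executed:", executed)
    print("held for review:", held)
```

The design point is the interjection itself: the acting agent never calls `execute` directly, so the check cannot be skipped by the agent being reviewed.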
When do you think the speed at which development moves will slow down, if ever? Because right now, yeah, so many things are out of date by tomorrow. Compared with the industry six months ago, your workflow is completely unrecognizable. Do you think this will keep going forever? Do you think it'll get even faster, going hour to hour? Or do you think that after a hyper-warp period it will go back to being out of date in a week or a month?
>> I think we will probably hit... plateau is maybe not the right word, but I think there will come a slight slowdown. Right now we're in the wild west of the AI digital era, right? Nobody's been here before. Sometimes you build things thinking it's a completely unique idea, and then it turns out someone else has already made a package called open code memory and you've got to call yours open code semantic memory.
I don't know who that was. It wasn't me.
>> You don't remember. It's out of your memory.
>> It's good to know other people are trying to solve the same problems I'm solving, right? I probably should have Googled first.
But I think mine's become very GitLab-centric over time anyways, and very personalized. So, you know, it's going to become the next JavaScript; there's 21 standards. But what I think will happen, and I have this conversation often, is that we're going to hit a kind of plateau. You're going to hit a point where you can make the AI do bigger and better things, but it has an intrinsic cost, and people also have their intrinsic cost. Right now AI is pushing the value of code generation down so much that, okay, maybe we don't need as many developers, we can generate more code, we can push things through. But to get those higher-level context things, you need to run the AI more and make it think more. Thinking mode was the big agentic breakthrough, and that was essentially where we got to. But that's costing you more tokens, and so you hit a breaking point where people are now the cheaper part of this conversation, and until we bring LLM costs further down, you're going to match those things up together. I think that's going to be the bottleneck for pushing things forward, essentially.
Right. So, in the interest of time, realizing we were chatting...
>> Yeah, let's let's get a few more of your slides.
>> Yes, >> they're kind of just demonstrations here. So, you know, we probably won't sit on each one too long here, but it's kind of this, you know, the first iteration was the memory recall. You know, memories have been ingested. Okay, please try this out. It's nice and quick because it's all cached in memory locally. Um, so that that interjection of memories into your um your chat is actually like quite nice and quick. And so, you know, you fix the orth bug and then it says, "All right, well, I need to go and see what I recall." And it does those two searches, merges them all together, and bam, it not it gets those this sort of different contextual memories. It maybe a decision that was made or a fact. Um, you have different categories.
>> And then later on I found I was not satisfied, because the AI still had to remember what to recall, and the problem there is that it had to know there was something worth remembering, which is a fascinating new conversation. Essentially you had to load it with a whole bunch of things early on, AGENTS.md style almost, to say, "Hey, there's things worth remembering." I was not satisfied. So I said, can we hook into the workflow system here, and when you interact with the AI, take that interaction, dissolve it down into keywords, search for memories, and then inject that into the conversation as you're having it, so that the AI knows there's things worth remembering in that exact moment? The new workflow feels a lot more natural. There's no, "Oh, let me recall what I know about this." You just say, "Fix the auth bug," and within there it extracts context from that and embeds it into the interaction.
>> Okay. Injects it into the system prompt.
Okay.
>> Exactly. And then the LLM now just happens to know, and that conversation feels incredibly natural.
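The automatic injection flow described above (reduce the user's message to keywords, search memories, add the hits to the system prompt) might look roughly like this. All names are hypothetical, the keyword extraction is a naive stopword filter, and the real hook point inside the agent tool is not shown in the talk, so this covers only the prompt-building step.

```python
STOPWORDS = {"a", "an", "the", "is", "are", "by", "to", "of", "and", "please"}


def extract_keywords(message):
    """Dissolve a message down to lowercase keywords, dropping stopwords."""
    words = (w.strip(".,:!?").lower() for w in message.split())
    return [w for w in words if w and w not in STOPWORDS]


def find_memories(keywords, memories, limit=3):
    """Return up to `limit` memories sharing at least one keyword, best overlap first."""
    wanted = set(keywords)
    scored = []
    for memory in memories:
        overlap = len(wanted & set(extract_keywords(memory)))
        if overlap:
            scored.append((overlap, memory))
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps insertion order on ties
    return [memory for _, memory in scored[:limit]]


def build_system_prompt(base_prompt, user_message, memories):
    """Inject matching memories so the model 'just happens to know' them."""
    recalled = find_memories(extract_keywords(user_message), memories)
    if not recalled:
        return base_prompt
    lines = "\n".join(f"- {memory}" for memory in recalled)
    return f"{base_prompt}\n\nRelevant memories:\n{lines}"


if __name__ == "__main__":
    stored = [
        "Decision: auth tokens are refreshed by the gateway",
        "Procedure: run the deep review before every MR",
    ]
    print(build_system_prompt("You are a coding agent.", "Fix the auth bug", stored))
```

Because the lookup runs on every message, the model no longer has to decide for itself that something is worth recalling, which is the dissatisfaction the earlier explicit-recall version left open.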
>> Some other things... oh, I did not zoom in this much when I made this page. Sorry.
>> No worries.
>> Other things in here, like a knowledge graph. I kept trying to make that memory system more and more useful. When you start up open code in that directory, part of its boot initialization is that it will actually go and index the entire project into the memory graph for semantic memory. So now when you talk about a function or a class, that is one of the memories that can come up. It knows exactly where that file is, and it might have linked memories that say this is why that class was written, or this is the context of how it works.
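A rough idea of what "index the entire project" could mean for a Python codebase is sketched below. The talk does not describe the actual indexer or graph format, so this is an assumption: it uses the standard `ast` module to record where each function and class lives, and a plain dictionary with a `linked_memories` list stands in for the memory graph.

```python
import ast
from pathlib import Path


def index_project(root):
    """Map each function or class name to its kind, file, and line number.

    Simplification: later definitions overwrite earlier ones that share a name.
    """
    index = {}
    for path in sorted(Path(root).rglob("*.py")):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed
        for node in ast.walk(tree):
            if isinstance(node, ast.ClassDef):
                kind = "class"
            elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "function"
            else:
                continue
            index[node.name] = {
                "kind": kind,
                "file": path.relative_to(root).as_posix(),
                "line": node.lineno,
                "linked_memories": [],
            }
    return index


def link_memory(index, symbol, note):
    """Attach a 'why this exists' note to an indexed symbol."""
    index[symbol]["linked_memories"].append(note)


if __name__ == "__main__":
    project_index = index_project(".")
    for name, entry in sorted(project_index.items()):
        print(f"{entry['kind']:8} {name:30} {entry['file']}:{entry['line']}")
```

Once a symbol is in the index, a mention of its name in conversation can be resolved to a file and line, and any linked notes can be recalled along with it.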
>> Again, that speeding up of context. And so, a demonstration of a typical morning session here: I kick off my session and I say, "Hey," and the first thing it does is load in what I call a boot context, which is a whole bunch of boot gates: decisions it should make or things it should remember, blockers, reminders. Then I've got this thing called a full refresh, where I get it to fetch all my to-dos, all my issues, all the MRs that I'm busy with, get all the discussions and the notes, and tell me what's changed, what I need to work on, who's pinged me. I built a prioritization rule set, so it knows which things I want to work on in what order. What's the most important thing I need to be doing right now? It gets those and puts them all into priority order. In the background it runs this full refresh, which is kind of running a session in the background. Then it returns and says, "Well, I see you've got three MRs that need a review." By urgency, it knows we've got this one that's a P0. So I say, okay, let's work from the top. It contextualizes it from the memory, recalls the review procedure, claims the item (this is relevant for just now), and then we start working on it. That's how it brings all this context directly to me. I don't have to think about what I need to work on or which thing is the most important for me to be doing right now. That's now part of my AI work.
>> It's because you've already sort of built it into the system what your priorities and importances are. I think that goes back to what Colleen and I were saying earlier about like a well-designed system will work well.
>> That's really cool.
>> Interesting.
>> So that's honestly made working fun, because it's never like... you know, I was a tab monster, 100 tabs.
>> Yeah. I mean, even when I go to my to-dos there's like 99,000 of them, and so then it becomes the cognitive load of how do I prioritize this? It's great that you've taken the time to set up what your priorities are, and then the system works really well as a result. Did you find that at first it wasn't quite it, and then you had to adjust the different prioritizations?
>> That was surprisingly a first-time hit. I kind of just built it knowing what I wanted, and it was like, "Great, here it goes." There was a little bit more iteration, refining what I wanted. Initially it was: this is a P0, this is a P1, P2, P3. Then later on I wanted them, within those categories, oldest first, because otherwise new requests were always getting added to the top, I was never getting to things, and people were forgotten. And then there's this little extra rule that if one of those to-dos goes seven days out of date, it gets put up to P0, so nothing goes too stale. You cycle through things like that.
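The three rules just described (priority bands P0 to P3, oldest first within a band, and promotion to P0 after seven days) fit in a single sort key. This is a sketch under assumptions: `Todo` and its fields are hypothetical, and "seven days out of date" is interpreted as seven days since the item was created.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

STALE_AFTER = timedelta(days=7)  # assumed meaning of "seven days out of date"


@dataclass(frozen=True)
class Todo:
    title: str
    priority: int  # 0 is most urgent (P0), 3 is least urgent (P3)
    created_at: datetime


def effective_priority(todo, now):
    """Promote anything that has waited seven days or more to P0."""
    if now - todo.created_at >= STALE_AFTER:
        return 0
    return todo.priority


def prioritize(todos, now):
    """Order by effective priority, then oldest first within each band."""
    return sorted(todos, key=lambda t: (effective_priority(t, now), t.created_at))


if __name__ == "__main__":
    now = datetime(2026, 6, 1, 9, 0)
    queue = [
        Todo("new P1 review", 1, now - timedelta(hours=1)),
        Todo("old P1 review", 1, now - timedelta(days=3)),
        Todo("stale P3 question", 3, now - timedelta(days=8)),
        Todo("fresh P0 incident", 0, now - timedelta(hours=2)),
    ]
    for todo in prioritize(queue, now):
        print(todo.title)
```

Sorting oldest first inside each band is what stops new requests from permanently pushing older ones down, and the promotion rule bounds how long any item can wait.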
>> And so, one of the last items here (we're nearly at the end): I talked about having multiple sessions running together, and they would step on each other's toes. They would open the same to-dos, they'd work on open MRs, they'd mix the code together, and it was complete madness, absolutely unhelpful. So one of the other things I built into this whole system is a session claiming system. Here you can sort of see the representation in the background of that conversation we just had. Over here I say, "Let's work on the next item from the priority queue." It already has the context injected, and when it tries to get the item it sees, "Oh, that's already claimed. I can't do that one." So it goes and claims the next item, and we start working on that. Now I can have two open code sessions and just work on those, and I don't have to tell them what the other one's busy with. Don't do that. Don't touch that.
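One way to get the "that's already claimed, take the next one" behavior across separate sessions is an atomic claim file per work item. This is an assumed mechanism, not the one shown in the demo: the directory layout, item IDs, and function names are hypothetical. It relies on `os.open` with `O_CREAT | O_EXCL`, which fails if the file already exists, so only one session can win a given claim.

```python
import os
from pathlib import Path


def try_claim(claims_dir, item_id, session_id):
    """Atomically claim an item; return False if another session already holds it."""
    claims = Path(claims_dir)
    claims.mkdir(parents=True, exist_ok=True)
    try:
        fd = os.open(claims / f"{item_id}.claim",
                     os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(session_id)  # record the owner for debugging
    return True


def claim_next(claims_dir, priority_queue, session_id):
    """Walk the queue in priority order and claim the first unclaimed item."""
    for item_id in priority_queue:
        if try_claim(claims_dir, item_id, session_id):
            return item_id
    return None


def release(claims_dir, item_id):
    """Drop a claim once the work is finished or abandoned."""
    (Path(claims_dir) / f"{item_id}.claim").unlink(missing_ok=True)


if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as tmp:
        work = ["mr-101", "mr-102"]
        print(claim_next(tmp, work, "session-a"))
        print(claim_next(tmp, work, "session-b"))
```

A production version would also need stale-claim cleanup for sessions that crash without calling `release`, which this sketch leaves out.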
Don't break things. So that's really helpful as well. And yeah, this is just how simple it is to set up. I've got a little install script, but as always, review scripts. Please don't just trust my agents. >> Don't trust me. All right.
Don't trust anyone. Don't trust me.
>> Review the script first. That's just good practice these days. But you can pip install or whatever, and then configuring the MCP is as simple as adding it to your open code setup. Plugin-wise, that's why we've got memory at the moment, but yeah, that's how you get it all mixed in and get the active project injection.
>> We had a compliment for you: "A really interesting point about memory and context happening in real time. It's a real loss in token usage when you're deep in a session and the LLM stops remembering, so great thought on solving that. Hashtag priorities and importances, 100%." So you have some fans in the audience with a bunch of questions. >> I don't think we'll have time to address all of the open questions right now, just because we do have a time limit, but we have some great ones in here. >> Yeah, what I can do is send you a link, and you can maybe respond to them on LinkedIn, if you don't mind. We just got two really good questions in there that are asking for your opinion.
>> And I would also say any questions we don't answer this time, we can dive a little bit more into them in future episodes. Uh however, >> I mean, if Gregory is so popular.
>> Yeah. Now, I'd say we wouldn't be able to do it next month, because we have something else going on next month, which leads us into the last thing we wanted to talk about, I believe. >> That's right. Before we wrap up, we wanted to give you a big preview of what's coming next month. But we can have Gregory back on the show the month after, if you would like. Colleen and I will be in London on June 10th, and GitLab Transcend is shaping up to be a really exciting event for anyone, like everyone in our chat today honestly, who is seriously thinking about how agentic AI is changing the way we build software.
Yes. It should be extremely cool, and not just because we'll be in London. The theme is intelligent orchestration: what it takes to build with more context, more automation, and more confidence across the software life cycle. We'll have some live demos. It'll be a slightly different format than we usually have, and by that I mean so many more live demos, and instead of one guest, we'll have three.
>> So, please join us on June 10th, uh, and tune in for that.
>> Yeah, if you're in London, you can join us in person, or you can join us virtually. We'll drop a link to register in the chat. Join and you'll get to see Colleen live on the developer show from the show floor as well, so I'm really excited about that. I will be behind the scenes cheering you on, Colleen.
And then to wrap up our show today: we talked a little about 19.0, we talked about why memory is so important, agentic workflows, and we got a deep dive into Gregory's open code and persistent context in real work. There's a comment in the chat that says, "Gregory is that guy. What can I say?" And I think that just means we've got to book you back in.
>> Yes.
>> I don't know if you're looking at the chat, but we've got lots of thank yous for you.
>> Yes. This has been fantastic. I know I learned a lot and I think our audience did too.
>> Greg, I think you wanted to do a "hi mom" now.
>> It was already the last slot there, but yeah, hi mom.
>> I think now it's bye mom.
>> Just before the stream here, I was throwing links to friends, like, "Guys, I'm gonna be on the stream."
>> And the stream is available on LinkedIn, YouTube, and uh X after the show. So the video will live on so you can feel free to share that with friends and family.
That's all we've got for this episode of the developer show. Thank you so much, everyone, for being here. Thank you, Gregory, for all of your expertise. Thank you, chat, for bringing the hard questions. And we'll see you next month, live from London.
>> Thanks for hanging out with us. And if you have any questions or future topics, drop them in the chat.
Bye.
>> Bye.