Large Language Models (LLMs) do not have built-in memory; every API call starts with complete amnesia. To create conversational AI with memory, developers must maintain a conversation history list containing messages with roles (system, user, assistant) and pass this entire history to the model with each API call. The system role defines instructions and personality, user roles represent human input, and assistant roles store the model's previous responses. This creates the illusion of memory by providing context in every request, enabling chatbots to remember previous interactions and maintain coherent conversations.
Deep Dive
Prerequisite Knowledge
- No data available.
Where to go next
- No data available.
Deep Dive
E02 -AI Agents (Messages, Roles, and Conversation)Added:
Everybody thinks that chat GP remember conversation like human. It doesn't. In fact, every API call start with complete amnesia. Today I will probe that in code. Show you what it what why it's happen and then build a real real chat with memory in less than 25 lines of code. And once you understand this, you will finally understand how every chatboard actually works under the hood.
Hey everyone, welcome back. My name is Nathan Kumar. In the last episode, we made our first open AAI call. One question in one question out. But real AI applications are not one sort questions. You want conversation back and forth like chatgity.
So the question become how do we give the memory to our model? Let's uh see through the code and the demo.
So in the uh so quick recap in last episode uh we g we gave one in one one question one string to the model one input and model things one API call and then one output one spring simple but the conversations are different you ask something model replies then you ask a follow-up question so let's try that let's try through the uh where we left into our code so let's write the code uh in in this one we imported our open library client create open client let's uh create a response how we can how we are going to call the openi so we first do the let's say we print our first open area All we create it. We get response one = to client dot responses create and what are the things where we are passing to the model. First we'll pass the model name that is our uh GPT GP5 mini that we uh even imported the last one GP5 mini and what we are we are going to input input is our question hi let's just ask okay hi uh my name is Nin Nice to meet you.
Nice to meet you. What model will reply?
Let's see if I fire this query um to the model and if I just run it first it's happening the first open a call and let's just print it what it's get response. So response one dot output text and then just print it. Let's see how it say hey this is h nice to meet you too. I am Chad GB how can I now okay we uh we got an output perfect the model replies normally. Now let's do the second API call uh response response to uh client uh client dot responses. So we need to focus where we are where we are what instances what object we are calling it and then create uh we first define the what we are calling it. Let's just copy it from top make it faster.
So and uh let's ask the question. So if I ask what is my name? If I ask this question and then let's just print it how it respond. So I'm expecting it will uh give my name different. Just print it.
Oh, now there's a shocking part.
Just told it what is my name and literally 2 seconds ago. It's literally 2 seconds back.
This is most important mental model uh in AI API.
Every API call is independent. OpenAI server do not remember your previous request. Think it think of it like a genius stranger with Amnesia. They can answer anything like anything everything but they forgot the previous conversation like on calls immediately.
Now let's let's understand what is the problem. So we we know uh what is happening internally. So we called our first API my name it's replied nice to meet you model replies very normally.
But when I asked just right right after uh my what is my name? He said I don't know your name and just told it. So by this thing we we got it we got an idea the model the API we are calling has no memory no stored conversation nothing.
Now let's let's move to the how we can tackle with these things. Uh so then how if you think like okay but in chat GP I I I I see the thing it's reply it's reply it's reply what I'm talking it's get a context and then uh give me the answer.
So it's it's simple. We send the conversation back every single time.
That's the trick. We don't we don't make any such sort of a different API call or something. No, we create uh this as a conversation into sort of a a list that I will show you in the code and then send back to the openi model and ask.
Okay. So now this is this is the thing.
Okay. As as we are doing in the practice. So let's let's uh run this in our notebook. How is that work? So clear now Python I would say let me check what is the my uh okay 01. So Python 01 how it's working with no memory. uh first time calling my call one telling the model my name and say hi Nathan nice to meet you how can I help you as as we expected and um if I ask it again so I don't know so that is the expected but just to get more uh feeling of writing a code production protection level so I'm running everything in note and then running in uh terminal so just for get uh better okay uh better better understanding and better comfort okay so as of now how we are going to pass our conversation every time to the model.
That's the question. And for this thing we have a very important concept is messages. Messages and their role. So what are we are going to do? So if you can see there are three things system, user assistant. Uh so to do this we stop sending a single string instead we send a list of messages and every message has a role. So okay first is a system role.
So system will like okay you are a helpful helpful tutor you are a scientist or you are a education or you you are you are some uh some scientist or you are some data engineer or something you gave a rule the system your instruction to the model and this control models personality behavior tone rules second one is the user user message so how the this is the sort of human speaking so what is the human like I typically ask okay what is my name what is 2 + 2 so that's the Second uh sort of a thing what is the human speaking. Third one is the assistant role. This is the model's previous reply and this part is super important. So this is how the model remember what it said earlier. So now let's let's jump to the demo and understand is this part. Uh we have this model already defined. Uh let's go to the my code two and uh how we are going to add the memory to our model. So we already created the client.
Let's let's first build the sort of a conversation how we are going to pass it. So conversation will be a list as I mentioned. So I'll create it sort of a dictionary and then uh I will first assign the role. Let's uh make this coffee. So we are I'll pass two roles uh just for uh better.
Okay. So my role and uh first one is like a we define a system role to our and uh how we define it we we pass the content and what is the content we define what it's you are suppose you are like a you are a friendly tutor brief for optimizing the prompts. we have B.
Okay. Now let's define our second role that we said.
So second will be our let's say our user. Second role will be defined as a user and then what will be the content.
So content and and content will be uh hi my name is Nathan. Let's pass the same message that the same um idea or the same. Okay. Now I I I created this conversation. This is my conversation created. So now let's let's pass this conversation to our model.
Okay. Just for better thing. Uh let's take the response and we'll create a response instead of it. and uh instead of input uh a single question, I will pass the conversation. Okay, we we call the same model, we call the response.
And now let's see how I'm getting it. So if I just run it and let's print it uh print same response three let's because we are getting response three. Hi Nathan, nice to meet you. I'm Chad GBT your friend.
How can I help you today? It's it's till now pretty well pretty simple as expected it's working now it's uh what I what I got the reply so let's just put into our reply new variable create a new variable reply what I got put into a reply variable and now how I going to append this my reply to my the conversation so I'll create uh conversation dot append and now most important part I need to give my model the context what is this reply from where it's a human user system or what so before that I I need to get okay user user set what nothing let's say user is my assistant and what is my content content is my that is the that is my uh assistant reply uh content um content and what is the content is the reply that I got from the my the response from the ele now I appended same conversation to my system so okay now if I see this conversation um okay let's print this conversation if I print it okay now system you are friendly today be brief my name that's a reply I got and uh that was the content so I append it same user is assistant and This is this was the content as user reply and this is the answer I got from the as a from the assistant. So if you see role is first system user and assistant. So in first I defined the role second I asked the question that is a user and third is my assistant that I got that I just defined. Now let's ask the more more question around around it.
So okay now same again we append our conversation conversation.append and uh what we are going to first so we need to say a dictionary and uh if I am going to ask the question so my role will be the user because I'm going to ask the next question from GPT and same what will be the content so content okay let's ask what is my name now what is my name and uh okay if I ask the same thing what is my name and I created I I current uh created this addit this uh new sort of a message to my conversation and if I now again call this uh response so let's create the response for and call it because I append okay so it's something we missed in our format so let's see where we missed it conversation append and there is a rule resistance variable.
Let's just see why we are messing it up. Messing it up.
Looking perfect for me.
If I run it, bad request error record message invalid value supported assistant system developer. But I think I have typed something else but it's just system assistant and content.
Okay, let me just maybe content assistant.
Oh, okay. That's a mess. Instead of role, I define the user. Sorry, my bad.
Uh, now add the same thing conversation.
So, system system user assistant user system assistant. And if I run it, so I'm getting response for still a bad request.
Okay, maybe the user assistant and then content is this. Okay, so let me just figure it out. Um, okay, let's just create from the start.
So, okay, I have this already copied.
Instead of wasting time, let's do this.
So, what is my conversation?
Conversation is a system and prompt. So just do it system and role and role is system and the user and if I run it let's uh just get the response and append this uh reply to my the conversation what I got here reply one as assistant okay I have I'm hoping getting the same format okay got it and now ask the same question Um what is my name? Append it in the same one and okay so now I got this one role user content username. Okay now if I fire this one okay maybe I'm I was messing up with some format and uh if I give if I check now the what is the response I'm getting? Let's just let's just print it.
Okay. And let's say if I print it response 4 dot output text your name is anything. Now now this is the expected thing that that that we thought it would understand how how what what we discussed earlier in the in the conversation how it's recalling how it's recalling it because we are passing all text as a conversation to the model.
Okay. So now now we get an idea. We got an idea. Okay. We sort we send hold the conversation. Uh it's it's more like a uh it seems like enter conversation as a fresh input to the every call. Your program remembers uh by keeping the list around it's just doing it. Okay. Now let's run to the uh let's run by this uh terminal. Why I'm doing it? You will get an idea why I'm doing this by terminal.
It will be very important. So it should it should reply in the as expected format that we got.
Yeah, it's it's reply same conversation with history and your name. Now now it's come to how the memory actually works. So in turn one we pass you are a tutor and and I I like I I message as I'm Nathan and then uh it it in turn two remember you are a tutor user in answer assistant hi Nathan and your name and then time three that's a whole list keeps growing every time and every API uh call reads the entire conversation from this so it's not actually memory it's the illusion of the memory that we are doing it so okay now we we we are good with okay we are we are running with one memory one one one one things one LM calls that's fine but as we as we uh discussed ear how are chatboard going to reply because I I'm I'm not going to add this conversation data into my system and ask okay what is your name and everything now let's make it more uh more better system so again let's uh create the conversation uh so we just pull it from uh previous one so it's sort of the same same format so conversation role is the system and you are a helpful assistant who keeps our reply sort uh no more than sentence unless user asks for some detail just to keep it more compact. Okay. So now we create a sort of a while loop because we we are going to we are want the continuous answer. So we create the answer loop and then it get a user message uh let's say something uh user message equal to input of something you set whatever you are going to say and first uh you ask something from as a input to the model and if uh note user message then continue otherwise uh my bad otherwise it will again messes us mess up uh let's say uh if uh note the user message continue else what it do if uh so but first it's it's it's important to add the uh precondition. So we'll add if user is saying uh quit exit or uh something let's say even the buy uh let's say user even say the buy uh it should it should stop and how we are going to append the conversation so we have the already the message conversation. So what we will do uh we add the conversation role is the user and the content is the message users the because we I am asking the question I have not uh initiated the lm call and now I I have already added this into my conversation now add this to pass this conversation to my lm model that we are using it uh from the start and whatever reply I got I I will keep into my new variable as a reply that is we already discussed response dot output text that will I will get apply. Now most important part how I'm going to handle this reply into my the existing conversation. So again same part I will append this conversation into my uh this reply into my conversation list and then role will be the assistant because this is the reply I got from my system. So this is the assistant and content will be the reply and I I can I can I can say okay print uh reply something I can I can just print it uh reply. So this is sort of a very simple loop. Now it will not give a very good intuition here. Let's go to the terminal how it will talk uh in very natural format. So I have the same code um into my terminal. Let's clear it and get 03. So it is the same uh this conversation while loop and printing reply. So now let's run it. it will give me a very good a realistic scenario how I'm going to use it. So initially I asked okay uh I am a tutor okay I I I I gave something my contact I'm in tutor Nathan Hello Nathan nice to meet you how can I help you today it asked me uh and okay What is memory in um llm? I ask this simple question and it's it's it's going uh continuously uh it's it's not uh it's not just stopping it and reply. So it give memory first how information retain and all. So it's it can help. Okay. Uh what is so talking me as a more more realistic manner. So it's not just ting one reply and all. So we have created sort of a uh chat vote. So it's it's giving what I'm I'm asking question and uh so now ask what was my job role?
Let's ask let's ask in the conversation if I ask it maybe we can do some this this can fix it. do introduce yourself as AI agent tutor role is conversion a user playing an AI tutor who speaks speaks who ask about the LM memory it's it's first get an idea uh what I defined earlier and then based upon my conversation all the convers I ask two questions so it's it's get a summary and give me okay you are uh sort of user playing an AI tuto role and who speaks about the memory and LLM that's the two question I have asked because of this it gave me like this okay now I'm going to put very simple I just type a buy and it it will quit uh goodbye take care and okay so instead of buy I think it's taking literally by as a string uh so okay I I click Q and then it stop uh so it it worked pretty well now uh let's get an idea so today by this demo and the and and this API calls what we have understood is uh the model has no No inbuilt memory. No built-in memory. Uh conversation are just list of messages.
Roles define who said what. Memories created by sending conversation history every call. So this the single idea power chat GPT cloud AI and customer support whatever you are you have listened the editing thing. Okay. So in this in this episode we we will uh in in in the in this episode we have learned this concept. So in the next we will go more deeper. Uh so if this finally made AI memory click for you subscribe and continue with us series and if your code along with these demos you will understand AI system faster than 95% of people learning AI ch AI today. See you on the next episode.
Related Videos
OpenHuman VS Hermes AI: Who Wins?
JulianGoldieSEO
285 viewsβ’2026-05-29
Long-Running Agents β Build an Agent That Never Forgets with Google ADK
suryakunju
142 viewsβ’2026-05-30
5 Mind Blowing Omni Uses Cases
PaulJLipsky
1K viewsβ’2026-06-02
This computer is made from real human brain cells. And you can buy it.
Talktmsmedia
3K viewsβ’2026-05-28
BREAKING: Microsoftβs New Image Generating Model Beat Out GPT 1.5 and Nano Banana 2
aimmediahouse
122 viewsβ’2026-06-03
I Made the Same Anime Fight Scene in Every AI Video Generator
NobleGooseAnime
295 viewsβ’2026-05-30
Nvidia Bets Big On AI PCs | New Chip To Power Windows Laptops | Technology | AI Updates | N18S
cnnnews18
3K viewsβ’2026-06-01
I Tested NEW Opus 4.8 on Four Projects (Updated LLM Leaderboard)
AICodingDaily
298 viewsβ’2026-05-29











