Installieren Sie unsere Erweiterung an, um sofort in jedem Video zu suchen

Build Claude Code From Scratch - Full Course
Hinzugefügt: 2026-05-14

178 Aufrufe271:00:03vukrosicOriginalveröffentlichung: 2026-05-07

This course provides a rare, high-signal deconstruction of agentic workflows, moving beyond simple API calls into the rigorous systems design required for autonomous software engineering. It effectively bridges the gap between being a mere consumer of AI tools and mastering the underlying architecture of modern coding assistants.

[00:00:00]Welcome to build claude code from scratch full course. There are 15 lessons. You can check the GitHub repository below and we will go from everywhere from building the coding agent planning repo mapping uh reading files bash commands everything. By the end you're going to understand how cloud code works and you're going to have a working instance of cloud code that can run. This can also be used as your portfolio project in the school community. We will also have a four-week project where we will build all of this together and we will all help each other. And so this is the first week task. You can check below the video and we will build it all together in the school community. Let's code claude code from scratch. I'm just going to go to the repository this repository on GitHub and uh click the first link the first lesson. So this is week one lesson.

[00:00:56]First let's understand the workflow of coding agent. So when we're building cloud code this is what we're going to do here. So there will be user message user request. Then second step is to inspect uh the current repository context all of the code.

[00:01:13]Then it will choose relevant commands and tools. Run one agent turn with those uh tools with all of this knowledge.

[00:01:21]Store the transcript. This is just the history. So just store the history of the tool usage, thinking messages, everything and then report what happened to the user or keep looping. So this is the standard cloud code loop.

[00:01:35]If you want full open source equivalent of cloud code, you can check this claw code ultra worker/cl this repository 190,000 stars. It's in rust but the logic is same. So every turn needs more than just a prompt. It needs uh these things. So user request repo context tool surface a transcript of what happened and what's going to be saved for later and result uh the answer the patch the code written and all of the thinking process is kept. So we don't delete like intermediate thinking process. We keep it always. That's what LLMs do. It showed to be better to keep the thinking process for performance.

[00:02:24]The first useful coding agent question is what does the agent need to know before it edits anything. So these are the core ideas that are used. So prompt for example user said sends read me and explain the repository.

[00:02:43]So command tool turn transcript we already explained those. If we take a look at the claw code if you run these commands we can see that there are some files like the CLI entry point is the main.py runtime route routes prompts and bootstrap sessions. There is command spy that loads commands inventory and tools.py that loads tool inventory.

[00:03:08]These are examples of how you can structure the repository. Let's see some examples here. So we have you can write these if you are building this tool. So you can say uh class tool name of the tool and description. So uh then we have route it has prompt and match tools turn result and transcript of the path.

[00:03:31]So this is good beginning to have. So uh turns can be appended and it can replay the previous things. So this transcript will just return all of the previous turns, all of the previous data. So after the agent edits files, runs tests and does all of the work. This is all stored in the transcripts as history. So now let's add some tool inventory. For example, you can do tools equals to and then just instantiate this class. Give it names and descriptions. So read file, edit file, run test, and describe.

[00:04:09]And then we need to select build a router that's going to select one of these tools. So given the prompt which tools are relevant based on the name and description. So one of those tools will be selected by the large language models themselves. This is a mock example. So let's say the message is or or the prompt is read uh read me and the model might generate that it needs like read me file and then the path to read me or if the prompt is run test the model might generate uh this tool. So the model is choosing this is just mock function. So it's just a quick function that you can use to quickly uh build around it without needing to always actually call the LLM because this is a lot faster here. So model returns structured intent. Runtime decides whether that intent is allowed because models are uncertain. They may make mistakes. So you do want to have a small validator. So if the tool call is none, it will not work. So this function is called validate tool call and we have let's say allow tool names and then it might be like unknown tool or tool requires path or or requires some command etc. And by the way guys you may actually even build this cloud code with cloud code with codex. So maybe you don't I don't think you need to code this from scratch. Here we are uh more focusing on like the system and the design without uh knowing without needing to learn how to like code everything from scratch. So but build from scratch instead later validation will include permissions, sandbox rules, path checks and human approval. So let's see run one agent turn. So the class is coding agent.

[00:06:04]We're going to initiate a transcript that that's going to keep all the history. So at the beginning of the turn, we're going to append user prompt to transcript. Then uh let's see, we're going to uh simulate LLM call and then we will check if there is error or there is a tool call or there is message returned by the model. It can be either of those and at the end we will just append all of this to the transcript. Now we did not execute anything yet. We just appended. So we just got the result from the model maybe a tool call or prompt and then just appended it. We did not call the tool yet. Now if you have this you may uh try three different prompts and see if it works.

[00:06:52]Let's understand stop reasons. So every turn should say why it stopped so completed. Maybe the model just returns some answer without needing to use a tool or requested tool or uh maybe we have some error with the tool call here or blocked for some other reason maybe not enough API credits or something else. So you can have multiple different or different u stop reasons here. So you may put all of this uh code into agent.py pi one file and this is what it can contain if you're building this. So don't spend too much time. Listen, just um make sure that you finish. Don't overwhelm yourself. Don't do everything.

[00:07:36]Make sure that you do this thing well here the first lesson.

[00:07:41]And here uh this is the checklist that uh you may check. If you didn't if you cannot do all of them, it's still okay.

[00:07:49]Do as much as you can. And if you are in school community, you may submit uh what you did here and we will uh review it and give some advice. So let's take a look at the first important thing building the repository context. How do we make the coding agent take context of all of the tools and understand the repository your code that it's working with? So the model chooses tools better when the runtime gives it a repository context. So, we're going to uh create a tiny repo mapper that will scan the project, ignore noisy folders, identify likely entry points and tests and uh render a short summary. Okay, so it needs to first know where the repository is. What top level folder folders exist? Which files look like entry points that it can understand better about repository? Which files look like tests that it can test the repository or new code? And how large is the repo? So these are the main questions it should be able to summarize. To build the repo map, we're going to use data class and path. Data class lets us create small structured records. You will see later like the data and path gives us safer file system paths than raw strings so we don't get accidental bugs. So uh create the class repo map and it's going to have a path to the root folder top level this top level of files total files etc. So coding agent should not waste context and time on cache folders, dependency for folders, g internals. So we can actually create a list of things it should ignore like.git.

[00:09:41]Nvy cache etc. So these are this if you are doing coding you will recognize these as just some random useless things useless for you but useful for the uh computer for the execution of the program. that you don't need these, you don't touch these.

[00:09:58]So we can create a helper function for ignoring files. You can do it in this way.

[00:10:05]So if um anything in so okay so first I need to explain what path.parts are. So if you have some path to main.py it will split every single one. So repo source main it will all be split and if any of these any of these is one of these that we are ignoring then it will ignore the whole thing. So for example, if this is inv ignore this everything after for example this config file will be ignored because it's inside of g uh folder but here repo cloud code agent pi this will not be ignored because there is no any ignored uh folders within this path. Now let's write the path scanner the file scanner. So this is the code but let's let's start from this line first. So this is just an empty list of files.

[00:11:11]So we will just take a list of all of the files every all of the files in this repository. Everything except the files that are inside of ignored folders. So this is what this function does. So it returns all of the files. So path to each file except all of the ones that are ignored. So here ignored.

[00:11:32]We also want to see quickly top level files, the most outside files within the repo. So those are like readme and some folder names. Those folder names are usually describing what the content of the folder is. So top level files are very general usually. So here in this function, we're going to have directories and files. Those will be uh just list of strings, their names, and they will start as empty.

[00:12:00]And so we will just uh check all of them only the top level. So not go inside of the folder just top level inside the root. And uh if they are in ignore if the name is ignored we will continue.

[00:12:15]And if the child if it's directory we will append to directories or files.

[00:12:22]So root.

[00:12:25]Will only look at direct children, not the whole tree, not children of only the direct folders and files within. And sorting helps us. And we will uh return the tpples.

[00:12:38]Now let's find the entry points. Entry points are good because they are uh files that start the app CLI package.

[00:12:47]For example, we can uh create some array or object main.py, CLI, app.py. So these are likely entry point names and they are useful because they can show highle data uh and description and what the repository contains. I'm going to skip ahead a bit because this is same logic.

[00:13:06]I don't want to bore you. So this is easy to understand.

[00:13:10]So usually you may use relative paths because they are shorter and you don't need absolute paths and they are more portable it says here but they are shorter just take less context and less confusing. So after writing some screening functions for entry points as well you also want to write them for tests. You can check this file. This is the second lesson so you can check this.

[00:13:32]I don't want to like bore you too much.

[00:13:33]So let's go to the next thing. You may also just tell your cloud code codex to write screening functions for tests uh etc. Uh we want to use these quick screening functions because they are very quick. They don't need like llm to read everything and then later we will uh use the llm for more detailed exploration or etc. So you want to now put all of this information into some text and give it to the model. So uh this is how we can put it together.

[00:14:05]So it may say repo context and then uh root total files.

[00:14:11]So we just uh plug these numbers. This is very easy. So model context should be useful not endless. Don't give everything. You can just cut uh at 10 if there are like too many things of something. If you're adding like sections or too many values and stuff.

[00:14:27]So for example uh top level directories you may just want to add the first 10.

[00:14:32]you don't want to like add endless later you can add additional functionality or the AI can actually just uh list it itself by using bash commands. You can also add a small tool command tool for this uh context mapping. So this command cloud code.reo map when running Python it will just build this uh repository map. call this function and call up all of these functions and just build a whole map. I know my background is a bit crazy guys.

[00:15:05]I am in Singapore recording this video right now.

[00:15:10]So right now week two we built uh repo map.py.

[00:15:15]So it should contain all of these things. This is a quick test if you want to run.

[00:15:22]This is the checklist. Even if you like cannot do one or two of of these even three that's okay. And if you are in school, you may submit this work so we can uh review and maybe help you out. So just make a post or comment below my post. Let's look at the plan mode. So but I will actually go quickly over this because you may choose not to use this.

[00:15:44]Some people never use this. You just chat with agent and then uh but let's take a quick look. So uh here we want to say normal mode and we want to enter add functionality that enters into planning mode. So creating plan drafts and then exit plan mode and execute the plan mode. So usually this is used for larger edits, risky changes, multifile work, more complex work. So instead of just uh answering question, reading file, summarizing the repo in plan mode, it's going to inspect likely useful files, name the intended changes, name the verification step, what is it asking the user or is it recommendations? And then user might chat with it and then wait for approval or exit. So maybe user will approve the plan implementation or exit and cancel the plan. So first we will use enum and data class and we will create a class uh agent mode which can be normal and plan mode. So first we will define one step of the plan which is action, target and reason. All of them are strings. Action can be inspect, edit, test or report for example. Target can be file command area the step is about and reason is just the LM reasoning why to do this. And now we will define class for the whole plan. So request is the user's task.

[00:17:15]Summary is one sentence describing the whole plan and steps is a tpple tpple with plain plan steps in the order of execution. So user will be able to review them. So we will have this class agent state that's going to keep track of if the plan mode is currently on or off. So the mode will be normal by default and then active plan will be none or agent plan. So it can have these two types. Now you need to figure out how you're going to enter the planning mode. For example, you may uh just have some toggle that user selects or if they write uh words like refactor, redesign, migrate, plan, etc. I recommend to start with just some toggle uh that user can choose maybe. Now guys, I'm not going to go uh step by step here. I'm not going to explain this. This is if you want to uh make a plan mode.

[00:18:13]So you uh we covered the main steps. Now you may just ask your coding agent to help you or you can read this if you want. But this is not so important. So I'm going to go on to the next lesson.

[00:18:24]But here uh you can like uh read this more if you want to do this. So let's go to uh save file editing. A coding agent becomes dangerous the moment it can write files. So, LLM may request an edit, but our runtime might approve it or reject it and owns the edit. So, we're going to have two uh functions for now. Edit file, replace text inside an existing file, and a write file, which will create or or overwrite a whole file. So for editing the file, we're going to have this class fileedit that has path to the file, old string within the file that needs to be replaced and new string that will replace this all string.

[00:19:12]So replace all maybe there is multiple instances of this string. So maybe we want to replace all of them. Um maybe this is actually like you may use this or you may not use this. you may use different ways. So, uh later we will we can think more about this particular thing but when we see the use cases for example but for now let's leave it like this. So edit file does not say change line 42 it shows the text. So find this particular exact text and replace it with this other text. But if there is a lot of code or a lot of new text then we want to use file write. So this will just uh write the whole file or overwrite the whole file overwrite. So maybe we maybe there is like too much stuff to be written. So then you don't want to actually uh generate this whole string. You don't want to generate it.

[00:20:08]You will just rather generate the whole new file. If there is too much stuff to be written okay then the result should be returned that contains the file path it changed bool true false message this can be message of the AI or summary or whatever you want to put here we'll see later and difference what was changed so we'll we'll see later like what exactly you want to put here also you want to only allow the model to edit files within the root whatever the root is. So if it wants to edit files outside of the repository outside of the root you want to block it. So this is going outside you want to block it. Here you may add like more checks like do you want to uh do you not want to replace some kind of file some kind of files you don't want to like edit. You want to check if old string is not empty or if they are old string and new string are uh different and also if all string is matched more than once then maybe you want to throw an error or uh maybe find a different way to specify this. So just need to think about these uh bad or these edge cases. So then you can have apply replacement function that's going to just use this uh string.replace So it will this is just the Python function that you can apply and then you can build a difference to show to the user like what exactly which exact lines were replaced because maybe uh old text and new text have some overlapping lines that are not replaced. So you have this diff lib diff library you can use that to build a difference and you can actually make show the difference before the editing. So user can apply this edit or reject.

[00:21:58]And so uh we may have a a function that's going to actually ask the user and it's going to show the difference and ask the user for approval.

[00:22:09]You can then just uh add these tools to the runtime. Rest of this lesson is just code snippets on how to add this these tools to the runtime. So this is not so important. We are just like this is just how to code it and how to test it.

[00:22:28]So these are the things that you could do. Uh you may also submit whatever you do post it on school and we will review and help you. Next let's add a runtime guard that sits in front of tool execution. So model requests a tool runtime checks permission to execute this tool and the tool runs or gets blocked. Here is example uh read only only allow model to read uh workspace write and dangerous full access. So workspace write can uh run read write workspace tools so specific tools and danger full access can run a lot more or it may even run uh have complete control over CLI and run anything. You may also have asking user to allow disallow commands. So, but let's start with these three non-interactive modes before we implement asking user. So, all of the tools like read file, write file, edit file, bash. Uh they might have some permissions for example read only or workspace write uh workspace write and bash may need entire full access permission. So if it has full access permission, it automatically has all of these others as well. And then the permissions may be connected to some mode.

[00:23:53]So when you are when it is deciding to allow some command, you don't want to just return true or false, you also want to add uh the tool and the reason for rejecting or approving and active mode, acting permission mode and require permission mode. for example, then you need the enforcer that's going to compare active mode versus what the tool requires. For example, unknown tools, you may put them only in danger full access mode.

[00:24:27]So then you have this check for example.

[00:24:30]So that's going to be it for this lesson. You may check out the rest of the code if you want. So this is what you uh could do. And if you're building this as a portfolio project, uh, publish your piece of code in school as well.

[00:24:44]Now, let's do bash commands. We're going to create this bash tool.py.

[00:24:50]Here we will have bash command input class that has the shell command to run the exact string. optional timeout after which time if it does not return results or finish it will just throw error timeout description and uh if it can run in background or or if if you want it to run background or not and we also want bash output command that will have either some output or some error whatever it returns if it's interrupted true false uh the background task ID if it's running in the background and u maybe the interpretation of the code or something like that. We'll see here.

[00:25:31]So the code interpretation can be uh some of these like timeout taxi code 7 whatever you define. So background commands they don't wait for output. So uh LLM can just start the background command and proceed with the different task. So here we will use this sub subprocess library or module and uh this is how the background task will be defined. So subprocess pop and then pass in the parameters and at the end return our bash command output and the agent should get a task ID immediately as soon as it started so it knows which task to track. You may also uh define a foreground function with some uh seconds for which agent will wait. You also want to return time out as the output as well. If uh the command was taking too long and didn't finish in the timeout time, let's say 10 seconds.

[00:26:33]So uh this timeout will let the agent know that it was that it was just interrupted because it was taking too long for some reason. And then you can have execute bash function that will call either execute in background or foreground depending on this uh parameter here. So this is the whole tool. This is the tool that uh AI agent can use. And here I showed you some like tests. If you want to implement tests, you want to check time out. So this is not so interesting. Uh this is what you can do. And if you're building portfolio, you can just post on school this step. So after the agent uh ran the tool, we need to feed the tool results back into the loop. So model returns tool use, runtime executes the tool, runtime appends tool results, model sees tool results, model decides the next step. So let's first define this conversation loop and conversation steps. So conversation loop.py Pi and we will have text block, tool use block and tool result block.

[00:27:42]So tool use id is uh important. So it can be tracked and tool result dot tool use id is going to connect to this tool use ID. So this is how they can be connected here. So tool use ID connects to this one.

[00:28:02]Without this ID link, the model cannot reliably match results to tool calls. So now let's define different types of messages. So it can be user assistant or tool message. So these three types of messages for now.

[00:28:17]And we also have this content block that can be one of these three classes. Text block, tool use block or tool resolve block. So here we may have some helper functions just to uh create these messages with ro assistant user and then some text or blocks.

[00:28:37]So these are just functions. So you will extract the tool and you will also define a small exeutor tool. So uh the logic for executing the tools is going to be separate. Here we just have tool executor and execute uh class or function. So later the exeutor will route to read file, edit file, write file, bash etc our tools. So execute tool use function will just execute the function and return the output.

[00:29:10]The loop should also not know the model provider. It doesn't matter which LLM um we are using. So now let's build a turn loop. So run a turn.

[00:29:22]It's going to have list of messages, previous messages, user prompt, model, uh, llm, executive tool, and max iterations. We don't want it to run indefinitely.

[00:29:35]So the transcript will contain all of the previous messages. Then this for loop will keep iter iterating. As long as model is uh calling more tools, tools, tools, it's going to keep iterating until model stops calling tools.

[00:29:50]and execute each requested tool and append the result.

[00:29:55]So if the model keeps calling tools forever, you want to throw an error uh based on max iterations. So this is the full behavior here.

[00:30:06]Then we want to uh convert messages for the API. So it may only have assistant or user roles. For example, tool results may go back as user messages.

[00:30:21]But some of the APIs I think GPT has like tool call as well as a role. So that's going to be actually written here. So uh ro assistant if ro is assistant. So if the role is actually tool then else then it will be assigned to user to test the loop. You may just uh tell your coding agent to write some tests like fake models, fake exeutor transcript etc. This is the checklist you may do. And if you're building portfolio, just uh also post this on school as well. We will review it in the next lesson. Here we have testing. But uh this one is kind of too simple or I don't I I don't want to like make video about this one. It's too simple. You can just copy paste or it's easy. So let's go to the next lesson where it's more like the new feature. Let's do uh to-do lists. So when AI has some task, it's going to break it down to to-do list.

[00:31:16]did it's too long. So model can send a full to-do list and then runtime will validate it, save it, process it, etc. Let's create to-dos.py. And here we will have a to-do status that can be pending, in progress or completed. So let's create to-do item. So content is uh the task. So run tests. For example, active form when it's running, you're going to display this string which can be uh running tests and status is to-do status. So, pending, in progress or completed.

[00:31:52]So, content describes the task. Active form describes what is happening while it is active. So, the string you may display. So, this gives UI a nicer live label without making the model rewrite the task test.

[00:32:07]Then there is going to be to-dos. This is just a list of to-do items.

[00:32:14]And let's define class to-do, write output.

[00:32:18]Now, we're going to have old to-dos that are loaded and new to-dos that are sent by the model. This is this already exists in the runtime or is loaded from memory and this is sent by the model.

[00:32:31]And you need olds. So you can change you can show the difference as to what changed. For example, uh previously it was in progress but now it's completed. That's why you need to load old and new tools or to-dos I should say. Then you may have some validation like to-dos must not be empty or content must be empty or active for must not be empty etc. Then you want to store it in your local memory. So the main idea here is to validate the new list that that we got, load the old saved uh list or save uh and then save the new list or empty array if it's all done and then communicate all tools and new tools. So the runtime can say before this call uh to-do state was X after this call the model requested Y. So we are using all the new ones so we can show the difference. So actually I skipped so I skipped a bunch of stuff here like from this pop point all this is just about loading and saving. So this is not so important uh to understand and then we will actually create a tool a tool model can use uh to create to-do lists. So it has name description parameters and we're going to create some Python wrappers like we did with tools previously in this course.

[00:33:59]This is easy to understand. So you may just copy paste this into your uh coding agent to code this because it's easy to understand. Don't be afraid to also ask cloud code or coding agent to explain these things uh if you forgot or you don't understand. So let's go to the next. You may add tests uh to test to-dos if you want. So I'll just quickly scroll. This is not so important. I don't want to spend time. So this is your to-do list. And if you're building portfolio, you can just uh submit this into uh the school create a post and we will review it. Now let's figure out the search. So when your agent is trying to read files, we need to have some way of letting it search like which files might matter, which lines inside of the files they might it might change or it might read and uh which other code maybe nearby files, nearby lines it may want to check. So there are three types of search globe search.

[00:34:58]So this is example here. uh it may like okay let's say it's trying to read to-dos so this uh function for example this argument would search all of the files that have to-do name in the name at any path and that end with py for example you can try to guess this so this may return uh cloud code/todos.py pi and tests slash test_todos.py.

[00:35:29]So you see this has to-dos in the name and this has to do in the name. So it will return these. Grab search would find matching lines within file and read file would uh read some window some piece of text around matching lines. So these are two different we'll see some examples. So globe search uh what files look relevant. So it's trying to guess based on like name or some of your prompt. So it can guess like all of the files that have to do in it. So grab search is where is this symbol or text used and maybe we can read text around it. And uh read file is what is the actual code around that spot. after it maybe finds some files with to-dos. It may also use uh target functions to try to read inside of them.

[00:36:23]So the habit is start broad with file names then narrow to matching lines then read only the useful cont code window.

[00:36:31]So let's build a globe search find files by path pattern and grab search files contents with regax. Let's create a new file search tools.py pi and this is what you can start with. So globe search input is going to have some pattern and path and the pattern is going to be something like this.

[00:36:55]So and let's see globe search output is going to have number of files file names and uh truncated if it's uh maybe if the file file is too long sorry I mean uh when there is too many files returned that's that's what this is going to be like how many if a lot of files return then we will truncate this is example of pattern optional base directory for the path and truncate is true when there were more results than returned. So we return let's say 10 but there is like 50 results. So globe search would be uh used when agent knows the file shape but not the exact file. For example, we want to find tests. So inside of tests folder, we're we're going to just look for all Python um all Python files inside of test folder or find all react component or find anything with to-do in the file name. So globe search does not look inside files. It only answers with paths.

[00:38:02]Globe search can be implemented by your coding agent. So this is some example code. you need to uh define the function and the base path and pattern. So, and then uh collect files with matches and return files and you might limit it to just 100 maximum files for example and then let's do grab search. So, grab search will look inside of files and read the files not just the paths. So this is used when you know symbol word function name class name etc. So in the graph search input we're going to have a bunch of these uh properties.

[00:38:44]So pattern is the most important one. So that's uh reg x to search for and globe optional file filter like what we had previously.

[00:38:57]So output mode can be files with matches. So this will just return the file names or content. This will return the content inside files or parts of content part inside of files or count can just be how many matches it found but not not the text itself.

[00:39:17]And then let's define grap output. So another uh data class. So it's going to have number of files, file names, content, number of lines. So these uh properties then we want to uh collect those files and we want to keep this separate function because the whole workflow of grab search should be like this. So um choose base path compile reg x collect files filter files search contents limit output.

[00:39:52]So this function is going to be very simple because it gives grab search a list of files it is allowed to inspect only a list of files it's going to inspect then later filter filters decide which ones to skip. So in filters we can actually exclude files that don't have certain extension or that don't have some um string in their name etc. So there are more like features and things to add if you want. And I'm just going to scroll quickly.

[00:40:25]So you may check this out if you want to implement this in details or your coding agent may implement it. And you also want to uh set this as tools and give it to the agent as tools as well. Here are some tests you may want to use. And uh this is the checklist you may want to do and submit to school if you're there.

[00:40:49]The next lesson contains more about uh reading and searching. So read file window that was searched. Uh beginning is just uh reminding ourselves. So you may create this file uh read file.py since this is similar to search. I'm not going to teach this lesson. So you can uh read it yourself if you want. But let's go to the next one. Session transcript.

[00:41:13]So agent needs the persistent session.

[00:41:17]So uh it needs to keep track of user message, assistant message and tools. So first we will understand the transcript shape. So there is user message like a prompt assistant may require tool use.

[00:41:29]Then the tool will return some text and then that will be processed by assistant again. So we may create a session.py where we will define some roles system user assistant tool. system is just the system prompt.

[00:41:45]Uh durable instructions, these are always present. And then user assistant tool, those are changeable. Then we will define some content blocks. For example, if it's a text block or tool use block that has ID, name and input and tool result block that connects back to the tool use ID and has tool name, output of the tool and error. Here are some examples like assistant may have message tool use block and then id name and then tool uh message will return tool result block with the information. Then we will define conversation message. So we will just wrap these blocks into messages like text block uh tool use block and tool resolve block. So a conversation message will have ro and blocks. Ro is the message ro and blocks is the list of content blocks. So there is just user text message assistant message or uh tool result message. Then let's create a session. So session ID created at updated messages and path. So this is a new session for the user. This is the like a chat, one chat.

[00:43:07]And you also want to convert the blocks into JSON because it's good to keep data in JSON format. It's easier to send around and to read and to manage. And you also want to convert messages to JSON. So this is what uh it would look like like RO and then blocks may have uh type, id, name etc. These are the messages and then we want to store them in JSON L because we want to be able to append new lines without rewriting the whole JSON.

[00:43:42]So this is what the file may look like.

[00:43:45]So uh type message and then the message content is has ro assistant and then blocks may have maybe text and the text is high. So it may just keep getting appended like this. So this is user and then this is so user send hello and then assistant send hi. So it just keeps getting appended the messages the blocks. Then you're going to add some additional helper functions and functionality like save a full snapshot save uh append messages sorry here I have the code load blocks from JSON read load messages from JSON load a session from JSON L. Yeah, all of this of course it may also be connected to the turn loop in uh using these functions and interfaces.

[00:44:33]Here I have some tests if you are interested and this is the to-do list and what you can submit if you're building portfolio submit this into school and we will help you. Now let's do compact context. So more transcript more memory you know that cloud code and codeex have this functionality slash compact it will summarize the context.

[00:44:57]So let's see how we want to implement this.

[00:45:00]So the old transcript before uh compacting it's going to have many old messages and recent messages and the compacted transcript will have system summary. So it will have summary plus recent messages. So these are untouched.

[00:45:17]So the first step is to summary summarize the the old work the old messages that may not be so relevant even anymore and then keep the most recent messages as is with same text.

[00:45:30]Don't change because recent messages contain the current uh work being done and a lot useful context.

[00:45:38]So this is example uh conversation may be like user starting with build bash tool and then assistant wrote the design test pass and so on and then we have like two last messages. So those two last messages will be kept same but the system message will be the summary of all of these previous messages. So let's create a file compact.py Pi and here we will create some data class preserve recent messages four and max x estimated tokens 10,000.

[00:46:15]So max estimated tokens when older uh compactable messages are too large compact. So this will autoco compact. So after this threshold is crossed in the tokens it will uh trigger compacting automatically. So actually you don't need like a perfect perfect token count at all. You can just count characters and divide by four and this is a rough estimate of the number of tokens.

[00:46:42]So these are the estimates you can do them in this way. This is how it's done in cloud code. You can estimate whole session in this way as well. Then the logic for deciding if this is uh enough length to compact and the logic for splitting the old and the new messages that are not going to be compacted. Also don't split tool result with tool uh from tool use. So if the first broke if the if the first message that you are keeping as is the result you need to also include the previous message tool use. So make sure that they are both included if one is included. So after that in that in this file there is just a bunch of code that you can check yourself. There is some uh you can just copy this and this will be done. So now you understand what's happening. So this is the um checklist and school submission. Now let's connect it all into end to end agent runtime. So right now we have all of the parts which is session, transcript, tools, permissions, files, bash, search tools to do testing, compaction.

[00:47:50]Uh now we need to kind of put it all together. So user sends a prompt. We need to append to a session, ask them, send the model this prompt, append assistant message, execute requested tool, append tool result, repeat until final answer, uh compact if needed. So we're going to create runtime.py PI and we will coordinate four things here session which is the transcript the memory uh the LLM tool exeutor and permissions.

[00:48:18]So runtime should not contain all uh tool code directly. It should call smaller modules you already built. Then we will actually uh make these events and messages and tool calls into actual events. So we may have like um tool use event or message stop or uh text delta. So so assistant event might be one of these three classes.

[00:48:47]So why events? Because text can stream in chunks. Tool input can arrive as structured block.

[00:48:55]Message stop can tell the runtime to that uh assistant message is complete.

[00:48:59]So the model client which is the client that calls the LLM it's going to take the current transcript and return assistant events that we just defined.

[00:49:10]So the runtime is going to send system prompt and session messages all the transcript and model returns assistant uh text tool use events and then we want to build an assistant message from events. So we will import the session block types which is conversation message message ro text block tool use block. Then we want to collect the streamed text. You know how chat GPT is writing text. It's not sending the whole response at once. It's writing the text.

[00:49:39]So we want to collect that until we get a stop sign or until we get the tool use. Then we have a few handlers here that's going to handle the text that tool use.

[00:49:53]uh just convert and compile.

[00:49:56]Then we want to extract pending tool uses.

[00:50:00]So let's say it sends if it sends a stop sign and there is no tool then the turn is done. But if there is some tool use we need to execute uh the tool and send the result to the model again. So we already have the tool execution. So we may import it and then we need to send extract the tool name input and all of this stuff. This will be uh all returned by the model. We got some permission checks.

[00:50:32]You actually may uh have a summary by the model after each turn. So you know when cloud code is going going and at the end it may just summarize everything it did in that turn. So you may actually implement that as well. Now we will just define create the runtime object which is going to be this class and it has session model tools permissions system prop max iterations and this is this is the final thing that combines all of it and this function that starts the turn uh runs the turn.

[00:51:08]The model request should include the user prompt. The session should persist what the user asked.

[00:51:15]So a user turn will begin by user appending or by appending the user message. It's going to be like this.

[00:51:21]Append message runtime session and user text user prompt. And then we make define uh assistant messages. It's an empty list in the beginning and tool results. It's also an empty list in the beginning. And then we will create a loop that's going to keep looping uh calling the model sending new messages, new tool results, etc. and we need to append all of the messages that are going on. So this is where the uh cloud code is doing its uh iterative like solving the task coding etc. So at this point the transcript contains user message and assistant message. And if there were tool calls uh those are contained inside of this assistant message because this is iterating for some number of iterations like maximum eight iterations then it's going to stop or you know codex is doing like something for an hour maybe you want to like uh put like thousand max iterations or something. So after this uh cloud code is done working working working then this message output is appended and at the end you have just user message and assistant message and assistant message contains all of the information contains all of the information that was happening here in these iterations and then if no tools were requested we need to just stop. So uh this is like uh the cloud code did what it's supposed to do and we just need to summarize the turn at the end and uh show it. There is also this step to execute requested tools.

[00:52:56]Uh we might like do some checks and stuff if the tools are allowed and then continue after the tool results. So this is how the loop goes. You also want to have a tool registry because remember right now we are combining everything.

[00:53:13]So we just register all of the tools. A runtime factory uh create one helper that assembles the runtime. So it's just a build runtime function that's going to return uh agent runtime with session model tools etc. And here we have test runtime. So you can copy this code. You can apply this. This is just to test it.

[00:53:35]So this is what we have by now. We have session that remembers the conversation model client that talks to the uh LLM tool exeutor permission policy and agent runtime that coordinates the loop. So this is the pseudo code how it works. So append user message and then loop. So uh wait for the model response append assistant message after it's done. If no tools finish for each tool, check permissions, execute or deny, append to tool results. So we're going to keep looping here.

[00:54:10]Okay, keep looping until this there is no tool calls. So when we do this, we go another loop and then maybe compact.

[00:54:19]And this is the foundation and everything else is improvement. better model provider, better tools, schemas, better UI, better permission prompts, background agents, sub agents, etc. So this is the checklist that you can uh check yourself. You can do this and submit this to school and something from this checklist so we can help you. So let's add a CLI tool that can run our agent and runtime. The CLI tool is going to say which conversation am I continuing? What is the where is the conversation saved and is this session allowed to run this in this workspace CLI tool is going to read command line arguments create or resume a session build the runtime send the prompt print and save updated session so runtime should not parse command line arguments a CLI should not manually run tools and we want to store sessions into a managed folder so let's create session store.py Pi. So we want to store it in a hidden folder like dot claw because this is not the source code. It's just used temporarily or by the user. And we also want to have a workspace fingerprint. So if same if two sessions are called session alpha for example uh this fingerprint will distinguish them. So uh this may be example it may be like some fingerprint and then session ABC.json JSON L and we want to store in JSON L because it it's going to store one JSON object per line. So line one may store session metadata. Line two user message, assistant message, etc. So the CLI can just append without rewriting one giant JSON document.

[00:56:06]Let's define a session handle. So uh CLI does not need to load whole session all of the messages just a single handle that says session ID and path. So this is how it can identify the session and work with the session.

[00:56:21]So ID is the human friendly session name and path.

[00:56:26]For example, session ABC 1 2 3 and this is the whole path. This tiny object is useful because the CLI can pass it around before loading the whole conversation. So we will create the session by generating some uh fingerprint and creating such folder.

[00:56:44]So uh we can just create this folder and create the session without writing any transcript and we also want to bind the session to a workspace.

[00:56:53]Imagine the session is created in work/ payment API but user accidentally runs it when they are seated into mobile app.

[00:57:02]That's why we need to include the workspace root as well into the JSON L and add that as the first line. Then we may add some more commands like uh resume with the exact path or session ID or some alias.

[00:57:22]So this is the code.

[00:57:25]Then you may create cli.py that's going to load or create session.

[00:57:31]So if there is resume then it will resume previous session and if there is no resume tag it will just create a new one.

[00:57:40]You want to only print the final assistant text which is usually the summary. So if assistant is sending messages like I will inspect this file and then tool views etc. Uh you want to just find the final text or summary and print that in the your CLI. Here is some entry point with argument parsing and main.

[00:58:02]So this is the order. Choose workspace uh load create session. Build runtime with that session. Run one term. Save session. Print final answer. Here are the tests. And this is a summary.

[00:58:19]So you can uh post what you built in this lesson in school post as well.

[00:58:24]That's going to be end of week 15. So that concludes this course. There are weeks 16 and 17 as well in the GitHub repository. So you may check them. But I just want you when you are building this, I want you not to get overwhelmed.

[00:58:42]Don't get bogged down into too many details. You can just read these lessons step by step uh start to finish. You just need to understand what's going on.

[00:58:52]And most of you will be building this for portfolio. So if you just build what I explained here and you understand what's happening in there, uh maybe when you are at an interview, they will ask you to explain what you build, how you build it, why you build this, why you make these decisions. So make sure to understand those things. So you may post what you've built on school and we will review, we will help you out, give you more advice and there are more tasks coming on. For example, here the CUDA course, you have some tasks to do if you want to learn build portfolio for uh GPU kernels from scratch.

[00:59:31]So if you go ahead in school and post these, we're going to review and help you out. On my channel, I'm going to keep posting these uh long courses. So last one is the GPU programming full course from scratch. So you may subscribe. I'm going to change to these now. And uh thank you for watching. Check out my YouTube. Check out the school community if you want to become AI researcher, learn GPU programming or uh AI assisted software engineering etc. So see you in the community and see you in the next video. Goodbye.

#ai research #learn ai research #learn ai engineering #pytorch tutorial #learn pytorch

Ähnliche Videos

Agentforce NOW AMA: Build with React and Salesforce Multi-Framework

SalesforceDevs

490 views•2026-05-28

How agent o11y differs from traditional o11y — Phil Hetzel, Braintrust

aiDotEngineer

450 views•2026-05-28

WEB TECHNOLOGIES UNIT-2 | Degree 4th sem BCOM Computers web technologies unit-2 full explanation💯✅

LearnwithSahera

1K views•2026-05-29

More tests are always better? How to use AI to identify tests that bring little value

Alliance4Qualification

335 views•2026-05-29

Search Algorithms Explained in 60 Seconds! 🤖💨

samarthtuliofficial

218 views•2026-06-01

People of Game of Thrones using JavaScript DOM

AltCampus

296 views•2026-05-30

Introduction to Problem Solving Part - 1 | Lecture 1 | Intermediate DSA

ascensionix

107 views•2026-05-29

So What's Odin Lang Even Good For

TechOverTea

131 views•2026-06-01

Trends

The Casino Had Us Guessing All Day

VegasMatt

157K views•2026-06-03

The Dancing Plague...

HoodieGuyStories

1730K views•2026-05-30

The Fastest Way To Board A Plane 😮

zackdfilms

6504K views•2026-05-29

Künstliche Intelligenz

DOOM Runs On Everything...except Neo Geo

ModernVintageGamer

143K views•2026-06-01