Effective AI agent development requires a structured approach with proper context management. Agents should be built against clear specifications, tests, and acceptance criteria rather than relying solely on rapid code generation. The benefits of this approach compound over time, whereas vibe coding feels fast at first but slows down without reflection.
Deep Dive
I documented Nepal's biggest GDG Build with AI Event 2026
Hi, you all. Good morning. It is almost 9:50 a.m. now. I'll get ready to go to the Build with AI Nepal 2026 GDG event, and the venue is in. It's been a long while since I've been to any tech events.
I'm so excited. Let's get ready. I'm planning to wear this top and these jeans. I'll be right back. So this is how it looks, and it fits me perfectly. I've not done any skincare except moisturizer, so I'm quickly applying sunscreen. I don't really do makeup as such, so I'm just applying sunscreen and lotion on my body, and I'll just apply lip gloss. Almost 40 to 45 minutes.
So I got that quickly and from here the event content actually starts.
Please share this video, and comment down in the comment section if you want to see it. Then I'll share it in the next video if you guys are interested.
Moving on to this. Let's do an emotional check.
As you can see, there are 10 emotions and it's all about cats.
What would you say your emotion is?
>> Excited.
>> Excited.
>> Most our tech lead and give platforms for students, professionals and anyone who is looking into tech. So Android are extended.
>> You know, I thought, why not, I could do both of these things. So allow me to introduce myself. I am Omar Mazir and I am the community manager for Google, looking after Pakistan and the South Asia frontier region, which includes Bangladesh, Nepal, and Sri Lanka. When I first got the message, I kept on thinking about what to talk about, but just know this: with Google I/O and all the exciting gifts, like the updates that you guys have seen, we are currently living in a wonderful time. This is a great time for you guys. No matter who you are, whether you're a developer who already thinks you're at a pro level with AI or somebody who's just starting out, all of the tools that you have at your disposal can take you literally anywhere. Right now, the only limitation you have is your own imagination. So in today's Build with AI, I would request all of you to keep an open mind, follow through, and come with the mindset to grow, because when you build more, that is when you know what is actually possible and what's actually out there.
So my advice to all the people who are just starting out their journey with AI: if you take my example, I am actually a non-technical person who experimented a lot with AI and basically ended up finding a way to get a job at Google. Right? So these are some of the possibilities that you have available to you. Feel free to explore, feel free to reach out to me on LinkedIn, and have a great workshop today. Thank you so much.
>> Anise mana, he is the CTO of... working with AI agents, and I imagine that almost all of us use... AI user authentication system. Basically, it is six words.
You know, the thing about vibe coding is that when you're starting, you're moving at a very fast speed, because every iteration you're adding huge chunks of code, and you get this feeling that you're progressing at a very fast rate. But if you take a step back and do some reflection, you're not learning that much. You're just generating code based on what you feel, and based on prompts whose effect you really don't have a deeper idea about... like 100,000 decisions. So if you give two or three or four decisions or inputs at every step, then obviously the number of iterations will keep on growing. So, enter agentic coding.
So basically here you'll be using AI as a very smart colleague... no proper testing or acceptance criteria. And in the last two lines, as I was saying in the beginning, vibe coding feels very fast at the start, but as time goes on it slows down very badly. It's just the opposite in the case of agentic coding, where the start might be slow because you have to decide a lot of things, but the base compounds as time goes on.
Okay. Moving on, the next section is about context. On the left we have the bad approach and on the right we have the good approach. In the bad approach, one session is used for everything. For example, if you look there, everything is inside one session, but it is doing authentication, billing, deployment, and reporting. So what happens because of that? The context ends up bloating and the answers start to drift, which is basically the model starting to hallucinate.
How much information can you pack into one context? When the AI model providers quote a number of tokens, not all of them are available for this, because, as you can see on the slide as well, some part of the window is taken by default. Long sessions don't just cost more, they get worse and worse. Important information might be lost as well.
The models are completely stateless. Let's say I am having a conversation with a Google chat, or Claude, or OpenAI. Every time I send a prompt, that prompt copies all the prior information in the chat and sends it along. So let's say I send a query of length one the first time and it sends back the reply, and then I send another query of length one. That length one is added to the previous query as well, so the total becomes 1 + 2. That's the reason why the formula is n(n + 1)/2. Basically it is an arithmetic series, 1 + 2 + 3 and so on, which sums to n(n + 1)/2.
So these are the numbers of what the tokens would be if it were linear, but most of the time that is not the case. What actually happens is that the tokens balloon, and they end up going from, let's say, 2k to 11k.
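The arithmetic the speaker describes can be checked with a short sketch. It assumes, as in the talk's simplified example, that every query has length one and that replies are ignored; the function names are made up for illustration.

```python
def tokens_sent_per_turn(query_lengths):
    """Tokens sent on each turn when the whole prior history is resent."""
    sent = []
    history = 0
    for length in query_lengths:
        history += length      # the new query joins everything sent before
        sent.append(history)   # this turn pays for the full history
    return sent


def total_tokens_sent(query_lengths):
    """Total tokens paid for across the whole session."""
    return sum(tokens_sent_per_turn(query_lengths))


n = 10
queries = [1] * n                      # n queries, each of length one
print(tokens_sent_per_turn(queries))   # 1, 2, 3, ... up to n
print(total_tokens_sent(queries))      # matches n * (n + 1) // 2
print(n * (n + 1) // 2)
```

Once replies are counted as part of the history, the growth is faster still, which is consistent with the ballooning the speaker mentions.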
And then this slide is about the habits that you can have in order to keep your context clean and cheap. The first one is one feature per session, so every time that you have to...
We're talking about the different project instruction files when working in agentic development. The first one is AGENTS.md: agents are stateless but the project is not, so this file is mostly about the architecture, conventions, etc. In RULES.md you'll mostly be talking about the restrictions due to governance. And SKILL.md is the part that is loaded just in time. Generally, one skill is one job, so basically it answers the question of what, and agents answer the question of who.
Skills are loaded in levels. All skills have a name and description that is always loaded, and that costs a very small number of tokens. Below that is the body of the skill, where the workflow and pitfalls are defined. And level three is the scripts and assets, which are loaded only when that particular skill is referenced.
Okay. So this is what is normally inside a skills folder. As I was saying, SKILL.md has its name and description. And this is a typical layout of an AI agent project. Right below the project root there will be AGENTS.md, in which you'll have the general information, the project architecture, the coding conventions, etc. Then you'll have RULES.md, in which you tell the agent what not to do. Then, if you see, inside .agents there are the different agents that you'll be using, and in the same folder, parallel to them, you'll have the skills folder, in which you list the skills that you're using in the project.
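The level-by-level loading described above could look roughly like the following sketch. Everything in it is an assumption made for illustration: the file name `SKILL.md`, the `.agents/skills/<skill>/` location, the `---`-delimited `key: value` header, and the helper names are not taken from the speaker's project.

```python
from dataclasses import dataclass
from pathlib import Path
import tempfile


@dataclass
class SkillSummary:
    name: str
    description: str
    path: Path


def read_summary(skill_file: Path) -> SkillSummary:
    """Level 1: read only the name/description header (always loaded, cheap)."""
    fields = {}
    lines = skill_file.read_text(encoding="utf-8").splitlines()
    # Assumed format: "key: value" lines between two '---' marker lines.
    if lines and lines[0].strip() == "---":
        for line in lines[1:]:
            if line.strip() == "---":
                break
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return SkillSummary(fields.get("name", ""), fields.get("description", ""), skill_file)


def read_body(skill_file: Path) -> str:
    """Level 2: load the workflow text only when the skill is referenced."""
    text = skill_file.read_text(encoding="utf-8")
    parts = text.split("---", 2)
    return parts[2].strip() if len(parts) == 3 else text.strip()


def demo():
    """Build a throwaway project and load one hypothetical skill in two steps."""
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp) / ".agents" / "skills" / "resume-builder"
        skill_dir.mkdir(parents=True)
        (skill_dir / "SKILL.md").write_text(
            "---\n"
            "name: resume-builder\n"
            "description: Build resume.html and resume.txt from a Markdown file\n"
            "---\n"
            "Workflow: parse the input, then render both outputs.\n",
            encoding="utf-8",
        )
        skills_root = Path(tmp) / ".agents" / "skills"
        summaries = [read_summary(p) for p in sorted(skills_root.glob("*/SKILL.md"))]
        body = read_body(summaries[0].path)  # only now is the body read
    return summaries, body


summaries, body = demo()
print(summaries[0].name, "|", summaries[0].description)
print(body)
```

The design point is that only the short summaries need to sit in the context for every request, while the body (and, by extension, scripts and assets) is read on demand.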
So here are the eight tools that will help when you're writing instructions. I'll just go through them. The first one is about keeping it concise, in which generally it's considered a good habit... conventions, and then you'll also be writing...
Okay, so this is the comparison table of vibe coding and agentic coding. If you look at the process, vibe coding is just: do a prompt, and then either accept it or tweak it. Agentic coding is more of a structured process: you first define the specifications, then you design, then you actually do the task, build, and validate.
For the model itself, I've used Gemini 3.5 Flash, which is provided by OpenRouter. So let's see. I'll delete this... create the Python file that I just deleted, and then, using that Python file, it should create the HTML and .txt files, which are the final outputs. See, here it is already talking. Okay, so it is writing and building the Python file. If you look here, this is the Python file that was built just now. It says that it has successfully generated the Python file, and it has given a summary of what has been done: it has written this Python file, then it has parsed the input Markdown file into different nodes, and after that it has generated the resume.html file and also the resume.txt file.
So here, these are the resume.html and resume.txt files that I deleted a while ago. And for the AI tool, I'm using u code. This is something that you can choose depending on what your necessity is or what your interest is. There are people who use OpenCode, who use Antigravity, who use code, who use Claude. So I think that is totally up to you.
One other piece of information: here you can also see how many tokens have been used and what the total is in the context window. For example, in this case, 1 million is the total context window and currently only 43.2k have been used. So this is also a very good way to see how efficiently you're using your prompts.
If you look at resume.html, this is the HTML that was created by the build.py file, which was itself created based on what was mentioned in the agents and in the skills. Apart from that, this is the other file, the .txt file, that was also one of the outputs. So yes, this is a very simple example, but I wanted to make it as simple as possible so that you could focus only on what's needed and not get diverted by other code. Lastly, I just wanted to mention the things to remember.
And this Gemini Enterprise platform is one of the products they launched just last month.
Can anyone answer: what is AI in the normal case, and what is an agent?
Suppose I'm here in Kathmandu. I don't know anything about Kathmandu and I want to visit the various famous places in Kathmandu. Recommendation is one thing. An agent is a different thing. With recommendation, we can get the sort of information we are looking for. But what is a normal agent on a day-to-day basis? Suppose you wanted to go traveling and do travel planning, or you wanted to buy a property. What does the agent basically do for you?
>> Simulates the >> share market everything is working on the basis of the historical set of data.
So predictive AI uses the various machine learning algorithms, meaning classification, clustering, indexing. These features are embedded with machine learning, and we are creating some sort of AI solution there. Now, generative AI has evolved a lot since 2017, like ChatGPT. So what's the major differentiation of the generative way compared to the predictive way?
>> Very nice audience. So
>> what are the differences? Have you observed, have you thought about it: each and every time you ask the same question, you get a different response. Have you seen that?
>> So that is the beauty of generative AI. When generative AI launched, it had one dimension: we wanted to convert text to speech, speech to text, text to image, image to video. But technologies move on and evolve, and then the technology came with multimodality. It means you are mixing different types of objects: text with image, speech with video. You can mix different things there and get the output. Whatever prompt we put in, asking a question and getting an answer, that process is inference. Do you know the meaning of inference? And for whatever prompt you put in, the cost is calculated on the basis of tokens. In the LLM world (LLM is large language model, like ChatGPT or Gemini) every cost is calculated on the basis of tokens, and token values can differ per provider. Claude has a different way of calculating tokens, OpenAI has a different one, and Gemini has a different one.
So this talk is more focused on the Gemini product. Of course, Gemini is on the Google platform, and from time to time I'm also going to bring in the new things that have come up in the last month, so we can build the end-to-end multi-agentic system for you guys.
How powerful is an API? An API is how different software components communicate with each other or expose themselves to each other. AI is one of the ways through which we can orchestrate and perform the task on our behalf. The task was supposed to be performed by ourselves, but it is going to be performed by an agent. Has anyone locally tested or built a local agent for your own convenience, for your own use cases?
You say which one?
No one has built any agent for your own use cases, like a fitness tracker or an assignment tracker? There are so many problem statements in this world.
No one has built any one?
>> Okay. So try to build one. I'm not going to ask you to build it around Gemini only. Use whatever models you are most comfortable with. Whatever profile you are working in nowadays, you have to learn AI by hook or by crook, and you have to implement AI technology within your profile, whether you are a developer, in DevOps, or from any other background. Even managers, like CTOs and CEOs, are going to need to understand the different use cases related to agentic AI. Nowadays, a Kubernetes cluster: many companies, whatever AI platform they are going to build, are running it on the Kubernetes platform.
Okay. So this is the foundations layer, the normal use cases. We are not going into the detail of the foundation, which is more about building models. The earlier slides were more focused on the foundation and normal use cases. But if you're talking about multi-agentic collaboration, these are a few examples. If you wanted to build an agent for a medical requirement, traditional RAG is more proficient for handling those use cases.
So how can we use the Gemini Enterprise platform to directly launch the product? Suppose a user is using your agent platform: how can we authenticate, verify, and authorize that particular user? These are a few of the parameters you have to think about.
Okay, so this is very important as well. Suppose you build a model, you build the agent, but how can you calculate the cost? Every agent carries a very high cost in the form of tokens. If you want to analyze the tokens and understand the latency, observability is the platform through which we can understand how the token flow is happening and where the latency is coming from.
Suppose you are sending one request to a site like Amazon.com and the site is not working, or maybe it is giving the response after 2 minutes. In the background, that request is not hitting only one server. It might be hitting multiple hops of servers. So where is the latency happening? That issue can easily be figured out through the observability platform. Take the real example of a patient lying on a bed. On the monitor you can see that the blood pressure is high, but we don't know the root cause of why it is happening.
So the doctor performs various lab tests and finally finds the root cause. In a similar way, if anything is happening in your system, how can we find the root cause? That can easily be figured out through what is known as observability. It is a superset of monitoring.
This slide is more focused on latency, token utilization, and agent accuracy: for whatever agent you have built, are you actually getting the response you are expecting? That is known as accuracy. Then finally the human override frequency. Usually, when testing or training models, end users try various use cases. They put in one question, but then they want to put another question on top of it to understand how the agent is actually working. Then finally cost per execution. That is very important nowadays. Even if you are a pro-version user of an LLM like Gemini, they have put restrictions there. Same thing with Claude: they put restrictions there. So whatever prompt we put to any model, it should be very smart.
Smart means you should be very smart about what you are going to ask. Then we get the answer with a normal number of tokens.
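The metrics the speaker lists (latency, token utilization, accuracy, human override frequency, cost per execution) could be rolled up per agent along these lines. This is a minimal sketch: the `AgentRun` record, its field names, and the sample values are all hypothetical, and a real observability platform would collect them from traces rather than from a hand-built list.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class AgentRun:
    latency_s: float       # wall-clock time for the request
    tokens_used: int       # prompt plus response tokens
    correct: bool          # did the response match what was expected?
    human_override: bool   # did a person have to step in?
    cost_usd: float        # cost of this single execution


def summarize(runs):
    """Aggregate per-run records into the headline observability numbers."""
    n = len(runs)
    return {
        "avg_latency_s": mean(r.latency_s for r in runs),
        "total_tokens": sum(r.tokens_used for r in runs),
        "accuracy": sum(r.correct for r in runs) / n,
        "human_override_rate": sum(r.human_override for r in runs) / n,
        "avg_cost_usd": mean(r.cost_usd for r in runs),
    }


# Made-up sample data, for illustration only.
runs = [
    AgentRun(latency_s=1.0, tokens_used=1200, correct=True, human_override=False, cost_usd=0.25),
    AgentRun(latency_s=3.0, tokens_used=4800, correct=False, human_override=True, cost_usd=0.75),
]
for metric, value in summarize(runs).items():
    print(f"{metric}: {value}")
```

Tracking the override rate next to accuracy helps separate agents that are wrong from agents that are merely not trusted yet.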
So this is already covered. This is just a flow: the high-level design that any enterprise has to adopt to build an agent for fraud detection.
One of the services in Google cloud available for mobile coding where you can put your questions.
whenever I got opportunity and time.
Okay.
30 minutes and Two season is done.
easy to follow.
So first part perception, second part, third, fourth principle.
Sorry, I maybe pre Input email price 25% cheaper than 3.1 which is nice model but six times more expensive than 3.1 okay fine model operating operating system entirely 93.6 6 billion less than $100.
So fast for the fastest seems like aspect.
completely.
Okay. Half half off half. Sorryenity last worldwide.
Basically, basically pre-training basically next to simple example.
Syntax pattern style realificences accuracy.
Happy birthday.
Happy birthday.
Best model. Next structure training. Okay.
Learned instruction following, miscalibration, overconfident model, hallucination.
I don't know.
I don't know.
Model optimized for confidence, not calibrated for accuracy. We build the start by design.
Third stage transformer.
That's a lot of mechanisms. Self attention.
refresh again cause and effect.
So basically web text, code, documents, common tasks, information, overconfident.
So pattern matching, not grounded understanding.
So basically so basically thinking fourth stage Obviously thinking let's say fivecess reasonimize Okay.
Interesting studies.
July 202.
So this personity 19% 19% got productivity July 202% increase measurement still unreliable.
So task completion rate 25% 98% more we are not bad at using it, right? We are bad at asking skeptical of current outcome.
So 2018.
Okay. So influence Pizza customer service.
So his prediction was that creative and compassionate work would be the last. Creative, compassionate: half right and surprisingly wrong.
He was right about compassion.
statement designstage.
deliver basically orchestration nonreable situation relation So as a human data entry, summary, first world Okay.
Okay. You can click videos.
Let's give him a round of applause.
As it works, no one calls it AI anymore.
AI development... I got extremely tired, and I found out certain things which I didn't... build system.
I was so excited. The first one was to connect humans with AI and Gemini 3.5. To some extent I was a bit shy, and it was extremely engaging. Now I'll just change my bed and do some remaining work, and I'll sleep.