AI agents differ from chatbots by combining an LLM with tools, memory, and goals, operating through an observe-think-act loop; effective agent prompts require a structured 'prompt contract' with four sections (goal, constraints, format, failure) to ensure reliable task completion, while memory files enable agents to learn from past mistakes and improve across sessions.
Deep Dive
AI Agents Explained: How to Create and Use AI Agents in 2026
Same Claude, same 20 bucks a month. Two completely different tools, a chatbot and an AI agent. One answers questions inside a chat window, the other opens your browser, runs 10 tasks at once, and finishes a week of work before lunch.
Most people are paying for the second one and using it like the first. And that's the most expensive mistake in AI right now. Let me show you exactly how to flip that switch.
Here's a folder on my desktop: 80-something files, PDFs, screenshots, receipts, invoices, contracts, random downloads from the last 3 months. Total mess. Looks like everybody's downloads folder. One command, plain English. I ask the agent to read every file, sort everything into categorized folders, rename it by date and vendor, and build me one spreadsheet with every expense for tax season. Hit enter. Now, watch. A few minutes later: eight clean folders, every file renamed to the same format, and one spreadsheet with every expense from the last 3 months (date, vendor, amount, category), ready to hand to my accountant. A chatbot can tell you how to do this. An agent just does it.
Now, the prompt I typed wasn't lucky. There's a specific structure that makes agents actually follow through instead of going off the rails. We'll get to it later in the video, and once you see it, you'll never write a prompt the old way again.
But first, how this actually works. Quick foundation, because if you skip this, nothing else in the video makes sense. A chatbot and an agent can run on the exact same model. The brain is identical. The difference is what's wired around the brain.
Picture a world-class chef sitting in an empty room: same training, same palate, same instincts as the chef running a Michelin kitchen across town. But with no stove, no ingredients, and no order ticket telling them what to make, they can't actually cook a thing. The skill didn't change. Everything around the skill did.
That's exactly the difference between a chatbot and an agent. A chatbot is the brain alone. You type, it answers. Conversation ends. It can't open a file, run a command, browse a website, or remember what you taught it last week. It lives inside the chat window. Useful, but boxed in.
Like a brilliant mind with no hands and no notebook.
An agent is the same brain with four things bolted on. The LLM: the reasoning engine. Tools: terminal, browser, file system APIs. The hands. Memory: files the agent reads at the start of every session, so it doesn't show up empty-handed each time. The notebook. And goals: not vague wishes, but specific outcomes with a clear definition of done. The destination. Tie those four together with a loop and you have an agent. The loop has three steps.
Observe. The agent looks at the current state of the world. What files exist, what the page shows, what the last command returned. Think. It decides what to do next based on what's actually true right now, not what you said 5 minutes ago. Act. It uses one of its tools to change something. Then it repeats.
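To make the loop concrete, here is a minimal sketch in Python. It is a toy, not how any of the platforms in this video are implemented: the `Agent` class, the `think` policy (a stand-in for the LLM), and the `sort_one` tool are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

World = Dict[str, int]  # toy "state of the world": counts of files


@dataclass
class Agent:
    goal: str                               # the destination
    tools: Dict[str, Callable[[World], str]]  # the hands
    think: Callable[[World, str], str]      # stand-in for the LLM: picks a tool name
    is_done: Callable[[World], bool]        # the definition of done
    memory: List[str] = field(default_factory=list)  # the notebook

    def run(self, world: World, max_steps: int = 20) -> List[str]:
        for _ in range(max_steps):
            observation = dict(world)                    # observe: snapshot current state
            if self.is_done(observation):                # check against the goal
                break
            choice = self.think(observation, self.goal)  # think: decide the next step
            result = self.tools[choice](world)           # act: a tool changes the world
            self.memory.append(result)
        return self.memory


def sort_one(world: World) -> str:
    """Hypothetical tool: move one file from 'unsorted' to 'sorted'."""
    world["unsorted"] -= 1
    world["sorted"] += 1
    return f"sorted one file, {world['unsorted']} left"


world = {"unsorted": 3, "sorted": 0}
agent = Agent(
    goal="no unsorted files remain",
    tools={"sort_one": sort_one},
    think=lambda observation, goal: "sort_one",
    is_done=lambda observation: observation["unsorted"] == 0,
)
log = agent.run(world)
print(log)
```

The `max_steps` cap is an added safety assumption so the toy loop cannot run forever if the goal is never reached.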
Observe, think, act, checking against the goal until done. That's the entire engine. Every agent on every platform runs this same loop. So that's the whole picture. LLM, tools, memory, goals, loop. There are dozens of agent platforms out there. Most are noise.
After a year of testing, four earned a permanent spot in my workflow. And they each win at something different. Quick tour of all four. Exactly how to install each one, and what the screen actually looks like the first time you open it.
Claude Code is Anthropic's official agent. It runs as a desktop app on your machine: Mac, Windows, or even Windows ARM64. The part most people get wrong is pricing. Free Claude.ai does not include Claude Code. You need a paid Anthropic plan: the Pro plan, $17 a month on annual, or $20 if you pay monthly. The same subscription covers the chat at Claude.ai and Claude Code.
To install, Google "Claude Code desktop download". You'll land on a page with downloads for macOS, Windows, and Windows ARM64. Click the one for your machine, drag Claude Code into your Applications folder, open it, sign in with Google or email. Done.
First run, you'll see a chat page. Click the Code button, choose a folder to work in, and you're ready. Type your task in plain English: "Spin up a simple to-do list app. Add, check off, delete, nothing fancy." One setting to know about: bypass permissions. It sounds scary, but it just lets the agent act independently without asking before each step. From there, the interface is what you'd expect: a message box at the bottom, a thinking panel that shows you what the model is doing in real time, and tool calls as it edits files or runs commands. You can also queue a follow-up message, like "open it", and it'll pick that up the moment the current task finishes.
And one thing worth saying out loud before we move on: Claude Code is not just a coding tool. The name throws people off.
Remember that folder cleanup demo at the start of the video? That was Claude Code. Same agent handles video editing tasks, batch renaming, parsing PDFs, pulling data out of screenshots, anything you can describe on your computer in plain English. The ceiling is your imagination, not the tool. Treat Claude Code as a general-purpose agent that happens to be great at code, and you'll start handing it work you'd never think to give a chatbot. Where Claude Code wins, interpretable reasoning. You can literally watch the model think step-by-step and steer it mid-flight.
Pause it, redirect it, or hand it new context. That makes it the strongest of the four when you're orchestrating something complex or chaining agents together. Anything where you want a thinking partner, not a missile: Claude Code.
Quick break before the next platform. Claude Code builds the thing. Brevo is the part most builders skip: keeping the list you've already got warm. Brevo is an all-in-one email, SMS, automation, and CRM platform. I use Aura AI, their built-in assistant, daily.
Watch. 3 seconds, and here's what comes back: subject line, preview, body in my tone, CTA button. The segment picked itself: 347 people who went quiet last month. Tuesday, 10:00 a.m., when this list actually opens email. One edit, hit send. 21% open rate, 6.2% click-through on a re-engagement list. Openers auto-flow into a follow-up I built once. Free plan, 300 sends a day. Pricing scales by sends, not contact count. If you need enterprise-scale lifecycle marketing, this isn't HubSpot, and it isn't priced like it either. Use this code for 50% off Starter and Standard plans for 3 months, new paying customers only. Link in the description.
All right, back to the agents. Next up is Codex, OpenAI's official agent, and the easiest way in for anyone who already lives inside ChatGPT. First, if you don't have an OpenAI account, head to openai.com, click "Try ChatGPT" in the top right corner, and continue with Google, phone, or email. Quick onboarding, you're in.
But ChatGPT alone is just a chatbot. To get the agent, you need Codex specifically. To install, Google "OpenAI Codex". You'll land on a page with a download button. The site auto-detects your OS. On Mac, it offers a macOS build. On Windows, you get the Windows installer, available since early 2024, and there's a Linux version, too. Click download, drag Codex into your Applications folder on Mac, same as any other app, and open it. It's bundled with the ChatGPT Plus, Pro, Business, and Enterprise plans. So, if you already pay $20 a month for ChatGPT Plus, Codex is already covered. Same login, no extra subscription.
First run, the app opens to a clean workspace. In the middle, you create a new folder for your project. Call it whatever you want. Open inside it. Now you can type your task: "Spin up a simple to-do list app. Add, check off, delete, nothing fancy." Codex starts thinking, looks through your workspace, drafts files, and writes them out. You can queue a follow-up message like "Open it", and either send it immediately with Shift or wait. The agent picks it up as soon as the current step finishes. Visually, you see thinking, tool calls, and file writes as they happen. Same general feel as Claude Code, slightly different flavor.
Codex also has one trick the others don't: a cloud version at chat.openai.com. Hand off a long-running task to a sandbox in the cloud, walk away, come back to a finished branch you can review. Useful when you don't want to keep your laptop running.
Where Codex wins: friction. If you already pay for ChatGPT, Codex is already half installed. Same account, no new subscription, no new tools to learn.
The IDE extension also drops it straight into VS Code or Cursor as a sidebar with a built-in diff view, so you can see every change the agent proposes, file by file, before it lands. For someone who doesn't want to think about pricing, plans, or auth, this is the lowest-friction way to get an agent running today.
OpenClaw is the wild card, and honestly the most fun one to play with.
It's open source, you self-host it on your own machine, and it was built by Peter Steinberger, the founder of PSPDFKit, as the personal AI agent he wanted for himself. It blew up fast: over 100,000 stars on GitHub in its first week, one of the fastest-growing AI repos of the year. The killer feature is where it lives: not in a browser, not in a terminal, but inside your messengers. Telegram, WhatsApp, iMessage, Discord, Signal, Slack, Microsoft Teams; over 20 messaging apps work with it. Text the bot from a coffee shop, the agent does the work on your computer at home, and texts you back when it's done.
Installation is easier than people think. One line of code. Step one, go to claw.bot, scroll down to Quick Start, copy the install command. Step two, open a terminal. On Mac, hit Command-Space, type "terminal", press Enter. On Windows, search for Command Prompt or PowerShell. On Linux, open the terminal. Step three, paste the command, hit Enter. Done. That single line installs OpenClaw and kicks off an onboarding wizard that walks through the rest. The first screen says, "I understand this is very powerful and very risky." Confirm, and the wizard takes over.
The wizard walks through a few quick choices. Onboarding mode: pick Quick Start. AI provider: Anthropic for Claude, OpenAI for GPT, MiniMax for budget. Not permanent, you can switch later. API key: paste in your key from the provider. If you don't have one, pause, go to platform.openai.com or the Anthropic console, generate a key, paste it back. Default model: newer models cost more per call, but give better answers. Skills: pick three to five core integrations from the marketplace. Apple Notes, Notion, Things 3, PowerPoint, Google Docs, whatever you actually use. Don't go crazy, you can add more later. Messenger: Telegram is the popular pick because the app is clean and you can dedicate it just to your bot. Paste your Telegram bot token, connect, you're done.
Five to 10 minutes start to finish. The interface, that's the whole trick. There isn't a clunky agent dashboard. There's a Telegram bot if you want it, but the way I actually use OpenClaw is the web dashboard. Clean chat interface, full thread in front of me. I can see what the agent is doing in real time. Open it, type your task: "Summarize my unread emails from today." "Build me a Kanban board for my Q2 launch." "Respond to this email asking to push the meeting to Thursday." Hit send. The agent picks up the task on your computer, does the work, and the reply lands back in the same thread a few seconds later.
Where OpenClaw wins: life automation. Not code, real-life stuff. Email triage, reminders, document creation, project management, knowledge organization. Text it a video idea, it files it. Text it a tweet draft, it stores it. Text it a research note, it categorizes it.
Shopping and daily morning briefs. With full computer control and skill integrations, basically anything a human can do on a computer, OpenClaw can do.
This is the agent that finally made AI feel like the thing we were promised: a real assistant living inside the apps you already use, not another tab to manage.
Antigravity is Google's agent platform built on Gemini, and the part most people miss is that it's not a website, it's a desktop app, specifically a heavily modified fork of VS Code, so it feels like a real IDE that happens to have an agent living inside it. Mac, Windows, Linux, all supported. It's currently in public preview, free for individuals with generous rate limits on Gemini 3 Pro, no card required.
Installation. Google "Google Antigravity download". You'll land on the official page. On Mac, check whether you're on Apple Silicon or Intel before you pick a build. Type "About This Mac" in Spotlight, look at the chip line.
M something means Apple Silicon, anything that says Intel means Intel.
Same idea on Windows. Pick the right architecture. Download, drag Antigravity into Applications on Mac, open it. If you're already signed in to Google in your browser, the app picks up auth automatically. If not, sign in once with your Google account. First run, the layout is VS Code with a twist. Code editor in the middle, file tree on the left, and on the right side a dedicated agent panel where you talk to Gemini.
Type your task in the agent panel: "Spin up a simple to-do list app. Add, check off, delete, nothing fancy." And Gemini gets to work. You'll see a generating tab at the bottom, a model selector for fast versus Gemini 3 Pro, a thinking tab that tells you how long the agent has been reasoning, and any web searches it runs appear inline. Same general feel as Claude Code and Codex, slightly different flavor, but you'll pick up the UX in 5 minutes. You can also queue follow-up messages, "open it", for example, and the agent fires them as soon as it finishes the current step.
Where Antigravity wins: anything visual. Hands down the best agent for front-end work, UI mock-ups, design iteration, landing pages, and anything that involves images or video. Gemini's multimodal stack is just ahead of the others when you need the agent to actually see what it's doing: read a screenshot, check a layout, compare two design variations, generate a hero image. If your work is design, marketing, or anything visual, this is your agent.
Quick rule of thumb to wrap this section. Words and code, use Claude Code. Lowest friction, if you're already in ChatGPT, use Codex. Life automation from a chat window, use OpenClaw. Visual or front-end work, use Antigravity.
Quick honest moment. Claude Code, Codex, OpenClaw, Antigravity: these are world-class general-purpose agents.
Brilliant at thinking, code automation, browser work. But the moment you try to run an actual content operation with them, a YouTube channel, an Instagram, a TikTok, you hit the wall. They don't know your audience. They don't hold a channel strategy across sessions. They don't talk to each other. You end up being the human glue between five chats, and the whole point of agents quietly disappears.
That's exactly the gap we built AI Master for. Same idea as the four platforms in this video, agents with a loop, tools, memory, goals, but specialized for content production, and wired together as a team. Three agents talking to each other. The producer agent runs the show: holds your channel strategy, generates ideas, picks angles, decides what gets made and when. The script writer agent takes those briefs and writes the scripts in your voice, your length, your format. The designer agent gets the brief like any designer would and ships the visuals: thumbnails, covers, on-screen graphics. They hand work to each other automatically. You sit at the top and approve. It's not just YouTube. The same three agents run any content operation: Instagram, TikTok, LinkedIn, a newsletter, a podcast. You describe your project once at onboarding and the team adapts to that platform.
If you don't want to figure all this out yourself, we'll do it for you. Picking up the tools is one thing. Building an actual content engine on top of them is a different level of work. And that's the part we've already done. Strategy, scripts, generation, production, publishing: it's the same system running on our channels and on our clients' channels right now. Working like a pipeline, not a pile of chats. You don't get a stack of tools, you get a finished process that produces content and a team that runs it end to end. The link is in the description below.
Now that you've seen the four platforms, let's fix the single thing that breaks the most agent runs across all of them: the prompt itself.
Because here's what nobody tells you when you switch from chatbots to agents: the way you write a prompt has to change completely. A chatbot prompt is a description of what you want. An agent prompt is a contract, a brief the agent has to deliver against.
Description versus contract. Two different sentences, two completely different outcomes. A chatbot prompt sounds like, "Build me a landing page for my new product." Short, vague, the model fills in the blanks however it wants. Hand that exact same line to an agent and you've just lit money on fire.
The agent has tools, a loop, real autonomy. It'll spin up, start scaffolding files, install whatever framework it feels like, push a dark mode hero section you didn't ask for, and 10 minutes later hand you something you didn't want, but it cost real tokens to produce. The fix isn't a longer prompt, it's a structured one. A real agent prompt, what I call a prompt contract, has four sections: goal, constraints, format, failure. Memorize those four words. Every prompt you give an agent for the rest of your life should answer all four. Let me break them down. Goal, the outcome not the action. The goal section answers one question, what does finished actually look like, not the action, the outcome.
"Build a landing page" is an action. "Build a single-page landing site for my new product launch that pushes visitors toward one email sign-up above the fold, ready for me to review locally before deploying" is an outcome. The first one is a wish; the second one tells the agent how to know when it's done. If your goal sentence doesn't include a clear finish line, the agent will invent one, and you won't like it.
Constraints, the guardrails. Constraints are everything the agent is not allowed to do. This is the section that prevents disasters. Don't install new dependencies without asking. Don't touch any file outside the landing folder.
No external CDN scripts. Don't deploy anywhere, local files only. Don't pull copy from competitor sites. Constraints exist because an agent will happily do something that's technically inside the goal, but completely against the spirit of it. Every time an agent does something stupid you didn't anticipate, that becomes a permanent constraint in your next prompt. Your constraint list is your scar tissue, it only grows.
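A constraint like "don't touch any file outside the landing folder" is written for the agent, but the same rule can also be checked mechanically. Here is a rough sketch of such a check in Python. The function name `is_allowed` and the idea of running it before every file write are assumptions for illustration, not a feature of any platform mentioned here.

```python
from pathlib import Path


def is_allowed(target: str, allowed_root: str) -> bool:
    """Return True only if target resolves to a location inside allowed_root."""
    root = Path(allowed_root).resolve()
    # resolve() collapses ".." segments, so "landing/../secrets.txt" is caught;
    # an absolute target replaces root entirely and is judged on its own.
    candidate = (root / target).resolve()
    return candidate == root or root in candidate.parents


print(is_allowed("index.html", "landing"))       # inside the folder
print(is_allowed("assets/logo.png", "landing"))  # nested, still inside
print(is_allowed("../secrets.txt", "landing"))   # escapes the folder
```

The point of the sketch is the shape of the rule: a constraint is most useful when it is specific enough that a yes-or-no check like this could be written for it.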
Format, the exact shape of the output.
Format is where most prompts fall apart.
The agent might do the work perfectly and then dump it into a structure you can't actually use. Tell it the exact shape you want. "Output a single index.html file with inline CSS, no JavaScript frameworks, mobile responsive, plus a brief.md alongside it that lists every section in order with the headline copy you used." Or, "Output a landing folder containing index.html, styles.css, and assets. Nothing else."
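A format this specific can be verified after the run. The sketch below, in Python, compares a folder's contents against the second example (index.html, styles.css, assets, nothing else). The helper name `format_problems` and the choice to report problems as strings are assumptions made for illustration.

```python
from pathlib import Path
from typing import List, Set

# The deliverable shape from the example: exactly these three entries.
EXPECTED = {"index.html", "styles.css", "assets"}


def format_problems(folder: str, expected: Set[str] = EXPECTED) -> List[str]:
    """List what is missing from, or unexpectedly present in, the output folder."""
    present = {entry.name for entry in Path(folder).iterdir()}
    problems = [f"missing: {name}" for name in sorted(expected - present)]
    problems += [f"unexpected: {name}" for name in sorted(present - expected)]
    return problems
```

An empty list means the deliverable has the shape you asked for. Anything else is a concrete note you can hand back to the agent.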
Be that specific. The format section is where you reach into the agent's head and decide the shape of the deliverable before it starts. If you can't describe the format in one sentence, you don't actually know what you want yet, and you should not have hit enter. Failure. What to do when stuck. This is the section almost nobody writes, and it's the one that saves the most tokens. What should the agent do when it gets stuck? Should it ask you a clarifying question? Should it stop and report? Should it make a best guess assumption and flag it?
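To picture the "stop and report" option, here is a small Python sketch of a wrapper that gives up after a fixed number of consecutive tool failures and reports what it tried, instead of retrying forever. The names `StopAndReport` and `call_with_stop_rule`, and the default limit of two, are assumptions chosen for illustration.

```python
from typing import Callable, List


class StopAndReport(Exception):
    """Raised when the agent should halt and hand control back to the human."""


def call_with_stop_rule(tool: Callable[[], str], max_consecutive_failures: int = 2) -> str:
    """Call a tool; after too many failures in a row, stop and report the attempts."""
    attempts: List[str] = []
    for _ in range(max_consecutive_failures):
        try:
            return tool()  # success ends the retry loop immediately
        except Exception as exc:
            attempts.append(f"{type(exc).__name__}: {exc}")
    raise StopAndReport("Stopped after repeated failures. Tried: " + "; ".join(attempts))
```

The design choice worth noticing is that the failure path ends with a report of what was attempted, so the human picks up with context rather than a dead run.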
Without instructions, agents default to the worst option. They keep trying, looping, burning tokens until something works or you kill it. One sentence fixes this: "If you're missing information you need, stop and ask before continuing." Or, "If a tool call fails twice in a row, stop and report what you tried." Define how the agent handles uncertainty, or it will define it for you, expensively.
Here's what a contract looks like end to end.
Goal: Build a single-page landing site for my SaaS launch next week, optimized to convert visitors into email sign-ups above the fold, ready for me to review locally before deploying.
Constraints: Single index.html file, inline CSS, no JavaScript frameworks, no external CDN scripts, light theme only, mobile responsive, don't touch any file outside the landing folder, don't deploy anywhere.
Format: A landing folder containing an index.html plus a brief.md that lists every section in order with the headline copy used.
Failure: If the target audience or core value prop is unclear from the project notes, stop and ask one consolidated question rather than guessing.
That's a contract, five sentences. The agent now knows the outcome, the rails, the deliverable, and the escape hatch. 10 minutes later, you have a real landing page instead of a mystery. Try it on your next agent run.
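If you write contracts often, a tiny template keeps you from skipping a section. This is a sketch in Python, assuming nothing beyond the four sections named in this video. The `PromptContract` class and its `render` method are hypothetical helpers, and the example text is shortened from the contract above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PromptContract:
    goal: str
    constraints: str
    format: str
    failure: str

    def render(self) -> str:
        """Build the prompt text, refusing to proceed if any section is blank."""
        sections = []
        for f in fields(self):
            value = getattr(self, f.name).strip()
            if not value:
                raise ValueError(f"prompt contract is missing its '{f.name}' section")
            sections.append(f"{f.name.capitalize()}: {value}")
        return "\n\n".join(sections)


contract = PromptContract(
    goal="Build a single-page landing site for my SaaS launch, ready for local review.",
    constraints="Single index.html, inline CSS, no external CDN scripts, don't deploy anywhere.",
    format="A landing folder containing index.html plus a brief.md listing every section.",
    failure="If the target audience is unclear, stop and ask one consolidated question.",
)
prompt = contract.render()
print(prompt)
```

The only thing the template enforces is completeness. Whether each section is specific enough is still on you.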
Take a task, write the four sections before you hit enter. The first time you do it, it'll feel like overhead. By the third run, you'll wonder how you ever briefed an agent any other way. Description versus contract, that's the whole shift.
Here's the moment most people give up on agents.
They run a task, the agent does something dumb, they correct it, they run another task, same dumb thing, 10th time in a row. People assume the agent is broken. The agent isn't broken. You just never told it to remember. Every serious platform has the same feature with different file names. Claude Code reads CLAUDE.md, OpenClaw reads AGENTS.md, Antigravity has its own version. Same idea everywhere: a plain text file the agent reads at the start of every session before it touches anything. Whatever's in that file becomes a rule it follows forever. Drop it in the root of your project and you've just given your agent a long-term memory.
A few weeks ago I was building a customer dashboard with Claude Code.
Every single run the agent slipped emojis into customer-facing copy.
Confirmation messages, error states, button labels, little smileys and rockets everywhere. I'd strip them out, next session they came back. So, I just told the agent, "Create a CLAUDE.md file in the root of this project and add one rule: never use emojis in customer-facing copy unless I explicitly ask. This is a B2B product." It spun up the file, dropped the rule in, saved it.
Took about 10 seconds. That bug never showed up again in any project, ever.
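Under the hood there is nothing magic here: the memory is a text file that gets read and appended to. The Python sketch below shows roughly what "add a rule to the memory file" amounts to. The `append_rule` helper and the `## Learned rules` heading are assumptions for illustration. In practice you just ask the agent in plain English and it edits the file itself.

```python
from pathlib import Path

SECTION = "## Learned rules"  # hypothetical heading kept at the bottom of the file


def append_rule(memory_file: Path, rule: str) -> bool:
    """Add a rule under the learned-rules section; return False if it is already there."""
    text = memory_file.read_text(encoding="utf-8") if memory_file.exists() else ""
    line = f"- {rule.strip()}"
    if line in text.splitlines():
        return False  # avoid piling up duplicate rules across sessions
    if SECTION not in text:
        # First rule: create the section, keeping any existing notes above it.
        text = (text.rstrip() + "\n\n" if text.strip() else "") + SECTION + "\n"
    elif not text.endswith("\n"):
        text += "\n"
    memory_file.write_text(text + line + "\n", encoding="utf-8")
    return True
```

For example, `append_rule(Path("CLAUDE.md"), "Never use emojis in customer-facing copy unless explicitly asked.")` would create the file on the first call and do nothing on a repeat call with the same rule.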
Now, layer on the real trick. Make the file self-modifying. Ask the agent to add this rule: "Before you finish any task, if I corrected you or you hit a bug from a wrong assumption, append a new rule to the learned rules section at the bottom of this file." Now, the agent updates its own memory. Session one, you have one rule. Session five, you have 20. Session 20, the agent rarely makes a preference mistake because it's been writing its own scar tissue the whole time.
Three things to remember. First, the prompt contract. Goal, constraints, format, failure. That's what turns an agent from an expensive chatbot into something that ships. Second, memory. One memory file in the root of your project, CLAUDE.md, AGENTS.md, whatever your platform uses, and the agent stops making the same mistakes session after session. Third, the four platforms each win at something different. Pick the one that fits your work and go deep.
Here's your action plan. Open one of the four platforms tonight. Pick a real task, not a toy. Write a prompt contract. Add a memory file with three rules, CLAUDE.md, AGENTS.md, whatever fits your platform.
Run it end to end. Stop reading about AI agents. Start running them. Your future self will thank you.