Claude Code subagents are designed for context isolation, not parallelism. That makes them a good fit for read-heavy tasks such as code exploration, test execution, and documentation review, where the output is verbose but the conclusion is concise. They have two main drawbacks: context amnesia (since v2.1.84, subagents no longer inherit CLAUDE.md) and high token costs (15,000-25,000 tokens per spawn). The recommended strategy is to run subagents on a cheaper model such as Haiku, reserve the expensive models for the main session, and use subagents only where isolation provides genuine value, not for role-based personas, which offer no real benefit.
Deep Dive
Claude Code Subagents Are Mostly a Scam. Here's What Actually Works.
Every time you spawn a Claude Code subagent, it forgets your entire project.
No conversation history, no CLAUDE.md, no coding standards, no memory of anything you and Claude have done together for the last 3 hours. Just a fresh empty window and one prompt that the parent decided to hand it. People got mad about this online. The angriest Hacker News comment has 800 points and calls subagents incredibly unreliable.
And those people are wrong, because that amnesia isn't a bug. It's literally the entire reason this thing exists. By the end of this video, you'll understand why context erasure is the feature, the one subagent pattern that's worth your money, the one that's a complete scam, and the cheapest trick in the book to keep these things from eating your monthly budget alive.
A subagent is a fresh Claude session that the main one can spawn, hand a single prompt to, and get one summary back from. Different context window, different system prompt, different tools, optionally a different model. The parent doesn't see what the kid did, only what it reports. If you've ever used fork and exec in a shell, that's it. The parent launches a subprocess, the kid does its thing in its own memory, prints a result, and dies. No shared brain, no threads, no microservice, just: here's a prompt, go cook, come back with a sentence.
You define one with a tiny markdown file. YAML on top: name, description, maybe a model and a tool list. The body is the system prompt. Drop it in .claude/agents, done. Claude reads the description at startup and decides when to delegate.
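To make the file layout concrete, here is a minimal sketch in Python that writes one such definition to disk. The agent name, description, prompt text, and tool names are hypothetical, chosen only for illustration. The frontmatter keys follow the fields the video lists (name, description, model, tools), and the comma-separated tool list is an assumption about the format.

```python
from pathlib import Path
import tempfile

# Hypothetical agent: the name, description, and prompt text are made up
# for illustration. The frontmatter keys follow the fields named above.
AGENT_DEFINITION = """\
---
name: test-runner
description: Runs the test suite and reports only the failing tests.
model: haiku
tools: Bash, Read
---
You run the project's tests. Reply with the number of failures,
the names of the failing tests, and nothing else.
"""


def write_agent(project_root: Path, filename: str, definition: str) -> Path:
    """Write an agent definition under <project_root>/.claude/agents/."""
    agents_dir = project_root / ".claude" / "agents"
    agents_dir.mkdir(parents=True, exist_ok=True)
    path = agents_dir / filename
    path.write_text(definition, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Uses a temporary directory so the sketch does not touch a real project.
    with tempfile.TemporaryDirectory() as tmp:
        created = write_agent(Path(tmp), "test-runner.md", AGENT_DEFINITION)
        print(created.relative_to(tmp))
```

The part that matters is the text of the file, not the script: the frontmatter is what Claude reads to decide when to delegate, and the body is all the subagent knows about its job.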
Now, here's the part most tutorials get wrong. They sell you on parallelism. Run 10 agents at once, ship faster. That's the sizzle, not the steak. The actual win is context isolation. Anthropic's own docs say it first, and the community took 10 months to catch up. Your main Claude session has a 200,000-token context window. Quality starts dropping around two-thirds full. So when you tell Claude to run the test suite and the failing tests dump 40,000 tokens of stack traces into your conversation, that's not test output anymore. That's brain damage. Send that work to a subagent, and the noise stays in the kid's window. The parent gets back: 23 tests failed, all in auth, here are the names. 200 tokens instead of 40,000. You just saved your main agent from drowning in its own logs. That's the whole pitch. Parallelism is a side effect.
Okay, but the bill? Yeah, about that. Each subagent is a full Claude conversation from scratch. New system prompt, new tool schemas, new everything. Before it does any actual work, you've already burned 15,000 to 25,000 tokens just turning the lights on. Spawn three of them at once, and you're at three times that before a single character of code is written. This is why heavy users on Reddit are reporting that 85% of their monthly usage gets eaten by subagents. One developer burned through a thousand dollars of Claude credits in a single evening. Fan-out is great until you look at the meter.
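The numbers above are easier to weigh side by side. This is a rough back-of-the-envelope sketch using only the figures quoted in the video; the constant and function names are made up for illustration, and the figures are estimates, not measurements.

```python
# Figures quoted in the video; treat them as rough estimates.
CONTEXT_WINDOW = 200_000           # main session context window, in tokens
RAW_TEST_OUTPUT = 40_000           # stack traces dumped inline
SUMMARY = 200                      # what the subagent reports back
SPAWN_OVERHEAD = (15_000, 25_000)  # startup tokens per subagent (low, high)


def window_share(tokens: int) -> float:
    """Fraction of the main context window consumed by `tokens`."""
    return tokens / CONTEXT_WINDOW


def spawn_cost(n_agents: int) -> tuple[int, int]:
    """Low and high startup token cost for spawning `n_agents` subagents."""
    low, high = SPAWN_OVERHEAD
    return n_agents * low, n_agents * high


if __name__ == "__main__":
    print(f"inline test run: {window_share(RAW_TEST_OUTPUT):.1%} of the window")
    print(f"delegated run:   {window_share(SUMMARY):.1%} of the window")
    low, high = spawn_cost(3)
    print(f"three spawns:    {low:,} to {high:,} tokens before any work")
```

With these figures, the inline run takes 20% of the parent's window and the delegated summary takes 0.1%, while three spawns cost 45,000 to 75,000 tokens up front. The saving is in the parent's context, not in total tokens billed, which is why the meter still matters.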
The fix is one line of YAML. Run your main session on Opus or Sonnet. That's where the real reasoning happens. But pin your test runner, your file explorer, your doc reader to Haiku 4.5. Three times cheaper, and on basic search and execution, it scores within four points of Sonnet on SWE-bench. Anthropic literally shipped Haiku as the subagent tier. Explore, the search subagent that comes pre-installed, runs on Haiku by default.
Boris Cherny, the guy who built Claude Code, has been pushing this pattern for months. Heavy model plans, light model executes. One developer reported shipping four to five times more work inside the same weekly cap by adding model: haiku to the right places.
That's it. That's the tip. Okay, now the angry part, because not everything in this ecosystem is real. Open the awesome subagents repo on GitHub and you'll find over a hundred pre-made personas. Senior engineer, product manager, front-end architect, database admin, whole AI org charts. People are downloading entire fake companies and dropping them into .claude/agents. Bad news: most of those are theater. The model has no special memory of being a senior engineer. Telling Claude it's a senior engineer doesn't make it a senior engineer. It makes it sound like one. There's a great Hacker News comment that nails it: this is like launching an entire team of fake-it-till-you-make-it employees. And the costs are real. Twelve role-named agents fight each other for routing, balloon your description-loading overhead, and produce zero quality lift. Congratulations, you built a Kanban board for a single LLM.
What actually works isn't the persona. It's the mechanical isolation. A read-only subagent that only has Read, Grep, and Glob in its tool list is genuinely different from your main agent. A test runner that returns only failures is genuinely different. The instructions matter less than the guardrails. The rule: make subagents for tasks, not roles.
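One way to see the difference between a guardrail and a persona is to check agent definitions for a restricted tool list. The sketch below is a hypothetical helper, not part of Claude Code. It assumes flat key: value frontmatter with a comma-separated tools field, and it treats a missing tools field as "no restriction". Both sample agents are invented for illustration.

```python
READ_ONLY_TOOLS = {"Read", "Grep", "Glob"}


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` frontmatter between two `---` lines.

    Deliberately minimal: no nesting, lists, or quoting. A real YAML
    parser would be the better choice outside a sketch.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def is_read_only(agent_text: str) -> bool:
    """True if the agent declares tools and all of them are read-only."""
    tools_field = parse_frontmatter(agent_text).get("tools", "")
    tools = {t.strip() for t in tools_field.split(",") if t.strip()}
    return bool(tools) and tools <= READ_ONLY_TOOLS


EXPLORER = """\
---
name: code-explorer
description: Searches the codebase and reports file paths and findings.
tools: Read, Grep, Glob
---
Search only. Never modify files.
"""

PERSONA = """\
---
name: senior-engineer
description: A senior engineer persona.
---
You are a senior engineer.
"""

if __name__ == "__main__":
    print(is_read_only(EXPLORER))  # True: every listed tool is read-only
    print(is_read_only(PERSONA))   # False: no tool list, so no guardrail
```

The explorer is constrained by what it can do, whatever its prompt says. The persona is constrained only by its prompt, which is the distinction the video draws between tasks and roles.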
Then there's the second thing nobody warns you about. Subagents don't inherit your CLAUDE.md anymore. Since Claude Code 2.1.84, it's actively stripped behind a feature flag. So your beautifully written project rules, your conventions, your house style: the kid never sees them. It's flying blind in your codebase, hallucinating your architecture from scratch. The community calls it context amnesia. It's why so many people say their subagents return code that looks like the right answer to a different project, because that's exactly what it is.
And third, when one of them goes off the rails, you can't watch it happen. The parent only sees the final report. If the subagent spent 20 minutes solving the wrong problem, you find out at the end, after paying for it.
Anthropic noticed all this. On May 6th, they doubled the 5-hour rate limits for Pro and Max plans, removed peak-hour throttling, and funded the whole thing with a SpaceX deal for 220,000 GPUs. Sounds great. Read the fine print. Weekly caps weren't touched. So if you're running subagents around the clock, you'll still hit the wall, just a little later in the week. There's also a quiet new flag: CLAUDE_CODE_FORK_SUBAGENT=1. It lets a child share the parent's prompt cache, dropping cost per child by roughly 10 times. If you're going to fan out, fork instead of spawn.
So when does this thing earn its keep? Use subagents for read-heavy fan-out: exploring a codebase, verifying a diff, running tests, hitting docs, anything where the output is loud and the conclusion is short. Skip them for inner-loop feature work, writing code where the next decision depends on the last decision. That's exactly where context amnesia hurts the most. Use them to route execution to a cheaper model. That's free money. Skip them for everything you could solve with a single clear prompt. If you don't need isolation, you don't need a subagent. Just ask Claude.
The honest answer: subagents are a real primitive doing narrow work. They're a context firewall with parallelism as a bonus. They're not an AI workforce. They're not a senior team, and 100 markdown files in your config folder isn't a strategy. It's a graveyard.
Use them where it hurts to do without them. Skip them everywhere else, and keep an eye on the meter.