
Subsidised Coding Agent Prices Won't Last. Here's the Play.
Added: 2026-05-18

101 views105:16 · brainqub3 · Original Release: 2026-05-11

Frontier AI coding agents operate on venture-subsidized pricing that cannot sustainably cover their all-in costs, which makes multi-agent system design essential for long-term viability. The approach uses frontier models like Claude Opus 4.7 only for orchestration tasks that genuinely require high reasoning, deploys cheaper open-source or mid-tier models for specialist sub-agents, and implements agent skills that encode workflows and reduce token consumption by preventing agents from reasoning from zero on every run.

[00:00:00]If you're running coding agents or building a business on top of them, the pricing you are building on is not sustainable.

[00:00:06]Here's what I think you should do about it. I kicked off my coding agent on Saturday morning after a UAT run, same as I usually do. My son and I went to the park. I had GitHub open on my phone the whole time, watching it intermittently as my agent pushed through the issue list and raised PRs one by one. It's a remarkable thing. The productivity I get from this system costs a few hundred dollars a month; I would have to pay far more for a human engineer to match it. But there's always this uneasy feeling underneath. I know there's no way the frontier labs are making money on me at these prices. The all-in economics (training, compute, infrastructure) do not work at the current consumer subscription rates. These prices are venture subsidized. And standing at the park, watching the PRs come in, the question I kept circling back to was, what happens when the subsidy runs out?

[00:00:54]That question got sharper for me a few months before Anthropic released Opus 4.7. Tasks I ran routinely in Claude were consuming far more quota than I expected. Claude Code felt lobotomized. I noticed it and I couldn't find a clear explanation. Then Anthropic published a postmortem in April 2026.

[00:01:13]They admitted that between March and mid-April, three product-layer changes had degraded Claude Code. Reasoning effort was dropped from high to medium to reduce compute costs, without telling anyone. A verbosity cut caused a measured 3% drop in coding evals. And a caching bug wiped thinking history between turns. It was resolved, but the implication was not subtle. A major frontier lab under real compute pressure dialed down the capabilities paying customers were getting. They did not announce any of this. What do you do?

[00:01:44]The obvious move is open source. The model intelligence gap at the frontier has been closing. Epoch AI measures the average lag at roughly three and a half months.

[00:01:52]On the Artificial Analysis Intelligence Index as of May 2026, Kimi K2.6, the top open-weights model right now, from Moonshot, scores 54. GPT-5.5 leads at 60. Claude Opus 4.7 Max scores 57. On SWE-bench Verified specifically, Kimi K2.6 hits 80.2, Opus 4.7 hits 87.6, and GPT-5.5 hits 88.7. The gap is narrowing, so the temptation is to swap out your frontier model for Kimi K2.6 and keep going. Here's the problem.

[00:02:24]Kimi K2.6 is a trillion-parameter mixture-of-experts model with 32 billion active parameters per token.

[00:02:32]To run it yourself, you are looking at eight H200 GPUs at FP8 quantization with a reduced context window. That is around $21,000 a month in cloud GPU costs.
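As a rough check on that figure, the arithmetic can be sketched in a few lines. This is only an illustration: the $21,000 per month and the eight GPUs come from the talk, while the 730 hours per month and the function name are assumptions introduced here, and the per-GPU rate is derived from those numbers rather than quoted from any provider.

```python
# Back-of-envelope check of the self-hosting figure quoted in the talk.
# Assumption: a month is treated as 730 hours (8760 hours / 12).
HOURS_PER_MONTH = 730


def implied_hourly_rate_per_gpu(
    monthly_cost_usd: float,
    gpu_count: int,
    hours_per_month: float = HOURS_PER_MONTH,
) -> float:
    """Return the per-GPU hourly rate implied by a monthly bill (hypothetical helper)."""
    return monthly_cost_usd / (gpu_count * hours_per_month)


monthly_cost = 21_000  # USD per month, figure from the talk
gpus = 8               # H200 count, figure from the talk

rate = implied_hourly_rate_per_gpu(monthly_cost, gpus)
annual_cost = monthly_cost * 12

print(f"Implied rate: ${rate:.2f} per GPU-hour")
print(f"Annualised:   ${annual_cost:,} per year")
```

Under these assumptions the quoted bill works out to roughly $3.60 per GPU-hour, or $252,000 a year, before any engineering time spent keeping the deployment running.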

[00:02:45]For a small business, that is not an alternative to frontier subscriptions.

[00:02:49]That is frontier pricing with infrastructure to maintain on top. There are smaller open-source models, and I do think they have a role. I will come back to that. But the thesis that open-source models are a straight replacement for frontier capability is wrong, unless you have frontier-lab budgets for model serving. So, what's the actual move? My strongest conviction right now is that multi-agent design is the answer. Not building a new harness from scratch.

[00:03:14]Harnesses like Claude Code and Codex already support sub-agents. Many open source harnesses do the same.

[00:03:21]The move is getting more intentional about how you allocate token spend across your agent systems. Here's the concrete version. If you have an agent that creates social media content, does it need Opus 4.7 at max reasoning to write a LinkedIn post?

[00:03:37]It does not. What makes sense instead is a system where the orchestrator is Opus at high reasoning, because the orchestration is genuinely hard and that is where the frontier intelligence pays off. But underneath, the specialist sub-agents are cheaper: Sonnet for writing the actual post, a mid-tier model for web research, another for fact-checking, and another for quality assurance. You are not spending expensive tokens on tasks that do not need expensive tokens. And here is where the small open-source models come back in. The sub-agent layer is actually where a well-tuned, narrow open-source model can do real work. Leave orchestration to the frontier, and let open source handle the specialist tasks.
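One way to picture that allocation is as a small routing table. The sketch below is hypothetical: the tier labels, task kinds, and function names are placeholders invented for illustration, not real model identifiers or any harness's API, and which specialist task goes to which cheaper tier is an assumption.

```python
from dataclasses import dataclass

# Hypothetical tier labels; placeholders, not real model identifiers.
FRONTIER = "frontier-high-reasoning"
MID_TIER = "mid-tier"
SMALL_OPEN = "small-open-weights"

# Illustrative mapping from task kind to tier, following the split in the talk:
# frontier for orchestration, cheaper tiers for specialist sub-agents.
ROUTING_TABLE = {
    "orchestrate": FRONTIER,
    "write_post": MID_TIER,
    "web_research": MID_TIER,
    "fact_check": MID_TIER,
    "quality_assurance": SMALL_OPEN,  # assumption: a narrow tuned model suffices
}


@dataclass(frozen=True)
class Task:
    kind: str
    prompt: str


def pick_tier(task: Task) -> str:
    """Return the tier for a task; unknown kinds fall back to the frontier tier."""
    return ROUTING_TABLE.get(task.kind, FRONTIER)


tasks = [
    Task("orchestrate", "Plan this week's LinkedIn content"),
    Task("web_research", "Find recent coverage of agent pricing"),
    Task("write_post", "Draft the post from the research notes"),
    Task("quality_assurance", "Check tone and length"),
]

for task in tasks:
    print(f"{task.kind:18} -> {pick_tier(task)}")
```

Falling back to the frontier tier for unrecognised task kinds is a conservative design choice in this sketch: anything not explicitly judged cheap stays on the strongest model until someone decides otherwise.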

[00:04:19]That framing changes how you design everything. Stop thinking, "which model should I use?" and start asking, "which tasks actually require frontier intelligence, and which ones do not?"

[00:04:29]Treat tokens as a scarce resource you allocate deliberately across the system.

[00:04:34]Frontier for orchestration, cheaper for everything else. The other piece of this is agent skills. Sub-agents alone reduce costs, but when you combine sub-agents with agent skills, which are defined workflows the agent follows rather than trying to figure everything out from scratch, you cut the trial-and-error cycle that inflates costs in the first place. Instead of the agent reasoning its way from zero to a working process on every run, the skill encodes that process.
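The idea of a skill as an encoded process can be sketched generically. This is not the actual Claude skills format; `Skill`, `run_skill`, and the stand-in executor are hypothetical names used only to show an agent walking a fixed list of steps instead of planning the workflow on every run.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Skill:
    """A named, fixed sequence of steps (hypothetical structure)."""
    name: str
    steps: Tuple[str, ...]


def run_skill(skill: Skill, execute_step: Callable[[str, str], str], task_input: str) -> str:
    """Run each step in order, feeding the previous step's output into the next."""
    result = task_input
    for step in skill.steps:
        result = execute_step(step, result)
    return result


linkedin_post = Skill(
    name="linkedin_post",
    steps=(
        "outline the key points",
        "draft the post",
        "check claims against the sources",
    ),
)


def fake_execute(step: str, text: str) -> str:
    # Stand-in for a model call; it only records which step ran.
    return f"{text} -> [{step}]"


print(run_skill(linkedin_post, fake_execute, "topic"))
```

In a real system `execute_step` would call a model, and the saving would come from the steps being fixed up front, so the model spends tokens on executing each step rather than on rediscovering the process.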

[00:04:59]Token spend goes down, and output quality goes up, because the agent is executing a known pattern rather than trying to improvise. I have a Claude skills course linked below if you want to understand how to think about Claude skills correctly and how to build and deploy them properly. That's it. See you on the next one.

#claude ai #ai agents #claude mcp #model context protocol #mcp servers
