Claude Opus 4.8 brings several improvements: dynamic workflows for large-scale problems, a fast mode that runs at 2.5x the speed and is 3x cheaper than it was for previous models, honesty training that reduces unsupported claims, and substantially lower rates of misaligned behavior. The model is around four times less likely than its predecessor to let flaws in code it has written pass unremarked, and it reaches new highs on pro-social traits such as supporting user autonomy. Key features include running hundreds of parallel sub-agents in a single session, verifying outputs before reporting back, and an effort control that lets users choose how much effort goes into a response. Pricing is unchanged from 4.7, with improved performance across coding, knowledge work, and agentic tasks.
Deep Dive
NEW: Claude Opus 4.8 + New features!
So, Opus 4.8 just got released. We're going to go through it, along with some features that come with this announcement, and we might just have a timeline for Claude Mythos as well. Obviously, 4.8 builds on 4.7 with improvements across benchmarks. It's available today at the same price, which is always nice to see, right? There's kind of a pattern where, as these models get better, the price tends to go up, for now at least. Once that's solved it will become very cheap, and hopefully some of those deals with SpaceX and all those other companies help with this.

Claude Code has a new dynamic workflows feature, which we'll talk about shortly, that allows it to tackle very large-scale problems. They also have a fast mode for 4.8. I know a lot of users say that 4.7, or just Claude in general, is a little too thorough, right? You just want the information, the nugget you need, and instead it's very thorough and asks all these questions that sometimes bog down the user and get annoying, frankly. So it's nice to see that there's a fast mode now, where the model can work at 2.5x the speed and, which is even better, is three times cheaper than it was for previous models. Which is awesome.
Let's look at the benchmarks again, just roughly. As we can see, there obviously is a leap. The basic indication is that these models are getting better, right? So we have agentic coding, and we'll talk about some of those features. Agentic terminal coding did go up, which is fantastic to see for those of us using it for coding. That's probably the biggest leap. For computer use, as we can see, not too much. And then knowledge work: as we've always said, we're seeing the depth of knowledge work in real time, and we do have a bit of a jump in the number of tasks it's able to do. More importantly, though, they say 4.8 is better at this thing called honesty.
"We train all our models to be honest, for instance, to avoid making claims that they can't support." Right? As we all know, with some of these models from some of these companies, it's a bit of a yes-man, where they just agree with everything. But honesty, as they say here, is a little hard to accomplish, especially on the back end with all the weights and stuff. Nonetheless, it seems like they've done a better job with 4.8 than with 4.7. I've said this before: sometimes they say they do it and they don't do it, and it just creates a bunch of confusion.
Apparently, 4.8 is better at that. "Opus 4.8 is around four times less likely than its predecessor to allow flaws in code it has written to pass unremarked," which is great to see. "Our alignment team concluded that Opus 4.8 reaches new highs on our measures of pro-social traits like supporting user autonomy. The assessment also showed Opus 4.8 to have rates of misaligned behavior, such as deception or cooperation with misuse, that are substantially lower than 4.7."

This right here is very big, in my opinion. Again, it's very hard to align these models and to mitigate misaligned behavior. However, it looks like they're doing a good job at that. For instance, misaligned behavior for 4.7 was quite high. Sonnet 4.6 was the highest, but 4.7 was still very high. I would be interested to see 4.6. If we look at Mythos and 4.8, 4.8 is pretty close to Mythos.
Okay, so again, they are solving for this in real time. Now, some of the features that accompany this announcement.

Dynamic workflows. This new feature, currently available in research preview, allows Claude to take on even bigger tasks in Claude Code. Claude can plan the work and then run hundreds of parallel sub-agents in a single session. I think that's very cool. As we know, with each release these models get better and better, so now you can actually use AI, in my opinion, like a true partner: a thought partner you can spar with, if you will. This plan-first approach started coming in a few models ago, and I love it. I've always been a big advocate of planning first. You always used to do it manually, but now the model is doing it for you, which is nice. And now you can run hundreds of parallel sub-agents in a single session. With every new model, it looks like the run time is even longer than before. It then verifies its outputs before reporting back to the user, just like any good employee should. And if you want, there's more about this feature right over here.

Effort control in claude.ai and Cowork has finally come out. We knew that with 4.7 they took a new direction when it came to effort and how these models work through tasks. Now a new control alongside the model selector lets users choose how much effort Claude puts into a response. Cowork does take up more tokens, so it's nice to be able to determine where our tokens are being spent. When it comes to effort, 4.8 defaults to high effort, "which we judge to be the best overall balance of quality and user experience on coding tasks." This effort level spends a similar number of tokens as Opus 4.7's default, but with better performance.
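Circling back to dynamic workflows for a second: the plan, fan out, verify loop described for that feature can be sketched roughly like this in Python. This is only an illustration of the general pattern, not how Claude Code actually implements it. Every name here (`plan`, `run_subagent`, `verify`, `run_workflow`, `max_parallel`) is a hypothetical stand-in, and the "sub-agent" is a stub that does no real model call.

```python
import asyncio


def plan(task: str, n: int) -> list[str]:
    """Hypothetical planner: split one big task into n numbered subtasks."""
    return [f"{task} [part {i + 1}/{n}]" for i in range(n)]


async def run_subagent(subtask: str) -> str:
    """Hypothetical sub-agent stub; a real one would call a model here."""
    await asyncio.sleep(0)  # yield to the event loop, standing in for real async work
    return f"done: {subtask}"


def verify(subtask: str, result: str) -> bool:
    """Hypothetical check: the result must match the subtask it was produced for."""
    return result == f"done: {subtask}"


async def run_workflow(task: str, n: int, max_parallel: int = 50) -> dict:
    # 1. Plan the work up front.
    subtasks = plan(task, n)

    # 2. Fan out to many sub-agents, capping how many run at once.
    limit = asyncio.Semaphore(max_parallel)

    async def guarded(subtask: str) -> str:
        async with limit:
            return await run_subagent(subtask)

    results = await asyncio.gather(*(guarded(s) for s in subtasks))

    # 3. Verify each output before reporting back.
    verified = [r for s, r in zip(subtasks, results) if verify(s, r)]
    return {"planned": len(subtasks), "verified": len(verified), "results": verified}


if __name__ == "__main__":
    report = asyncio.run(run_workflow("migrate the codebase", 200))
    print(report["planned"], report["verified"])
```

The semaphore is just one assumed way to keep hundreds of concurrent workers from all running at the same moment. The point is the shape: plan first, run in parallel, check the results, then report.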
Users can choose xhigh (extra high) in Claude Code, or max, and the model will spend more tokens to get better results.

All right, so what's next? Mythos sounds like it's around the corner, given what I'm about to tell you here. "First, we're working on developing and releasing models that provide many of the same capabilities as Opus, but at a lower cost." That's always great for the user experience. This is important. Not only that, but "we plan to release a new class of model with even higher intelligence than Opus." This is obviously Project Glasswing helping to pioneer this new model. "A small number of organizations are currently using Claude Mythos Preview for cybersecurity work. We're making swift progress on developing these safeguards and expect to be able to bring Mythos-class models to our customers in the coming weeks." So potentially within one to two months, we will have the Mythos class being brought to the public.

When it comes to availability, you can access Opus 4.8 right now. I just closed my Claude desktop app, turned it back on, and it seemed to be there.
Not sure if you have to go through an update or not, but pricing is unchanged. Lovely. And you can now use 4.8 via the Claude API. If you have any questions, put them below. Subscribe.
Stay human.