
Anthropic thinks you're an idiot
Added: 2026-05-05

32,175 views • 940 • 14:28 • NeetCode • Original Release: 2026-04-17

NeetCode provides a necessary reality check by exposing how "adaptive thinking" functions more as a cost-saving mechanism than a genuine leap toward AGI. This critique highlights the growing disconnect between inflated corporate benchmarks and the actual stagnation in model reasoning capabilities.

[00:00:00]Anthropic just released Claude Opus 4.7 and that means we have achieved AGI.

[00:00:05]Coding is dead. LeetCode is dead.

[00:00:07]Everything is dead. Vim is dead. The benchmarks have been absolutely crushed.

[00:00:12]Now, in seriousness, I'll just go ahead and say it that this model is not that significant. I don't believe that this is a very significant moment. If you're an AI hater like I used to be or at least how people saw me, you might be wondering, well, when is there ever a moment or a model that releases that actually matters? Well, last year, I think when Opus 4.5 released, I think around November, I think that was a moment that people probably needed to pay attention to, but I don't think that every month or every other week when someone releases a new model, I don't think it's, you know, as big of a deal as most people make it out to be. And I think that this time that is the case. But I think there are a few things that are worth talking about. So I'm going to try to go through them. But I understand that some people are going to think that now I'm some kind of AI hype bro. And I do think that's a little bit funny because what exactly are you guys coding or is anybody coding where LLMs are making the work 10 times as fast? But that's okay. Now we're going to go through this. We're going to look at the benchmarks. We're going to look at what some of the people at Anthropic are saying. We're going to look at what a lot of the users are saying because it's been a very interesting few weeks at Anthropic for their PR. A lot of uh negative and questionable things. So, in case you haven't been keeping up, this is Anthropic's newest model and also the newest one available to the public, but it's not Anthropic's best model. That one is unreleased. It's called Mythos.

[00:01:45]It's the one that they said is too dangerous to release to the public. And I think there's some truth to that in that if it can find really significant uh security issues and stuff, which it already has, a lot of zero-day uh bugs in some pretty big projects like FFmpeg and OpenBSD. And so currently, I think they've only given Mythos access to like some government agencies and some companies. Uh, but this is the one that is for all of us, for all of us dum-dums, because we of course can't be trusted with the good stuff. And so, we're going to look at the benchmarks.

[00:02:19]And I don't want to like completely invalidate these because I think benchmarks are more of an art than a science. Um, but we have seen recently that this study from Berkeley, I won't go through this, maybe it warrants a separate video, but the TL;DR is that basically AI agent benchmarks can be gamed and they have been gamed in the past. And that's something I feel like any reasonable person could tell that it's not just in the way that like they're calculated, but even we've seen sometimes that like the companies uh I think it was either Anthropic or OpenAI, maybe both, maybe Gemini as well, where they show these graphs and they're not even to scale. So it'll kind of show one bar is bigger than the other, but the numbers are like the same or something.

[00:03:07]So there's a lot of BS going around, a lot of hype, and that's just part of the game. But looking at the benchmarks, at least compared to Opus 4.6, it's pretty much better in every category. But there are a few uh for example, agentic search, it's not as good. Cybersecurity vulnerability reproduction, it's slightly worse. The rest it's uh not too different, but when you actually sit down and use it, which I've done a little bit, in my humble opinion, it's just not that different from Opus 4.6 and 4.5. And I think there are some reasons for that. It's functionally different in a few ways. I'll talk about that as well. But when you compare it to, of course, Mythos Preview, it is worse in most of the categories. But even that, I think it's like since this isn't available publicly, and that's when you actually get to evaluate how good these are. I think that's when we'll know like how much of Mythos was actually hype, how much of it was BS.

[00:04:04]I'm not saying it's all hype, but I think that does matter because we know that based on these benchmarks and I've recently become GPT 5.4 pilled. It took me a while to like use it enough to realize that it is absolutely better than Opus in terms of just thoroughness.

[00:04:20]Like if you're just going based off these benchmarks and you see, okay, well, it's worse than Opus 4.7 in agentic coding, you might think that it's just worse in general. But so far from the little that I've used 4.7 and it's been out for like a day, it hasn't been that different in my opinion. And I felt the exact same way about 4.6 as well. To be honest, I feel like 4.5 strangely was the best one. In some ways, it feels like that these models are actually slightly worse. I don't think that's actually the case. I think there's a lot of work that goes into the harness, the harness that is Claude Code, which is mostly how I've been using the Opus models. I used to use it in OpenCode, but they made that not allowed. But either way, I think we can probably expect OpenAI to release their newest version. I think there are actually people already testing that. I'm allowed to say that because I'm one of the maybe few content creators who doesn't get early access to it. Maybe I should request that, but then I won't be able to talk about it. But I'll get to use it early, so maybe that's worth it. There was actually another really important thing worth talking about that they did not really say much about in the announcement. Maybe that was intentional. Again, I'm trying to be generous, but it was adaptive thinking, and it's a bit controversial right now because at least from what I've seen being discussed, there was this Reddit thread and basically if you have this new setting, adaptive thinking, toggled off, then it will basically not think at all. And we know that all the models at this point are thinking. And so some people are mentioning that maybe this is another way that Anthropic is trying to save on compute that actually since the model takes more tokens. That's not really the case. But who knows? I think clearly cost and capacity is an issue and it's probably going to be for the foreseeable future. 
So who knows what kind of tricks that these companies are going to use on us. So for Opus 4.7, adaptive thinking is the only thinking mode and by default it is off. Why did they deliberately make that choice? I can only guess. Basically, instead of you now being able to choose whether the model that you're paying for or using is going to be thinking, Claude is going to make that choice for you. And that sounds really nice, but what if it's wrong? It's wrong all the time. So now, basically, uh, the request that you make, Claude is going to evaluate the complexity of it to determine whether it should use extended thinking or not. And it looks like it is somewhat dependent on the effort level that you set. So if uh you set your effort level to high or extra high, it looks like most likely it's going to be thinking. But at the lower effort levels, it won't do that.

[00:06:56]So at that point, it makes me wonder why even have the adaptive thinking in the first place. I don't know. I can't help but think they're trying to save on cost or something. Unless they're going to clearly communicate to us why they're introducing this, which Anthropic is not the most clear communicators. I'm just not personally buying it. I guess adaptive thinking enables interleaved thinking. Basically, it can think between tool calls. Uh it's kind of interesting the philosophy that Anthropic is trying to follow and not really being transparent about. This part is going to be a bit controversial because they said that we would keep Claude Mythos's preview release limited, but they're also going to test new cyber safeguards on the less capable models.

[00:07:41]And Opus 4.7, I guess, is the one that's smart enough now that they're going to start um limiting it in ways. And our boy Theo actually had a tweet about this because he ran into an issue where there was a lot of false alarms for him as he was playing around with it because he kept running into an issue where Claude uh hallucinated or thought for some reason that he was trying to do some kind of like exploit or find an exploit.

[00:08:06]And of course, this is a legitimate issue because like as AI is getting better, if you can just go find exploits in any type of like software that's running, that's not legal. But if you can still do that, there's going to be obviously issues from it. And so Anthropic is trying to do the right thing, but I think people are always nervous when somebody's trying to play big brother and tell them what like boundaries that they're allowed to stay in and what they can't do. And especially when it goes wrong like this.

[00:08:32]And uh in fairness, Anthropic actually did respond. Our boy Tariq or Thariq said that this was actually an issue in previous versions and then I think once Theo upgraded, he didn't have that issue anymore. And the issue itself was with a system prompt. So it was something that was easily able to be corrected. But this is a feature. This is not a bug.

[00:08:53]This is a feature that's baked into it as we saw with the announcement. This is going to probably be happening going forward with any of the best models that we are mostly using. This one I found a little bit funny. Again, I'm not a hater. I don't hate anybody. Boris, the creator and the person running Claude Code at this point said that Opus 4.7 uses more thinking tokens and I think the number is like roughly 35% more. So, they've actually increased rate limits for all subscribers to make up for that.

[00:09:21]Uh people asked, of course, is this temporary or permanent? And they uh Boris mentioned no plans to change it.

[00:09:28]Um, I'm sure he means that, but it does make us a little bit nervous. If you've followed the things that have happened the last couple weeks, there's been a lot of drama around the limits. And again, trying to be generous in our interpretation of it because right now there is an issue with compute. We've seen all the AI cat videos, but I don't think that that's like the main driver of it. There's a lot of AI code being written now, like exponentially more and more, and it's causing issues uh for a lot of people. Like GitHub is getting more code pushed there than ever before and so they're dealing with a lot of issues uh like their uptime and everything like that. It's easy to just dunk on people. So if Anthropic is literally running out of GPUs, people are using this so much, it's being subsidized so much that people are going to keep using it, it makes sense that they'd have to adjust the rate limits.

[00:10:17]But when we see now that they're actually increasing the rate limits, that they put out a brand new model which I don't think is probably 35% better than the other one, than Opus 4.6 or 4.5, but it's taking 35% more tokens. So if that's going to cost you more money that's an issue, but okay, they say that they're actually increasing the limit so it's actually not going to cost us more money. Okay. I know they're just trying their best but it is kind of confusing as a user. I think the benefit to using uh tools like this is that there's so much competition, you can just very easily switch to Codex.

[00:10:49]There's some minor things like it's supposedly supposed to be better at literally following instructions. So hopefully when you have a prompt, you tell it to do these steps. It doesn't just ignore some of them, which gets pretty annoying after a while. It's supposed to be better at images so it can more clearly interpret images. So hopefully like if you have some text in images, it'll be able to read it more clearly. Looks like there is a slight improvement, but like I said, I don't think this is an inflection point. I don't think this is a moment that is very important. They also released a new effort level between high and max: extra high. And it looks like they're recommending to start with high or extra high. I don't really know 100% if that's because it's best, if that's going to be the most optimal like cost, or maybe they're just running out of inference.

[00:11:37]So, they're recommending that because I know they I think they set by default the effort level to medium. They might have bumped it to high, but I think it was initially medium and a lot of people were pissed at that because they didn't really announce anything. They just kind of like set it to medium without telling anybody. Obviously, they needed to do that to save compute, but obviously if you're paying for something and somebody's just changing the settings without you knowing it kind of would piss you off and that's one of the many reasons that Anthropic has been getting bad PR lately. And so here's the mention about them using an updated tokenizer which uh can end up costing 35% more in terms of like the input tokens. The prices themselves for the Opus models haven't changed for a while. This one doesn't change them either. That's the good news I guess. But if it's obviously 35% more tokens, it's going to cost more. But I mentioned competition is good and what we see is Sam Altman subtweeting Anthropic: "I'm happy everyone is switching to Codex, but Tibo, if you start rate limiting me or making me use worse models dot dot dot." Obviously, that's a dig at Anthropic. The rate limits got worse and Anthropic wasn't the most transparent about it. In the past, they even said that they were never going to do something like that.
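A rough sketch of the arithmetic behind the tokenizer point, in Python. Only the 35% figure comes from the video. The price and token counts are made-up placeholders, and the sketch assumes a flat per-token price that stays the same between models.

```python
# Back-of-the-envelope sketch. All numbers are placeholders except the
# 35% figure, which is the one quoted in the video.

def tokens_after_increase(baseline_tokens: int, extra_percent: int) -> int:
    """Token count for the same text if it now takes extra_percent more tokens."""
    return baseline_tokens * (100 + extra_percent) // 100


def cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost at a flat per-token price."""
    return tokens / 1_000_000 * usd_per_million_tokens


PLACEHOLDER_PRICE = 10.0  # hypothetical USD per million tokens, not a real price
baseline = 1_000_000      # hypothetical tokens used on the older model
inflated = tokens_after_increase(baseline, 35)

old_cost = cost_usd(baseline, PLACEHOLDER_PRICE)
new_cost = cost_usd(inflated, PLACEHOLDER_PRICE)
cost_ratio = new_cost / old_cost

# The same ratio is how much a token-based rate limit would need to grow
# for a subscriber to get the same amount of work done as before.
print(f"tokens: {baseline} -> {inflated}")
print(f"cost ratio at unchanged price: {cost_ratio:.2f}")
```

With the price held fixed, the cost ratio just tracks the token ratio, whatever placeholder price is plugged in.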

[00:12:49]And people are pointing out that the models are somehow maybe getting worse without us knowing it. That's a bit of a conspiracy theory. I think it is technically possible, but I think it's more likely that there's some like other side effects going on. Like maybe they're just doing it because of like the GPU shortage or some other reason because it's not really in Anthropic's own self-interest to make their own stuff worse intentionally, unless there's some kind of hidden motive behind that. One thing people point out is they make the models worse right before there's a new launch so that it feels dramatically better. Could be possible, but I'd be a bit skeptical of that. But on the competition point, I do want to point out I'm glad that OpenAI exists. I'm glad that they even have a slightly different philosophy about this. It sounds like if they were to have something like Mythos cooking in the lab that they would probably be inclined to release it publicly, but even that I'm not 100% sure. At this point, it seems like OpenAI and Claude are just trying to do the opposite of what the other one is doing. We saw that Anthropic wasn't willing for their technology to be used by the US government for certain war things and OpenAI was just happy to fill in that place. I'm not necessarily saying that's a good thing. At the very least, Anthropic seems to care about ethics.

[00:14:06]Maybe they're not doing that in the best way, but I do give them credit for trying to care about security and trying to care about the good of the world. I assume most of their employees do, too.

[00:14:17]But that doesn't stop them from screwing up a million other ways. But anyways, let me know what you guys think of this model. I think it'll probably have some advantages over the previous ones, but I don't expect it to change my life dramatically.
