This video explains how AI-powered vulnerability research systems like Mythos work by combining large language models with expert-designed frameworks and harnesses. The speaker demonstrates that while AI can find vulnerabilities through pattern matching, the real breakthrough comes from building sophisticated multi-step pipelines that automate reverse engineering, driver analysis, and exploit development. The key insight is that AI doesn't replace skilled researchers but rather amplifies their capabilities by automating repetitive tasks, though significant barriers to entry remain due to the expertise required to build effective harnesses and the high costs of running frontier AI models.
Deep Dive
0days galore this week - come talk ai and cybersecurity
Good morning.
How we doing everybody?
Uh, my Claude Code instance just hit 529 errors. So, it's a day that ends in Y.
Gosh, man. It's just impossible to string together a productive few hours on a tool that goes down every day.
What's up, Brandon? RZ, Galaxy, Colin, Kilobyte, how we all doing? You're watching a webinar. No, you're not.
You're watching me.
GitHub Actions is also down. That's really funny because I made a tweet the other day that was like, "Not now, honey. Claude Code and GitHub are up at the same time," because it feels rare.
Um, and here we are. They're down at the same time. That's pretty funny.
Um, yeah, I know you're multitasking.
You know what I was multitasking yesterday? I'm getting scuba certified.
Turns out that's not easy. I didn't know what I was getting into. I've never been scuba diving before. I'm keynoting a scuba-related conference this summer, so I had to go get certified beforehand. A, it's expensive. B, it's going to be like 30 hours of work between this week and next week. Like, what the [ __ ] Like, my part-time job is scuba certification. My job is beach.
My job is beach.
Uh, how we all doing?
Get into underwater welding as a side gig. It pays well, right?
Probably pays well because I would die. Yesterday was just a two-hour stream. Yeah, it's probably going to be the same today too, unless for some reason we get a ton of energy and momentum.
Uh, I'm going to be honest. Be honest.
Suffering a little on the YouTube burnout train. Little bit. Little bit.
You know, putting a lot of time, effort, and honestly money into uh into all this to get like not a lot of juice from the squeeze. Just being a little transparent. Just being a little transparent. It's been uh it's been kind of hard. It's actually impacting business at this point. impacting business at this point. Had some rough conversations this week.
Feeling kind of shitty. Feeling kind of shitty, but we're going to do it. Why am I saying everything twice? Um, so we've got zero days to talk about. No one's going to hang out and listen to me wallow about this [ __ ]
Um, and then if Claude Code being down is a day that ends in Y, how about a consumer edge device vulnerability? Also a day that ends in Y. I don't beat up on a bunch of these vendors too much because I do understand what a target they have on their back, and we all, you know, have put out vulnerable code. The issue with a lot of these things, specifically this PAN-OS vuln on an auth portal, is that if you follow the vendor's own advice, these things just should not be on the internet, right? Like, why are the management portals for these things exposed to the internet so frequently? This was just published today, but we saw active exploitation already.
So this was being exploited as a zero day.
Um let's see.
Is this page not going to load?
Boom. You like the little shoo meter?
Speed meter.
Um, what is the vuln? I haven't even really grokked what the vuln is.
A buffer overflow in the User-ID Authentication Portal allows an unauthenticated attacker to execute arbitrary code with root privileges.
Yeah, that's about as bad as it gets.
Um, it's specifically for the User-ID Authentication Portal service of PAN-OS.
Um, but as per our best practices guidelines, you should be restricting access to trusted internal IP addresses.
Basically, don't put these things on the internet, because these vulnerabilities happen a few times a year at this point.
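As a rough illustration of the "trusted internal IP addresses" advice, here is a minimal Python sketch of an allowlist check. The network ranges and the function name `is_trusted_source` are assumptions made up for this example, not anything from the vendor's guidance, and a real deployment would enforce this on the firewall's management interface rather than in a script.

```python
import ipaddress

# Assumption: example private ranges standing in for "trusted internal" networks.
TRUSTED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_trusted_source(address: str) -> bool:
    """Return True if the address falls inside one of the trusted networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in TRUSTED_NETWORKS)


# Hypothetical source addresses hitting a management portal.
for source in ("10.1.2.3", "8.8.8.8"):
    verdict = "allow" if is_trusted_source(source) else "block"
    print(f"{source}: {verdict}")
```

The point of the sketch is the default: anything not explicitly inside a trusted range gets blocked, which is the opposite of a portal listening on the open internet.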
I think I read, though... Um, yeah. Yeah, Brandon, I thought that you were big on this. Uh, I was reading on Twitter today, though. I think the active exploitation is pretty wild at this point.
All right.
Oh, wait, wait, wait. Did you guys see this? Sorry, I was pulling up Twitter to try to find the active exploitation stuff. Did you see this insurance company that came out to try to cover you when AI messes everything up?
>> The insurance team here at Ki. Your business uses AI, but does your insurance policy cover what happens when AI goes wrong? Your agents are making real decisions now. Sending emails, moving money, and shipping code to production, and sometimes they get it wrong. Today, we're excited to launch AI coverage insurance for when your AI goes rogue. an agent leaking customer data, copyright claims from generated content, and go to corgi.com to get covered.
>> That's pretty funny.
It's pretty funny. When when AI goes wrong, uh I think I was seeing they're going to make bank. Uh I don't know. They might have a lot of payouts. Might have a lot of payouts.
Insurance makes money on the gap between you paying for it and them not paying out.
Um, all right, the Rapid7 blog didn't talk about this, but "exploited in attacks" is the part that I wanted to chat about.
The zero day stems from the buffer overflow.
Uh, limited exploitation has been observed targeting them.
Customers following our best practices are unimpacted.
Um, oh, this is just saying "exploited" in the headline because they say it in the observation. I was trying to find some stats.
Let's see. Can I do this without logging in?
Yeah.
Um, looks like Censys sees a couple thousand of these that would be exposed on the internet. Best practices, what are those? Yeah, totally. Totally. So yeah, anyway, if Censys is only seeing 2,000, this is going to be a rough day for those couple thousand companies.
I'm trying to see if we can like see any details.
But yeah, hopefully we've learned our lesson at this point that we are not to be exposing... Wait, what is this guy saying?
About 107,000 exposures.
Oh, but then the number on these authentication portals is lower. Got it. Got it. Got it.
Port 443. I mean, the reality might be somewhere between this 1,000 and 107,000, right?
Um, if you're not using some of these default ports and you're routing everything through 443, and that authentication portal is exposed to the internet on 443 rather than on these strange ports, then, you know, some subset of those might be part of it.
But that is not the only zero day that we need to worry about today because well, we got to see if we actually need to be worrying about a few of these.
[ __ ] where did it go?
The Apache zero day. Okay, so I saw some conflicting reports here. Any of you who responded... I know some people in chat yesterday were responding to this one.
Uh, were you impacted?
Because I've got some people calling it a nothing burger who I trust because it's not a default setting.
You have to have this mod_http2 thing turned on.
But I saw some people run some numbers.
That's why I wanted to read this one. I don't know what this is, but I saw this come across a threat intel feed.
Um, oh, is this how they found the vuln?
Yeah, this is how they found the vuln.
Okay. So, this is the researcher that found the vuln.
Um, all right. Super interesting. I just, um, I don't know if we need to give a [ __ ] about this or not.
Because if we look at... I'm going to show you why I'm confused, right?
Because this is what I saw out of this: probably vulnerable to this CVE, 52 million servers.
Right. I guess we don't have a ton of... Oh, god damn it.
Not every Apache server.
So, this is every Apache server. I would have thought every Apache server would have been a much bigger number than 52 million. So I assumed that this was... well, "this heat map reflects all of Apache HTTP services." Okay. Okay.
Okay. Okay. All right. So I guess my bad on assuming that. I would have thought... 18 million does not seem like enough Apache HTTP servers in the US.
Um, and so then I was talking to someone this morning about it.
What the [ __ ] is this?
LZ was talking about it and I was like, man, it just still seems like still seems like uh tens of millions of of servers are vulnerable.
This vulnerability is not present in default Apache installations.
Exploitation requires mod_http2 to be explicitly enabled and configured.
Systems running the default module are not vulnerable.
All right. Ubuntu put it out there as like a high priority.
Really? Like, what is this page? Oh, per upstream, this only affects httpd 2.4.66.
From investigation, the core issue is still in earlier releases, but without the memory allocator change in mod_http2, which went in in httpd 2.4.66, it does not result in a double free. Marking releases earlier than resolute as not affected.
Huh.
What up, Kilbite?
I'm lazy.
Why the [ __ ] are you calling me lazy, dude? I don't understand. Only 98 stops today. Wow. It's a It's like a third of yesterday, huh?
"Only thing I was impacted by this week was the Robinhood." Oh, you got that phishing email.
All right. Even this note is actually kind of confusing to me, right?
'Cause it's saying it affects this version, and then it's saying it affects earlier versions if you do this mod_http2 thing.
So some versions are impacted by default, specifically this one, and then some need this mod turned on.
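One way to cut through that confusion is to write the "am I impacted?" question down as an explicit check. The sketch below encodes one possible reading of the two notes read out above: a host only matters if mod_http2 is loaded, and the assumed affected release is the single version named in the Ubuntu note. Both conditions, the `ApacheHost` type, and the sample inventory are assumptions for illustration and should be verified against the actual advisory.

```python
from dataclasses import dataclass

# Assumption: the affected release, taken from the Ubuntu note read out above.
ASSUMED_AFFECTED_VERSION = (2, 4, 66)
# Name mod_http2 is loaded under in an Apache module listing.
HTTP2_MODULE = "http2_module"


@dataclass(frozen=True)
class ApacheHost:
    name: str
    version: str  # e.g. "2.4.66"
    loaded_modules: frozenset


def parse_version(version: str) -> tuple:
    """Turn a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))


def triage(host: ApacheHost) -> str:
    """Classify a host under the assumed conditions above."""
    if HTTP2_MODULE not in host.loaded_modules:
        return "not affected: mod_http2 not loaded"
    if parse_version(host.version) != ASSUMED_AFFECTED_VERSION:
        return "review: mod_http2 loaded, version outside the assumed range"
    return "likely affected: patch or disable mod_http2"


# Hypothetical inventory; a real one would come from asset management.
inventory = [
    ApacheHost("web-01", "2.4.66", frozenset({"http2_module", "ssl_module"})),
    ApacheHost("web-02", "2.4.66", frozenset({"ssl_module"})),
    ApacheHost("web-03", "2.4.58", frozenset({"http2_module"})),
]

for host in inventory:
    print(f"{host.name}: {triage(host)}")
```

The hard part in practice is not this function but filling in the inventory, which is exactly the "which version is running where" question that makes this kind of triage slow at scale.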
I don't know.
I don't know.
Why are you calling me lazy, bro?
Who gives a [ __ ] about a DOS attack? Not me.
Not me.
Um, so yeah, between the Palo Alto one and the Apache one, vuln management is just so tricky because you have to figure out if you're impacted first, which is just nuts, right? At that scale, "what Apache server versions are exposed to the internet" is a very hard question to answer for a lot of organizations. As you saw, there are tens of millions of these HTTP servers. Which ones are vulnerable?
Um, one of the other... Uh, we've talked about the FBI's cyber crime report mentioning how prevalent these romance scams are. Looks like London Police Department put out some stats. I'm trying to go to the website and their DNS is down.
Oh, wait. Are we back up?
Oh, we're back up. Dude, this website was down for the last like two hours. as I was like prepping for stream.
Um, 102 million pounds of romance fraud last year, which for just romance scams is kind of nuts. "Old people looking for love." Not just looking for love. I'm getting DMs from people that I know, whose parents I even know, whose parents are wrapped up in these things, and they just refuse to acknowledge that they are part of a scam.
Is MuddyWater old news? I just saw that headline. One second.
Where was that? I just saw the headline.
No, this is new news for sure. This article came out in the last couple minutes. All right, we'll get to it. Sorry, got distracted. So, um, 10,000 reports of romance fraud equaling 102 million, a 29% increase over 2024.
So yeah, the the giant increase is because it's successful.
Yes. Yeah, I have friends' parents. I have multiple friends' parents. One I know was out 30 grand.
One I know was out 60 grand. For the 30 grand victim, their kids are cutting off all financial access to try to figure out how to get their mom to stop bleeding money out to who they think is some celebrity that they're in a relationship with, or whatever it is, right?
So they are still fully bought in that they are in this relationship, and their kids are the bad guys for turning off their credit cards and bank access and all this kind of [ __ ] and the scammers still find ways to milk more money out of them. Like, she's going to the store and buying gift cards, whatever weird thing. She's finding new ways to send them money, and her kids are in the wrong. It's bad. My friend's parent who lost 60 grand is so embarrassed that it wasn't even reported. So I'd say this is an underreported stat. The real number is worse than this because this is only the ones that actually call the cops after they fall for it. Right.
I hope you're kidding about the Nigerian prince.
Um, 280,000 pounds a day.
Individual victims losing an average of 9,500 pounds.
One victim lost a million pounds.
A million pounds.
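A quick sanity check shows how those figures hang together. This is a back-of-the-envelope Python sketch using only the numbers read out above (102 million pounds total, 9,500 pounds average loss); the 365-day year is an assumption about how the per-day figure was derived.

```python
# Figures as read out on stream.
TOTAL_LOSS_GBP = 102_000_000
AVERAGE_LOSS_GBP = 9_500
# Assumption: the per-day number is simply the yearly total divided by 365.
DAYS_IN_YEAR = 365

loss_per_day = TOTAL_LOSS_GBP / DAYS_IN_YEAR
implied_reports = TOTAL_LOSS_GBP / AVERAGE_LOSS_GBP

print(f"Loss per day: about {loss_per_day:,.0f} GBP")
print(f"Reports implied by the average loss: about {implied_reports:,.0f}")
```

That works out to roughly 279,000 pounds a day, in line with the quoted 280,000, and a little over 10,700 reports, which matches the rounded "10,000 reports" figure.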
That's insane, right? Hey, thanks for subscribing.
Welcome.
This is broken down by region.
Um, people 55 to 74 suffered the greatest financial loss, accounting for almost half of the total.
Uh, I'd be interested if you just did 55-plus, because this isn't an upper limit, right? So, how many people older than 74 are there? Because this sentence leads me to think, oh, there's a ton of younger people falling for it, too. But men submitted a higher number of reports overall. Women experienced greater financial losses.
You don't have a kid. Also, you're cyber. Oh, Roman, you're hanging out on the YouTubes.
What up, dude?
"Um, it's only going to get worse with vocal approximation and..." Oh, yeah, dude.
These scammers aren't even doing any of like the the cool AI deep fakes.
Like they're in easy mode at this point and they're this successful. You're right. This is going to get a lot worse.
All right, MuddyWater. Someone asked about MuddyWater. Let's figure it out. What are we talking about here?
Um, an Iranian state-sponsored hacking group known as MuddyWater, aka Mango Sandstorm, Seedworm, or Static Kitten. So this is not Handala, right?
This is a different group. It has been attributed to a ransomware attack in what has been described as a false flag operation.
Um, although the incident initially appeared to be consistent with a ransomware-as-a-service group under the Chaos brand, evidence points towards it being a targeted state-backed attack that masquerades as opportunistic extortion.
So that's super fascinating. Okay.
Because do I have this? No. No. This is copy fail because I was reading about this chaos thing.
Um like cuz generally I don't really care too much about attribution. I don't like spend a lot of time. I definitely like acknowledge it and we talk about it.
We're like why why does this matter? All this kind of stuff.
Um, but in this specific paragraph and case, this matters a lot, right? Because this is like, hey, is this cyber crime or is this state-backed activity? And even more interesting, is it state-backed activity that for some reason is masquerading as cyber crime to mask the state-backedness of itself? Right.
Um, super interesting, right?
Oh my gosh, guys.
Fresh threat intel.
All right, we'll get to it. This is hot off the press: fresh intel about a ClickFix campaign that's going after macOS utilities. Interesting.
Interesting. Interesting. We're going to um hold off on that.
Um yeah, because I don't generally Can we break out the taxonomy?
Um you mean the naming stuff for Iran?
Uh, Static Kitten would be like their version of Scattered Spider, which is like a CrowdStrike thing. Uh, Mango Sandstorm. So, Sandstorm is Microsoft's naming convention, I believe. So, Sandstorm is, um, Iran, and Static is Iran. You're right. I should bring it up. Hold on.
Uh, let's find the pictures. I need the pretty pictures.
Um, okay. This isn't a pretty picture, but, um, so blank Kitten is CrowdStrike's. Oh, I thought it was the Static. That's funny. Okay.
Um, Palo Alto says Serpens.
Cobalt.
Wait, why don't we have Microsoft here, which is the most popular? That's so funny. See, so even the naming convention helpers leave a bunch of stuff out.
Here we go.
Yeah, they've been saying that they're going to unify for a while.
They haven't. All right. Um, here's the picture. Why do we have this like low resolution picture?
Or is that a Rosetta Stone of threat actors?
Anyway, I'm this is just getting more confusing. It doesn't matter for what we're talking about. The point being super fascinating that a state sponsored group is trying to not appear as a state sponsored group, right? That's super interesting.
The campaign was characterized by a high-touch social engineering phase conducted via Microsoft Teams, it states. Okay, [ __ ] The Hacker News article. This is the actual threat intel report. Okay.
Um, a sophisticated... DRINK. >> Sophisticated intrusion initially appeared to be a standard Chaos ransomware attack. Okay, let's kind of fly through this. Maybe I shouldn't have said [ __ ] The Hacker News article. Maybe we should just read that.
Um, all right. So, if you guys aren't familiar with ransomware-as-a-service stuff, this is what happens on these sites, right? This is how we can kind of track ransomware victims in the threat intel community: you get these sites where they list data that they've stolen, and the ransom goes up on the site, kind of thing. So, that's what you're looking at here.
So, chaos is one of the more popular ones.
Now we're saying, um, initial access via Teams social engineering, a screen sharing session that gets on to... "Are these sites payable?" This Rapid7 one. Now you want the link.
Boop boop.
So the threat actor achieved initial access through social engineering via Teams. It's forever insane to me that external entities can message you on Teams.
Like, I can't get a random DM in Slack.
Why can external entities message you in Teams, even if they're marked as external? I just do not understand why this is a thing.
Like, of course, this is going to be an attack avenue.
I don't get it.
It's like one of the main value ads of these apps versus email is like they're a trusted walled garden, right?
It's so dumb.
All right. So, the threat actor reached out to people via Teams and initiated one-on-one chats with users from a controlled account. During these interactions, the threat actor established a screen sharing session. So, they're pretending to be the help desk or whatever, right? While connected, the threat actor executed basic discovery commands, accessed files related to the victim's VPN configuration, and instructed the user to enter their credentials into locally created text files. I mean, come on.
Just like, "Oh, you're my help desk, and here we are. We're going to do this stuff."
"You have to turn it off, allowing certain domains." And of course, people don't do it. "We used to have random people join our meetings and had to self-police our meetings to remove unknown callers." I mean, it's just insane.
Absolutely insane that that's like a thing.
I remember during COVID when Zoom got really popular, uh like Zoom meeting bombing was a thing. Like a lot of Zoom meetings were somewhat public by default and you could like guess the URL to join the meetings. So people were just like scripting these things. Hashtag mythos. Yeah. Look at this. Look at this hyper sophisticated AI superhacker that's breaking into all these computers.
>> Mythos.
>> Yeah. Mythos. Dude, I hit that soundboard button on a webinar.
on a webinar I recorded yesterday, like a professional webinar.
Why? Why do these people invite me to things? I have no idea.
Credential harvesting.
Um, yeah, because they just trick them into putting their creds into a file.
Um, remote access, payload delivery. Let me guess, DLL sideloading. Yep. Um, lateral movement, extortion, and activity and data leaks. Okay, malware analysis. We don't need to read right now, but I'm super interested in the masquerade.
Is this where we have?
Yeah.
The convergence of technical and contextual evidence is consistent with attribution to MuddyWater with moderate confidence. Yeah, I guess attribution is never foolproof, right?
The observed use of the Chaos ransomware does not indicate a shift in the group's underlying objectives, but rather reflects a consistent effort to obscure operational intent and complicate attribution. So they're basically saying this state-sponsored threat actor is using this cyber criminal playbook to not appear as a state-sponsored hacker. While attribution evasion is a common characteristic of state-affiliated actors, MuddyWater's reported increase in operational activity as early as 2026, primarily involving cyber espionage and potential pre-positioning...
Yeah, war. Because war. Um, the assessment aligns with previously observed behavior.
Okay, so they've done this before. MuddyWater was using, uh, how the hell do you say this out loud? I only ever read it. I never actually hear people say it.
Qilin. I don't know. It's a major, popular ransomware as a service, and I feel like an idiot not knowing how to pronounce it. Following the subsequent public attribution of the incident, it's plausible the group adopted alternate ransomware branding, in this case Chaos.
Um, this doesn't really say why they think it's MuddyWater except for, like, MuddyWater's done this before, right?
May enable the attacker to blur the lines. So they're just saying, like, yeah. Furthermore, the inclusion of extortion and negotiation elements could serve to focus defensive efforts on immediate impact, likely delaying the identification of underlying persistence mechanisms.
Notably, the apparent absence of file encryption despite the presence of Chaos represents a deviation from ransomware behavior. This inconsistency may indicate that the ransomware component functioned primarily as a facilitating or obfuscation mechanism rather than the primary objective of the intrusion.
So I guess, reading between the lines, a lot is riding on the targets. They're not really talking about that a lot here, but the targets and data stolen are probably juicy for MuddyWater, is what I'm assuming.
Reading between the lines.
Reading between the lines.
Um, gosh, that MuddyWater story must have gone out to the journalists early. Rapid7 must have PR'd that out to people, because every news site has an article about it within the same moment.
Does this say anything different?
Uh just kind of pointing towards like it doesn't look like they were trying to gain too many finances.
Um, all right. What else we got?
Gosh, so many like sponsored articles show up in my RSS feeds that you got to kind of like dig through.
Um, do you guys see this DAEMON Tools thing?
What did Kaspersky say about this?
Why is this site not loading?
Um, tampered-with installers for DAEMON Tools, a popular program used to mount disk images as virtual drives.
So, it's a supply chain attack. Hey, finally a supply chain attack not in npm or PyPI, just straight up messing with installers of a program. The malicious version, first observed in April, affected multiple releases.
Operation appears to be targeted. Most victims received only basic information collector designed to gather system data while a second more advanced payload was deployed just to a handful of targets.
That's interesting.
Um, is this site being hugged to death because this threat intel is juicy?
Because none of the pictures are loading.
The attackers likely used initial data collection to profile infected systems and then some special snowflakes got a rat on the second stage.
How did they... more importantly, much more importantly, how did they do this?
How did they poison the installer for DAEMON Tools? Why is that not talked about?
Hello.
Cool.
DAEMON Tools is being used.
How'd they break into DAEMON Tools and infect the installer?
Do we know? "Hackers trojanized the installer of DAEMON Tools." "Supply chain attack led to thousands of infections." Then we just start talking about the victims and then the malware.
So DAEMON Tools has just not said anything about this.
Excuse me.
Was this on GitHub?
Was this a developer and maintainer of, uh, DAEMON Tools? I don't know.
All right, let's um let's see.
Again, this is just talking about the malware. Malware. Malware.
H I don't know. Maybe I'm dumb. I'm just not seeing it.
Not seeing it.
Oh no.
Oh no. Guys, look what button I'm about to hit.
Infosec red alert because of Mythos.
ALL RIGHT. India's securities regulator urges market players to develop new strategies and nail cyber basics before the AI model fuels mass attacks.
JFC JFC JFC even if it's not overhyped like This is so dumb.
No, I was tweeting I was tweeting that.
Sorry. I thought I was sharing the tweet screen and I wasn't.
This is dumb.
I just tweeted exactly what I said out loud. I was like, "'Do cyber basics' is advice that should be followed. It has nothing to do with Mythos."
The advisory offers some basic infosec advice. Ensure patches are up to date.
Conduct audits of potential vulnerabilities. Conduct inventories of APIs and secure them. Run a serious SOC and take its advice. Harden systems.
What does any of this have to do with Mythos?
Oh, speaking of, let's find that.
"If it's not overhyped, then apparently the basics won't save you." I mean, it's just like, hey, if you're not doing any of that [ __ ] what are we, you know, what are we doing? What are we doing?
Um, speaking of speaking of I want to see if I could figure out a diff.
Do we have a diff?
No. Don't make me make an account to download this.
No.
No, don't make me make an account.
Uh, AdSense. "What does this have to do with Mythos?" Yeah, SEO maybe. I mean, no, I just think the normies are scared, and it's like, hey, by the way, you should have been scared this whole time.
Hey, by the way, uh, securities commission of India, if you're not patching your vulnerabilities and you aren't doing any cyber basics whatsoever, guess what? The AI superhacker isn't the thing that you need to worry about.
Come to learn that nobody is doing that [ __ ] on some level business. Yeah. Yeah.
Yeah. Yeah. Yeah. Yeah. Yeah.
Dude, I talked to a, I guess in the scheme of things they're not that big, but they're kind of a major company here in town, and, uh, no IT department, just handing people laptops. I don't even think any two people at the company have the same exact laptop. It's like some guy just Googles and buys a laptop the day that they hire someone. No device management, no monitoring. Like, my buddy got hired and needed Windows for certain software that he had. He had to log in with his own personal OneDrive account. You know what I mean? And we're talking a pretty darn successful company here in town, with a national presence for sure.
I was like, "Well, when you guys get scared enough, call me."
Oh god.
What does this say?
It does what?
Microsoft Edge. Well, there's your first problem. It keeps every saved password in process memory as clear text from the moment it launches.
Microsoft said this is by design.
All of them, including credentials for sites you won't open this session. This researcher tested every major Chromium browser. Edge is the only one that behaves this way.
Chrome decrypts creds on demand.
In Chrome, plaintext surfaces only during autofill.
What makes this extra: Edge still demands reauthentication before revealing those passwords in its password manager UI,
while the same browser process already holds every one of them in plain text. In shared environments, this turns into a credential harvest.
On a terminal server, an attacker with admin rights can read the memory of every logged-on user process.
In the published PoC video, a compromised admin account lifts stored credentials from two other logged-on users with Edge running.
I mean, this reads as bad, but in what environment do we have a terminal server that has multiple users using Edge and storing their passwords on it together?
And then if the attack requires admin, like aren't you aren't you effed anyway?
Aren't like Is no one else calling this out? Is no one else saying like I kind of don't care? No, it's just a bunch of [ __ ] AI responses.
I will stop using Edge right away. I mean, no.
I mean, like, it's obviously not a good idea.
I don't know.
I don't know. Like that doesn't sound interesting, but like I mean it doesn't sound good obviously, right? But like, do we give a [ __ ] I don't know if we give a [ __ ] All right, I think we're out of headlines I give a [ __ ] about.
Do we want to do a quick bit of react stream?
I think there was some interesting man. This is so [ __ ] depressing.
Why is my sword broken?
Oh, so [ __ ] depressing.
So, I did I did my GitHub YouTube video a few days ago and like two bigger creators just put theirs out within the last three hours on the same topic.
Virtually the same titles and thumbnail concepts and they're crushing it. And mine like absolutely [ __ ] flopped.
Like absolutely [ __ ] flopped. And um that's super super super frustrating that like I kind of just always assumed that like my title and thumbnail concepts were a major weak point of mine and it just seems like that's not actually the case. Um all right. What did I want to There was a few videos that I was starting to watch.
Marcus. Oh, yeah. Yeah. Let's do this.
>> So, Marcus, who's like, uh, Roman, I don't know, man. I'm just bitching. Like, I'm super [ __ ] burnt out on my YouTube performance, to be honest. And like it's just super f like I understand if my concepts are weak and like my performance is weak and like whatever, but if like man if like the same concepts and hooks and everything are just literally flatlining compared to other channels in my niche, it's that's just [ __ ] frustrating.
All right.
Um, Marcus is like not he's kind of an AI hater, so I always like listening to AI haters who are touching AI. Um, and he's also a fantastic malware researcher. So, I think we could learn a lot from whatever he's doing uh to build kind of an AI zero day finding harness thing.
I decided to make an AI zero day pipeline, a bot that you feed code into at one end and it spits out zero days at the other. Simple. Except nothing about that was simple. About a month ago, I received an email from Anthropic that told me I'd been accepted into their CVP program, their cyber verification program. This is a program that gives verified security practitioners access to current-generation models, but without certain guardrails. So, in my case, my version has no guardrail against writing malware and no guardrail against finding vulnerabilities or writing exploits for them. So, of course, my first thought was, well, I'm going to try and build a new machine. I'm basically going to test Claude's ability to find and exploit zero days in software entirely autonomously.
I basically saw people freaking out about Claude Mythos, and I wanted to learn more about the risks. I felt like a lot of the theories were overblown: the arguments that this is going to give script kiddies the ability to just churn out zero days like there's no tomorrow, the sky's falling, cyber security is over. I felt like a lot of that was missing the nuance. And I thought, well, what better way to illustrate my points than to actually make my own version?
Now, obviously I don't have the resources to build a machine that can just exploit any vulnerability in any software. So I decided to pick a specific class of vulnerability in a specific software and just go from there. I was very deliberate in what class I wanted to pick, and I had two main reasons. The class that I picked was privilege escalation, and the operating system that I picked was Windows 11. I specifically wanted to get privilege escalation to kernel mode via third-party vendor drivers. This is a technique known as BYOVD. Basically, on Windows you have your standard privilege levels, standard user and administrator, but above that you have kernel access.
Now, it used to be that you could just load drivers into the kernel and do whatever you want. Uh, but Microsoft realized that that was a problem because the EDR and all the security products, they live in the kernel. So, if I can get into the kernel, I can simply just disable your organization's multi-million dollar security product, which is not very great. So, then in Windows 10, Microsoft came out with a new feature, which is that you have to have your driver signed by Microsoft. So you sign the driver, you then have to submit it to Microsoft.
Then they will audit it and decide whether it actually gets approved for use or not, which makes things a little bit harder for criminals because if I say go and steal some
>> Yeah, I did provide some like resume stuff about like my uh just about stuff that I've done that proved that I was like a cyber security guy. But I I haven't heard of anyone getting denied, right? If it's just like, oh, I'm trying to do bug bounty research. Like they were like, oh, hack like link your Bugcrowd or HackerOne profile. Like I just threw everything at it because I didn't like I did it like when it first launched so I had no idea how critical they were going to be, but I think people are getting pretty pretty easily approved um from what I hear.
>> company's uh digital signature and I sign a malicious driver, I still have to submit it to Microsoft and at least in theory they should see that my driver is malicious and just deny it, which has been a massive problem for threat actors because EDR ransomware detection has gotten so good that in a lot of cases if you run ransomware on an endpoint it'll just stop it, because having to open and close millions of files and encrypt them is very very noisy and it's very easy to detect. So ransomware actors have been struggling against EDR ransomware protection for quite a while. So then the easiest thing to do is if you can get into the kernel you can just disable the EDR. You don't have to worry about the ransomware protection. And this is actually what had started happening.
Essentially what threat actors have found is you don't need to sign your own malicious driver. You don't need to steal a digital signature or get your own through a front company. With administrative privileges on a Windows system, you can load drivers, but they still have to be validly signed by Microsoft. You can take someone's legitimate signed driver, look for a vulnerability in it, and then exploit that vulnerability to get kernel access.
And that's where the name bring your own vulnerable driver comes from. And this is an attack that's been going on for a while, and you see it a lot with ransomware actors. And I would argue it is probably the most common zero day exploit that you see cyber criminals using. So that's why I felt like this research was relevant. Now, exploiting kernel drivers is not easy. It is binary exploitation, which is widely considered to be one of the most difficult classes of software exploitation. So I don't want to make things too easy for the LLM. Like if I hand it uh some cross-site scripting challenges or some SQL injections, it's going to breeze through those easily. I need something that it's actually going to struggle with. So, I picked binary exploitation. But the problem is if I'm looking for like remote code executions in like the most hardened OSes on the planet, that's going to cost a lot of AI tokens and a lot of my time. And this research is probably going to take years. So, I picked what I call the bottom of the top, the easiest of the most difficult class of software vulnerabilities to exploit, vulnerable vendor kernel drivers. It's still binary exploitation. It's still uh relatively difficult, but these third party vendor drivers are terrible. the code.
>> I think it's a super interesting concept, right? So like um he's trying to prove the capabilities um but not in like the hardened operating system published code, right?
So it's like oh the the easiest of the hardest was like a really interesting concept. I had watched about this much of the video before I decided I was like oh let's watch this on stream.
>> Code is awful. It's full of security holes. They do stuff that
>> should be saying like the third party vendor drivers that are put out are just much much worse uh in terms of code quality than like the big operating system vendors' ones. So I think this I I'm following his hypothesis and I agree with it
>> just not be allowed in any driver ever and it makes them fairly easy to exploit. So it sort of sits in that balance where it's still difficult enough where the LLM is going to struggle a bit, but it's easy enough that I'm not going to have to spend like hours and hours of time and millions of dollars in AI tokens and building out this infrastructure. So my goal was to basically build a pipeline where I feed the kernel drivers in at one end and then I get zero day privilege escalation exploits out at the other. So I decided to follow Anthropic's lead with coming up with Greek names. And so I decided to name my bot Althos, which is Greek for complete idiot. Now when it comes to working with large language models, I follow a variant of Murphy's law, which I made up myself. Murphy's law basically states anything that can go wrong will go wrong. Now, my law basically states that anything an LLM can mess up, it will mess up. So, if a task does not need to be done by an LLM, it shouldn't be. So, the goal was to make as much of the pipeline that didn't
>> or like that anything an LLM can mess up, it will mess up. I think we're going to use that going forward.
>> Need to use LLMs just use Python scripts. So, to start out, it was pretty easy. We need to find drivers. So, I built a script that scrapes the internet for drivers. Now, we need to make sure they are validly signed and will run on Windows 11. So, the script then runs them through a process that first checks if it has a valid signature. Great idea.
Then check if that signature uh belongs to the Microsoft WHQL program, which is the program that decides whether this is allowed to run on newer systems or not.
And then we check it against the vulnerable driver block list cuz we don't want to be finding uh vulnerabilities in already known vulnerable drivers. So the first stage of the script just sort of weeds out anything that's not going to be useful to us. And then the final stage of driver selection is I have a VM which the AI agent logs into and will load the driver on and make sure that it actually loads because there are uh other systems that it can fail. Um it might just crash the system. Um but also uh Windows Defender application control works a bit like an antivirus. So it might block certain drivers based on criteria we just don't know, like they have some machine learning model on the back end that decides is this driver good or bad and we don't necessarily know what that is. So the only way to check is to just load the driver and see if it loads successfully. So once we've done that we get a much smaller list of drivers. I started with something like 100 drivers and then once we've run through all of those different processes we were down to about 84. Now the next process is reverse engineering. Um when it comes to LLMs they don't do a very good job of reverse engineering. They are much better at finding vulnerabilities in source code. Except I don't have the source code for any of these drivers and I'm not going to be able to get it. So the goal was to basically reverse engineer the drivers back into source code so that I could feed the source code into the LLM. This was great because I happen
>> I mean wouldn't another way to do this be to just feed the binary to the LLMs because like that seems to be a thing that what did we just see?
What example did we just see where a zero day was found just with the binary which like was odd right for um because like AIS are good at code review and this wasn't even code review it was binary analysis. What the [ __ ] Van? Did we just Man, we were just talking about it. Um, and I was like super super copy fail. No, I'll think of it.
>> Happen to know a professional reverse engineer. Um, except I would be cheating if I reverse engineer the drivers myself because the whole point is making an automated pipeline.
>> Oh, just keep watching the [ __ ] video, Matt. Here he goes saying maybe I won't reverse engineer myself.
>> So I basically tried to put as much of my skill and knowledge into a simple script as I could and then basically built an IDA pro script to sort of automatically do some reverse engineering. It would do things like give types to script.
>> It was IDA. What the hell? What research just used IDA MCP and Claude?
What just what just use that?
Come on. Why?
What [ __ ] vulnerability that we've just talked about used?
Oh my gosh. What just came out where the researchers used IDA Pro MCP?
Um, come on. I gotta remember it. There was Wiz. Yeah, Wiz's. Oh, the GitHub one.
Jesus Christ, guys. The GitHub Enterprise thing was binary research. I was like, this is why I knew I needed to figure it out before we moved on.
Wiz researchers loaded GitHub Enterprise just like black box binary, uh did not reverse engineer it themselves, used IDA Pro MCP and found remote code execution and then translated that to github.com, which of course we don't know the code of. You know what I mean? Like github.com is not open source, but GitHub Enterprise, that same feature existed. Crazy. So yeah, well, Marcus's whole thing is LLMs will mess everything up if they get a chance.
So he's doing some of it himself, right?
>> Names name functions based on any obviously detectable behavior. One of these sort of big issues I started running into with the LLMs is a lot of drivers use something known as KMDF or kernel mode driver framework. And this sort of operates through a bunch of pointers. Like rather than calling functions directly, it'll call through a vtable and then the the vtable will direct the call somewhere else. Now the LLM was really really struggling to figure out like what are any of these calls doing? Like where are they going?
they just it just sees like a call going to an address and then the address points somewhere but we don't know until load time where that address actually points. So I built a simple script that actually parses the KMDF vtable and annotates all of those functions so that the LLM can know where the calls are going. So I built this script so that the driver gets opened in IDA automatically.
The script applies all of these descriptive names and types and it annotates the KMDF vtable and then it goes on to the next stage, which is using the IDA decompiler, which basically allows us to decompile the uh assembly code to not technically C but something that looks like C. It's called pseudocode, but it is close enough to C that I then just built a simple script to turn it into C. Now, it's not going to compile. It's nowhere near valid, but it's descriptive enough and clean enough that the LLM is going to be able to work with it. So, all of the drivers go through this process.
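As a rough illustration of the annotation step described above, here is a minimal sketch in plain Python that rewrites indirect calls in decompiler-style text into named calls. It is not Marcus's IDA script. The table indices in `KMDF_TABLE`, the `WdfFunctions[N](` textual form, and the function name `annotate_kmdf_calls` are all assumptions made up for this example; real indices depend on the framework version, and real decompiler output is messier.

```python
import re

# Hypothetical index -> name mapping. Real values come from the WDF headers
# for the framework version a driver was built against; these are placeholders.
KMDF_TABLE = {
    10: "WdfDriverCreate",
    20: "WdfDeviceCreate",
    30: "WdfIoQueueCreate",
}

# Simplified, assumed textual form of an indirect call through the table,
# e.g. "WdfFunctions[20](". Actual decompiler output will differ.
CALL_RE = re.compile(r"\bWdfFunctions\[(\d+)\]\(")


def annotate_kmdf_calls(pseudocode, table=KMDF_TABLE):
    """Replace table-indexed calls with readable names, keeping the index as a comment."""

    def repl(match):
        idx = int(match.group(1))
        name = table.get(idx)
        if name is None:
            # Unknown slot: leave the call untouched rather than guess.
            return match.group(0)
        return f"{name} /* WdfFunctions[{idx}] */("

    return CALL_RE.sub(repl, pseudocode)


if __name__ == "__main__":
    print(annotate_kmdf_calls("status = WdfFunctions[20](DriverGlobals, &init);"))
```

The point of doing this statically is the one made in the video: it is a deterministic lookup, so there is no reason to spend tokens on it or give a model the chance to guess wrong.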
>> Super interesting. And like literally up to this point, no AI, right? So, he's like scripting the IDA, scripting the IDA pseudo stuff into C, all this kind of stuff, right?
>> It's entirely automated. It runs IDA in headless mode. uh disassembles, annotates, decompiles, and then it spits out a C file. So, my 84 uh driver .sys files become 84 C files, and now they're ready to go into the large language model. And we've avoided using LLMs up until this point because there's so much that they can mess up. As I mentioned a minute ago, they do really seem to struggle with reversing the kernel mode driver framework. Um, which is really easy to do statically, like it's a very simple Python script. So, I did that for them.
Now, the next step was to build a prompt to sort of find the vulnerabilities. So my prompt basically explained what a BYOVD attack is, what kind of vulnerabilities tend to be used in BYOVD attacks, uh certain drivers that wouldn't be useful, certain drivers that would be useful, and then I told it to give every single driver a score. And the prompt does a lot of housekeeping because when I would run it, the LLMs would run into certain failure cases. So then I put that failure case into the prompt and explain uh like do it this way or avoid this or make sure to check this. And so I just sort of iterated and I kept building this prompt up to get better and better and better until I had something that I was pretty happy with.
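The iterate-on-failure-cases loop described above can be sketched as a small prompt builder. This is only an illustration of the idea, assuming a plain-text prompt; the class name `PromptBuilder`, its fields, and the example note are hypothetical and are not taken from the video.

```python
from dataclasses import dataclass, field


@dataclass
class PromptBuilder:
    """Accumulates guidance learned from earlier failed runs (hypothetical helper)."""

    background: str
    failure_notes: list = field(default_factory=list)

    def add_failure_note(self, note):
        # Record each lesson once; ignore blank notes.
        note = note.strip()
        if note and note not in self.failure_notes:
            self.failure_notes.append(note)

    def render(self, driver_source):
        # Background first, then the accumulated pitfalls, then the code under review.
        parts = [self.background.strip()]
        if self.failure_notes:
            parts.append("Known pitfalls to avoid:")
            parts.extend(f"- {note}" for note in self.failure_notes)
        parts.append("Driver source:")
        parts.append(driver_source)
        return "\n".join(parts)


if __name__ == "__main__":
    builder = PromptBuilder(background="You are reviewing a Windows kernel driver.")
    # Example note only; the real failure cases are not listed in the video.
    builder.add_failure_note("Check that the IOCTL is reachable from user mode.")
    print(builder.render("/* decompiled C goes here */"))
```

Keeping the notes as data rather than editing one long string makes it easier to see which lesson was added after which failed run.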
Now, interestingly, my first version of the prompt, which was a test prompt that literally just said, "Find vulnerabilities in this driver," was actually successful in finding valid vulnerabilities. It just had like a very high false positive rate. So, I spent a bit of time iterating. And then, when I finally was happy with my prompt, I added a bit on the end that basically just says, "Generate two scores. An exploitability score on a scale of 0 to 100 on how exploitable or how vulnerable you think this driver is, and then a BYOVD usability score, which is how usable you think this driver is for a BYOVD attack."
>> So I like this part of everything that we're seeing around vulnerability research and and uh using AI and this stuff is is actually I think the more impressive and uh more interesting bit of anything like when I see a lot of these automations is is we've had vulnerability scanners forever, right? But like um false positives have just been a thing, right?
And it's taken humans to like wade through this stuff. Um, and so any sort of like AI powered validation steps towards the end of the research is actually way more impressive to me.
>> And then it just sort of spits out all the reports into a directory and then it creates a JSON file scoring each driver on a scale of 0 to 100, which means I can then go and I can look and I can pick like what are the highest scoring drivers. Now this is the part that actually surprised me. The LLM was finding valid vulnerabilities that I myself confirmed by hand. And the reason why this surprised me is simply I just have a very low opinion of LLMs. Like for me the bar is in the floor. Um but it also didn't surprise me in that I realized that LLMs are essentially glorified pattern matching machines and all vulnerability research really is is pattern matching. Whenever I go and look for vulnerabilities manually, whenever I'm like looking for zero day exploits, I would go and I would reverse like existing patches, like patches for known vulnerabilities. I would look at what sort of situation led to that vulnerability arising. To give a very very simple example, think about how a stack overflow works. A stack overflow occurs when too much memory is uh copied to a finite sized buffer, like a finite sized region of memory. So you simply just need to look for the pattern of we have a finite region of memory and a function that copies a non-finite amount of data to that region of memory. Now stack overflows aren't something you find a lot in modern software. They're super rare and even when you do find them, all of the mitigations that have been built against them usually make them non-exploitable. But this is just kind of an example, like once you have that pattern of fixed-size memory, non-fixed-size copy operation, we can then go and look for that in any code.
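To make the "fixed-size memory, non-fixed-size copy" pattern concrete, here is a deliberately naive sketch that greps C text for it. It is a toy heuristic for illustration, assuming simple `char name[N]` declarations and a short list of unbounded copy functions; the function name `find_suspect_copies` is made up here, and a real reviewer (human or model) reasons about far more context than two regexes can.

```python
import re

# A fixed-size local buffer such as "char name[64];".
BUF_DECL = re.compile(r"\bchar\s+(\w+)\s*\[\s*(\d+)\s*\]")
# Copy routines that take no length limit; the first argument is the destination.
UNBOUNDED_COPY = re.compile(r"\b(strcpy|strcat|sprintf|gets)\s*\(\s*(\w+)")


def find_suspect_copies(c_source):
    """Return (line_number, function, buffer, size) for unbounded copies into fixed buffers."""
    buffers = {m.group(1): int(m.group(2)) for m in BUF_DECL.finditer(c_source)}
    hits = []
    for lineno, line in enumerate(c_source.splitlines(), start=1):
        for m in UNBOUNDED_COPY.finditer(line):
            func, dest = m.group(1), m.group(2)
            if dest in buffers:
                hits.append((lineno, func, dest, buffers[dest]))
    return hits


if __name__ == "__main__":
    sample = "\n".join([
        "void f(const char *in) {",
        "    char name[64];",
        "    strcpy(name, in);",
        "    strncpy(name, in, sizeof(name) - 1);",
        "}",
    ])
    for hit in find_suspect_copies(sample):
        print(hit)
```

A matcher like this has exactly the false positive problem discussed in the stream: it flags the shape of a bug without knowing whether the input is attacker-controlled or whether a mitigation makes it unexploitable.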
So my personal favorite was integer overflow. So I basically made up this mental pattern of like what does a standard integer overflow look like? And then every codebase I would look at, I would look for that specific pattern.
So, it's not really a surprise that LLMs are very good at this because it is really just pattern matching. Most of the work was actually refining the prompt to reduce false positives, not increasing the true positive rate, because it was already doing a pretty good job at finding uh the stuff that I would want to find. So, I run my script that feeds the C file in along with the prompt into the uh API and then it gets back the response and saves it to a text file. So, we get 84 vulnerability reports for 84 different drivers. Now, this was notably expensive. I ran the bot and I just noticed my balance just dropping and dropping and dropping.
Like, every time I refresh the page, it was going down by $5. Now, I knew not to put too much money into my account because I had foreseen the chance that it's going to just go wild and it's going to drain my entire bank balance and I don't want a 30k AI bill. So, I think I ended up putting about $200 into my account and then every time that would drop below say $50, I'd put in another $100. And as the AI was sort of like failing and succeeding and failing and succeeding, I honestly felt like I was using a slot machine. I was putting $100 into the slot machine and praying that this time I get a usable zero day vulnerability. Now, this is the part where math can be used like really disingenuously to instill fear or I can just be honest with you. Now, the disingenuous framing is that it only cost about a dollar to $2 per driver to find vulnerabilities. My first production prompt cost about a dollar per driver. And then, as I sort of iterated to get the best possible prompt I could, that cost about $2 per driver.
So, I scanned about 400 drivers and I found one, which was the perfect candidate for a privilege escalation uh to kernel zeroday attack. Now, because it cost me $2 per driver, I can tell you I just found a kernel privilege escalation zero day for $2. except I didn't because I had to go through those 400 other drivers in order to find this one sort of diamond in the rough. So what it actually cost me is $800 just to find this one driver and we are not at exploit yet. We are not at the exploit stage. This is just to generate vulnerability reports. We still need to triage these and then pick which driver we want to exploit and then exploit that driver. So the driver that I picked is commonly referred.
>> Why why wouldn't we be doing this on like a max plan? Like why are we why are we even worrying about API costs? I don't I don't know
>> to as a god mode driver. It basically exposes an IOCTL interface, an IO control interface, which allows a user mode process to talk to the driver to pass data to it or receive data from it. Now what kind of process can actually talk to the driver is uh 100% up to the driver. It has to set certain security controls and certain uh privilege checks. But the driver that I found doesn't set any at all. Which means that not only do we not need to be a system process. We don't even need to be admin.
Theoretically any privilege level process that can actually open a handle to the driver will be able to exploit it. So this might actually work uh from a guest account or even from within a sandbox. Now in terms of what the driver actually does, it has a couple of different IO control codes. One allows us to read memory and the other allows us to write memory. Now the way that it does this is via a function known as MmMapIoSpace, which basically maps a physical address. So the system's actual underlying RAM. So where does that address come from? Well, the user mode process just passes it in. There's no sanity checks on it whatsoever. So essentially what this means is the user mode process can pass any address in the system's RAM and this driver will map it into memory and then allow us to read from it or write to it. And because we're not dealing with virtual addressing, we don't need to worry, is this address mapped? Is it readable? We can just scan through the entire RAM until we find whatever we need. So that completely breaks ASLR. And then we can do the same with writing, we can write to any address we want. So once we find something that we want to write to, say an area of executable memory, we can just write shell code straight into that address. So we don't need an execute primitive at all because the write is the execute primitive. We can just write some shell code into any executable region of memory and then overwrite a pointer that we know gets called. The kernel is calling all kinds of addresses all the time and all we need to do is overwrite one with the address of our shell code. So, this is one of the most powerful exploits that you can get when it comes to kernel drivers because it essentially allows us to read and write the entire system's RAM. And because the RAM isn't just for system memory, it's also for talking to hardware devices. We can do a lot more than just mess with the operating system.
We could theoretically overwrite firmware in things like the hard disk, the graphics card, and even the system's BIOS or UEFI.
So, this is an extremely powerful vulnerability, and it's actually kind of surprising that Microsoft still even allows these to be signed. Now, this is the part of the video where I admit I actually skipped a step. I was so excited to see that one perfectly vulnerable driver that I ran with it and then I started building the uh the exploit part of the process, the uh the LLM prompt to turn this vulnerability report into an exploit. Now, I actually ran into a really big problem here, which is I couldn't get it to exploit the driver and it was costing so many tokens for each attempt that I was down like $200 with nothing to show. So, I ended up moving to the UI and on the UI pay $20 a month and you get like >> essentially infinite tokens. It's not infinite, but it's close enough. So, I moved to the UI where I wouldn't have to pay per token and it took about four.
This is what I was just talking about.
Like, but why? But it didn't need to be UI. It's not UI versus API. Like, you could use the $200 a month plans, not via the UI. Anyway,
>> hours of back and forth until I could figure out everything that Claude was doing wrong, like every step of the way where it was going wrong. And then I built that into my prompt and I kept like iterating on the prompt saying, you've got to take this into account and you got to take that into account and so on. Um, but I skipped an entire step there, which is this is an autonomous pipeline that takes in code and spits out zero day vulnerabilities. But I picked the vulnerability. I went through the reports and I picked the best uh vulnerabilities to use. And this is where I think actually a lot of skilled vibe coders go wrong. They end up attributing a lot of what they're doing to the capabilities of the LLM. So essentially I just went and I used my like decades of vulnerability research knowledge to pick out the perfect exploit candidate. Um and I was like wow I got an exploit. But then when I went back while doing this video I was like hang on I did that. Like the LLM didn't do that. So I need to go back and
>> wild. So he didn't even realize till he was doing this video that skilled vulnerability researcher with AI pipeline can find vulnerabilities, more at, you know, news at 11. It's not just Mythos,
>> right? Super hacker just like goes and does [ __ ] by itself. Like we haven't seen that like at all, right? That one Mythos blog says like, "Oh, non-skilled Anthropic employees found zero days overnight." Like RCE zero days overnight, right? But like yet to be seen, right? Soup to nuts. So, I mean, still super fascinating, but it's really cool for him to A, acknowledge that, B, uh not realize it till he was making this like summary video, which is super interesting.
>> I need to get the LLM to select the vulnerability. Uh the LLM only allows me to upload 20 files per request. And uh we have 400 drivers, so I can only upload 20 of them at a time. So, I was like, okay, that's easy. We just do like semi-finals and finals. We take a batch of 20 drivers, another batch of 20, another batch of 20. And then we tell the LLM, "Pick the best of these 20."
And then we take the best of each batch, and then we put them into the final, and then we say, "Pick the best of this 20."
And then we finally get out our exploit.
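The heats-then-final selection described above can be sketched as a small tournament loop. This is an illustration under assumptions: `pick_best` stands in for the LLM call that chooses one winner from a batch, and the names `chunk` and `tournament` are invented for this example. The batch size of 20 mirrors the upload limit mentioned in the video.

```python
def chunk(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def tournament(candidates, pick_best, batch_size=20):
    """Run heats of `batch_size` until one winner remains.

    `pick_best` is a placeholder for whatever judges a batch (an LLM call
    in the video); it must return exactly one element of the batch.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2 so each round shrinks")
    current = list(candidates)
    while len(current) > 1:
        current = [pick_best(batch) for batch in chunk(current, batch_size)]
    return current[0]


if __name__ == "__main__":
    # Stand-in judge: pick the highest numeric score in the batch.
    drivers = [{"name": f"drv{i}", "score": i % 97} for i in range(400)]
    winner = tournament(drivers, lambda batch: max(batch, key=lambda d: d["score"]))
    print(winner["name"], winner["score"])
```

With 400 candidates and batches of 20 this makes 20 heat calls plus one final. The structure is only as good as the judge: a strong candidate that loses its heat never reaches the final, which is the failure that comes up next in the video.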
Now, my driver didn't even win its heat.
Like, it was it didn't even go through to the next round. And I was like, so I started trying to like figure out the LLM's logic, like where it was going wrong. So, I looked at what drivers it did pick, and it was quite clear what it was doing wrong. Essentially, you have like three different classes of exploit primitives. You have read, write, and execute. Now, execute seems like that might be the better of them because that allows you to execute code. But let's think of the simp.
>> Sorry, if I keep scrolling down, I'm like reading his chapters which are very helpful to understand what he's talking about in that moment.
This version of each of these primitives. I can put in an address and the kernel will read that address for me. That's the most basic uh read primitive. The same for write. I put in an address and the kernel will write that address for me. The most basic execute primitive is I put in an address and the kernel calls that address. But how does the code we want to get executed get into the kernel? Like just calling stuff doesn't give us code execution. We need to get our shell code into the kernel. So without a write primitive, an execute primitive is useless. So essentially what the LLM had inferred is we want code execution, right? So the execute primitive is going to be the most valuable, which does intuitively make sense if you don't have vulnerability research experience. The way in which the kernel works is it's executing all kinds of code all the time. It's calling function pointers.
It's running driver code. If we overwrite some code with our own that's going to get executed. So a write primitive will allow us to execute code.
We don't need an execute primitive.
Whereas an execute primitive on its own doesn't allow us to do anything because we have to still somehow get our code into the kernel. But then with our write primitive, we can write our code into the kernel and have it executed. Except most modern systems have something known as ASLR, address space layout randomization.
>> Yeah, I was just thinking the same thing, Ed. Right.
>> Basically, every time you start your operating system up, the system just reorders everything in memory. Like it just puts everything at random addresses. So if we wanted to say overwrite a driver or overwrite a pointer, that would be great and all except we have no idea where in memory what we want to overwrite is located because of the address space randomization. So that's where a read primitive comes in. we need some way to leak uh addresses from the kernel so that we can figure out the memory layout so that we can figure out where we need to write our uh shell code or write our like whatever exploit primitive we're using. Execute primitives are pretty much irrelevant. Sometimes they can be helpful, but a good read and write is way more powerful than any execute primitive. But the AI doesn't know this, right? And this is what I think people mean when they say in order to have AGI these uh machines need to have a world model because it doesn't know uh like any of this stuff. It's just sort of regurgitating text from its training data. 
So unless I prompt it and say look, we want read and write primitives and not execute primitives, it's just going to naturally infer oh, executing code, execute primitive. And it doesn't have a world model. It doesn't know how a system works. It can explain ASLR. If we ask it what is ASLR, it will perfectly explain it because it has that data in its training set. If we ask it what is a read primitive, what is a write primitive, it can perfectly explain that. But because it doesn't have a world model and it can't sort of put all these pieces together and understand how it's interacting with the system, it can't come to the conclusion we don't actually want an execute primitive at all, we want a good read and write. So I basically had to explain what I've just explained to you to the large language model. I had to build into my prompt um a big framework that basically said look uh kernel address space layout randomization does this, this mitigation does this, that mitigation does that, you're looking to do this, here's why execute primitives aren't useful, uh here's what kind of read and write primitives are. And I was eventually able to get the LLM to do the same thing I did and pick the correct most
>> okay I I'm I have very similar experiences to Marcus uh in in my research too. It it is a bit of a headbanging to get from like this is a vuln to like this is a serious vuln, this is an exploitable vuln. Oh, and by the way, here's the exploit. Like I mean it, you know, it's super super hard to get there. Uh Claude specifically uh is pretty quick to call things vulns and like critical, right? And then like, hey, I know what I'm looking at.
Like let's tone this. You know what about this? This, that. Oh, you're absolutely right, you know, this is something, this is something, this is something
>> vulnerable driver uh as the exploit candidate, but this was a lot of my own work. This wasn't the LLM doing it. It was me sort of like taking my thought process and putting it into text for the LLM. And this is sort of where I push back on this idea that LLMs are just going to keep getting better and better and at some point script kiddies will just be able to type in hack Windows and it gives them a zero day, because what actually made the difference here
>> yeah I completely I mean this is this is why we literally have the soundboard, right?
>> Like this is like the way that this is being portrayed is like super hacker machine that you point at things and it will hack them. Like no. No. And that's not even the direction I don't think like we're not even pointed in that direction, right?
>> It was not the large language model cuz while I was using Opus 4.6, notice I didn't say 4.7 which is out at the time of making this video. It just feels like they completely loed it somehow. So I'm using Opus 4.6 which is like a frontier model. I honestly think I could have >> Yeah, Ed, good point. I mean, so the you're right, the the the thing that they say in their graphs is like the jump between vulner identification and exploit development is what Mythos is good at, but then like when you actually read the posts, it was also like in a very specific harness, right? Um, so what's what's the secret sauce there?
because then people were also able to replicate. Ed, did you see that post that I um that we went through on I don't know if you were on stream. Um let's find it. Did you Did you see this post? I I put it on Twitter, too. This is like one of the more This is probably the best No, no, no. Last week. Uh this is probably the best blog post that I've read this year. Um from Niels uh a legend. I don't know if you know Niels but uh was at Google as a security researcher and his whole thing was trying to replicate a lot of these Mythos vulns that we do know about with commercially available models 46 sonnet 46 and then even open-weight models like GLM51.
Uh the legend here the legend part here which is just super super funny is like the OpenBSD bug that's like 27 years old. That was like the headline Mythos bug. He's like, "This was super meaningful to me as I was responsible for committing to OpenBSD TCP SACK implementation including the bug in November of 1998." He wrote the bug and then he was like, "Oh, I want to be able to replicate these Mythos findings with non-Mythos models." And then I mean the rest of this blog we went into was he did like he did and he has been able to like and it's it's really all about the harness, harness optimizations, workflow generation, all this kind of stuff. And so, you know, I don't know.
It's it's hard to it's hard to be like, well, where where do we draw the line on what Mythos is better at behind the door, you know, behind the scenes? I don't know. I don't know. I'm talking to Project Glasswing people. Apparently, it's still rife with false positives, but it is finding like true RCE and [ __ ] that's been around for a while and reviewed for a while. So,
>> done this with a pretty mediocre LLM because what's making the difference isn't the LLM. It's the framework that I'm building around it, the instructions that I'm giving it. I'm essentially taking my like decades of vulnerability research and reverse engineering knowledge and I'm sort of like distilling it into an LLM prompt.
And that's the part that your average script kiddie or your average non-vulnerability researcher doesn't have, I think.
>> Yeah. Doesn't have yet, right? Like, I feel like these harnesses are gonna get... like, the models are getting exponentially better. And if the harness is the magic, it's not like the harnesses are going to stay secret behind closed doors, right? XBOW raised another $35 million to extend their Series C today. XBOW's secret sauce is their agent orchestration and harnesses, right? To do AI pen testing. But I don't think that harnesses are going to be a defensible moat. You know, I don't know.
>> What a lot of people are thinking is that Claude Mythos is just an LLM. Like, it's an LLM that's so good it can just one-shot exploits, which I think is very, very unlikely. If we look at the trajectory of large language models, they have improved drastically on each iteration, but it's not the kind of improvement where they're gaining world models or gaining an understanding of the underlying data they produce.
They're still just sort of regurgitating text. So I think what really makes Claude Mythos Claude Mythos is they probably built a massive framework around it that does what I've done, but for every class of exploit, every class of operating system, they
>> Um, so this is kind of what I was saying, right? He's using frameworks, harnesses, whatever, right? Um, yeah, in the public research they talk about, especially with the Firefox vulns, the harness that they developed to feed the Firefox code to Mythos. I do think that's part of the truth, and I also do truly believe that the model got that much better at certain parts of this as well. Uh, so yeah, it's kind of an interesting "well, which one is more important?", right? As we see with this research, people can kind of muscle the commercial models into doing similar things, but I still think there is a giant capability jump in the Mythos-level models.
>> given it sort of all of the right harnessing and all of the right frameworks to work well within every single class of vulnerability. Um, and that's what I had to do here. Like, if I log into the Claude UI and be like, "Find me a zero-day privilege escalation in Windows," it's not going to know where to start. It's not going to find anything. Whereas, if I build this multi-step system, I can actually get it to do that. The LLM is not teaching me exploitation. I'm teaching it exploitation. So, that's kind of the big difference. I'm not trying to make myself sound like I'm some amazing top-tier vulnerability researcher. I'm just pointing out that a lot of skills actually go into working with these LLMs. I don't think that someone without significant vulnerability research knowledge could just pick up an LLM and build a system like that. So there is a barrier to entry, and anyone who is going to do this is going to already know how to find these vulnerabilities. Like, I did not just gain the ability to find
>> Yeah, but at scale, right?
This is what we're talking about. It's like, yeah, but could you have looked through the amount of lines of code that these models are going to be able to look through over the next couple months?
You know, it's like, yes, humans can find remote code execution, but they can't stare at millions of lines of code and reproducibly do the pattern recognition stuff all the time. And by the way, fuzzing takes [ __ ] forever. Like, we've had fuzzing forever. We've had automation in this. It's called fuzzing.
And it takes [ __ ] forever to muscle a fuzzer into getting the right reachability and crawling through all the different code paths for something like Chromium, right? Which, by the way, I ran out of hard drive space on my Mac Mini yesterday while I was trying to do something pretty simple, and I was like, what is taking up all this space? And it was my vulnerability research stuff that was building Chromium from source and [ __ ] fuzzing. Yes. While I was writing my own fuzzers and [ __ ] with Chromium, I literally chewed through my hard drive. It was like 300 gigs of Chromium and fuzzer [ __ ] that I didn't know was happening in my vulnerability research terminal. Um, and similarly to Marcus's hypothesis of why he started, the reason I was doing that kind of vulnerability research was because I am not a skilled finds-in-browsers guy. Like, I'm an AppSec kid. So I could find cross-site scripting all day. I could find SQL injection all day, right? But I can't, and haven't ever been able to, find a Chrome bug.
So I was like, "Oh, it'd be really interesting." It's almost the same concept as what Marcus is talking about here. It's like, "Yeah, I'm a skilled cybersecurity person, but I'm not a skilled browser vulnerability researcher. So, wouldn't it be interesting if I could muscle this thing into, you know, doing all this stuff for me?"
Um, I burnt a few weeks' worth of my Claude Max plan on this idea. And then Mythos came out and I was like, "What the [ __ ] am I even doing? Google is running Mythos, so there's no way I'm going to find [ __ ] faster than them.
They're in Project Glasswing with the better toy, right?" So, I was like, "All right, I'm burning weeks' worth of my Claude Max plan to try to find bugs that I'm going to submit, and they're going to be like, 'Cute. We found this with Mythos this week, so it's a dupe.'"
>> ...find zero days in drivers. I've been doing this manually for years. So, it was my knowledge of knowing how to do this already that enabled me to automate it with an LLM. And the same is going to be true, I think, for anything Claude Mythos, anything that gets built in the private sector. It's going to require expert knowledge to build out this system.
>> I mean, for now, right? Like, I don't think these harnesses or these frameworks are defensible, you know what I mean? It takes one person putting all the [ __ ] that Marcus just talked about on GitHub and that's it. Toast, right? Existing Opus 4.7 would be able to replicate these harnesses and frameworks if they were just out, and they're going to be out. Like, there's no way that people are going to keep this kind of [ __ ] under wraps.
>> So in order for hackers or criminals or script kiddies to get access to these systems, someone has to actually build this pipeline and hand it to them.
>> And can I assure you that no one's going to do that? Absolutely not.
>> But anyone who is going to do that is likely going to want to sell the exploits. So it's not like exploits are suddenly free or they're suddenly like $4 a piece, which is a really weird narrative I've seen a lot of people running around with. I don't know where it came from, but everyone has been like, "You can generate zero days for $4 now." Like you
>> $4 a piece.
Yeah, I don't know where the $4 thing is from, right? I think that vulns are going to get cheaper. Like, we saw what previously would have been a $40 million iOS exploit kit just sitting on a Chinese crypto site in a watering hole attack. That is not how you use something that you value at $40 million, you know. So, either they didn't know what they had or this is the start of people caring a whole lot less about the value of an exploit kit.
>> No, you can't. It's complete nonsense.
Um, but anyone who was to build a system like, say, the one that I've just built would probably want to sell the exploits at a markup. They're not going to sell them at whatever the tokens cost. And the tokens cost a lot. Like, I actually ended up falling for local LLM propaganda as a result of making this video. I think we're at like $800 in spending in just a week. And I was like, I can't keep spending this much money. Like I'm
>> Yeah. Yeah. Right. Like the local LLM stuff. So even in Niels's post, he talks about how he's able to do it with this open-weight GLM-5.1 model, but he's still running it on a hosted box because you still need some serious hardware to run these things. But even on something like OpenRouter, or whatever the hell those things are that you can run these open-weight models on, they still cost money. They just cost like an eighth of the API cost of the frontier models, right? Um, does he talk about the open-weight one?
But they're much less efficient. The open models just suck compared to, you know, modern frontier models. So I think he said that the open-weight one is obviously much cheaper per token, but it's less efficient by almost the same factor. He was landing right around the same cost as Sonnet with his open model.
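To make the cost point concrete, here is a rough back-of-the-envelope sketch. All prices and token counts below are made-up placeholders, not figures from Niels's post. The only thing taken from the discussion is the rough "an eighth of the API cost" ratio, plus the assumption that the open model burns about eight times the tokens to get the same job done.

```python
def cost_per_task(price_per_million_tokens: float, tokens_per_task: float) -> float:
    """Total spend for one task: unit price times tokens consumed."""
    return price_per_million_tokens * tokens_per_task / 1_000_000


# Hypothetical numbers, chosen only to illustrate the ratio.
frontier_price = 8.0              # dollars per million tokens (placeholder)
open_price = frontier_price / 8   # "an eighth" of the frontier API cost

frontier_tokens = 2_000_000           # tokens the frontier model needs (placeholder)
open_tokens = frontier_tokens * 8     # assumption: the open model needs ~8x the tokens

frontier_cost = cost_per_task(frontier_price, frontier_tokens)
open_cost = cost_per_task(open_price, open_tokens)

# A cheaper unit price does not help if token usage grows by the same factor.
print(frontier_cost, open_cost)
```

The takeaway is just that the per-token price is only half of the bill. What matters is price multiplied by how many tokens the model needs to actually finish the task.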
But I mean, how much longer is that true, right? How much longer until we can run something that operates at current Opus 4.6/4.7 capability locally? Is that next year? Is that too far-fetched to even assume, that next year we're going to have Opus 4.6 capability without needing to use the frontier models? I don't know.
I I don't think we're talking decades. I don't think we're talking never.
Hey, Night Cry. Sorry. Uh, I saw you chatting. Yeah, we talked about the PAO one earlier and good luck with the move.
>> Not making any money with this. Like, these vendors don't have bug bounty programs. There is no return on investment for anything I'm doing unless I start selling exploits to criminals, which I promise I'm not going to do. Um, so it's super, super expensive.
>> Yeah. Yeah. Yeah. This is another good point, right? This is why I was pointing my vulnerability research at stuff with big, well-established bug bounty programs, because I was like, "All right, I want there to be an ROI if I'm burning through my Claude Max plans on this.
What's the incentive?"
You know, this is what people keep talking about. Oh, you know, Mythos found this 27-year-old vulnerability.
It's like, well, how many people were looking at that? How many people were incentivized to look at that uh over the last 27 years? You know what I mean?
Like, it's not like OpenBSD was paying a million dollars for these RCEs.
>> ...expensive, and I'm like, wow, this costs a lot more than I expected, and if I was going to be selling exploits I would probably be selling them for close to the price that they already go for pre-AI. But let's think about the most extreme case, which is I just go and I post this onto GitHub. I just say, hey everyone, I built a nice user interface that you feed drivers into and out pop zero-day exploits, go have at it. How much damage would that cause to security? And I think it would be a lot like when fuzzers came about. Fuzzers were one of the first automated ways to find binary vulnerabilities. And the problem was that because most fuzzers worked in the same way, most fuzzers would find the same vulnerabilities. So this system is semi-nondeterministic, but for most intents and purposes, if you put the same driver into it, you will get the same output.
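A toy sketch of the "same driver in, same output out" point. The `deterministic_scan` function and the check names are hypothetical stand-ins, not anyone's real tooling, and the hash is only a way to fake a fixed, repeatable decision per check.

```python
import hashlib


def deterministic_scan(driver_bytes: bytes, checks: list[str]) -> set[str]:
    """Toy stand-in for an analysis tool with no randomness in it."""
    findings = set()
    for check in checks:
        digest = hashlib.sha256(check.encode() + driver_bytes).digest()
        # The "decision" depends only on the inputs, so it never changes between runs.
        if digest[0] % 2 == 0:
            findings.add(check)
    return findings


# Hypothetical check names and input, for illustration only.
checks = ["ioctl-bounds", "pool-overflow", "double-fetch", "null-deref"]
driver = b"example driver image"

alice = deterministic_scan(driver, checks)
bob = deterministic_scan(driver, checks)

# Two people running the same tool on the same driver get the same set,
# so pooling their results adds nothing new.
print(alice == bob)
print(len(alice | bob) == len(alice))
```

If everybody runs the same tool over the same targets, the total set of findings does not grow with the number of people running it, which is the same thing that happened when most fuzzers worked the same way.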
>> Yeah, I mean, this is my point with Project Glasswing, right? Everyone's on the "Mythos and the super hacker escaping the lab" [ __ ] It's like, no, it seems like Mythos is just really good at finding certain types of vulns in certain types of code. So it makes sense to do something like Project Glasswing where we just try to [ __ ] burn through all of that, right, as fast as possible. Run through all the browsers and operating systems, like old C/C++ code and [ __ ] like that. Uh, firmware on appliances that are everywhere. Yeah. It makes sense that we're just going to run everything that we can in that kind of shaped code, and it's very good at finding the same kind and shape of vulnerability in those bits of code, from what I hear.
Right. And then its usefulness tapers off into modern code framework land, right? So again, it's not the point-and-hack machine. It's the very-good-at-finding-certain-types-of-vulnerabilities-at-scale machine, in certain types of code which happen to run everywhere on the internet, right? So
>> and essentially what would happen is security researchers would use the system, they'd find vulnerabilities, and they would either make detections or they would report them and get them fixed. Uh, meanwhile the criminals would obviously try and exploit them for financial gain. Uh, but because there's only a finite amount of vulnerabilities this system can find, they would just get used up very quickly. People would just feed their drivers through it, patch them, and there would be a very sharp spike in the number of zero days around. But then that spike would immediately drop off, and then without me changing the system, that system is now done. It's not going to magically find more zero days. I don't think the sky is falling. I don't think Claude Mythos is going to end security. It's actually most likely going to lead to a significant improvement in security over time. Now more people have the resources to find and fix vulnerabilities than before.
Because as I said, in order to build the system, you need vulnerability research and exploit development knowledge.
Whereas if someone with that knowledge builds the system for you and they say, here, go and find the vulnerabilities in your code and fix them, that is providing a new capability to those people. So now you can have any software engineer able to find and fix vulnerabilities in their code. But of course that's going to depend on us actually providing the capability in a meaningful way. But that's where we get into a catch-22, because these models still cost a phenomenal amount of money to run. So all we're really doing is replacing vulnerability researchers with LLMs. So someone still has to pay for either a vulnerability researcher or an LLM. Maybe the LLM makes the work a little bit cheaper, but it doesn't make it zero. It doesn't mean that an open source project with no
>> Yeah, I agree.
>> no budget, who couldn't afford to hire a single vulnerability researcher, now magically has $10,000 worth of tokens to feed into an AI. So I think there needs to be a discussion about finding a way to socialize the cost so that we can all donate towards these indie developers and open source projects and small to medium businesses so that they can secure their code, because that's where the gap actually is. Sure, there are vulnerabilities in Microsoft and Apple and Google products, but they have the resources to find and fix them already. Sure, they're a little bit backlogged, but they do have the resources. Whereas all of these people over here, they have no resources to fix vulnerabilities at all. And they're completely reliant on people reporting bugs for free. Like, they can't afford bug bounty programs. They can't even afford to give out a free t-shirt.
Whereas if you look at a company like, say, Apple, they offer like $2 million for certain vulnerabilities in their platform. So that sends all of the vulnerability researchers and all of the LLM resources over that way when really they need to be over here, because sure, vulnerabilities in iOS do have a massive impact. But so do vulnerabilities in the open source software that we all rely on, and those are just as critical, but they don't have the resources to fix them. Now imagine if a Linux kernel remote code execution vulnerability got out. That's like 99% of the non-desktop workload. That's all of the hosting providers, the AI providers, the big data providers, cloud, like, everything runs on Linux. Only desktop systems tend to be Mac and Windows.
Everything else is Linux and Linux is not a gazillion dollar company with an infinite security budget. So that's where I think we really need to be prioritizing resources. But anyway, those are just some of my thoughts.
>> Yeah, I think Marcus is right in all of the technical stuff and everything that he talks about. I think he's just a bit too, like, anti. I think he sees so much of the AI hype and he has such a negative reaction to it that I think he kind of overcorrects into the, you know, "no, this ain't [ __ ]" land, right? Um, yeah, Colin, I was waiting to get to the end of the video. Claude is currently having their developer conference, and so they are dropping announcements. Like, my Twitter is blowing up right now. So they have Claude Managed Agents that just came out. What else? Um, God, I captured like three big things.
Um, they partnered with SpaceX to increase their compute capacity, and they said within the month, and so they're upping their usage limits. "We're doubling Claude Code's 5-hour limits for Pro, Max, and Team seats." [ __ ] enormous. Enormous. Doubling the five-hour limit.
I pretty regularly chew through my 5-hour limit. Uh, removing peak hours limit restrictions. So now it's not just the overnight stuff. Substantially raising our API rate limits for Opus, due to being able to partner with SpaceX's new data center. Gives us 300 megawatts of additional capacity to deploy within the month.
Wow. Wow. Wow. Wow. Wow. Wow. Wow. Huge.
Huge.
Huge. Are you on a pro plan, Roman?
These are API tiers. I don't use the API.
This is huge. Huge. Huge.
What else? Claude is on a tear right now.
What's this manage agent thing?
Multi-agent orchestration and webhooks.
Dreaming reviews your agents' past sessions, extracts patterns, and curates memories.
Outcomes let you set the bar for quality. You write a rubric. A separate grader checks the output. Oh, that's huge.
Uh, multi-agent orchestration lets a lead agent delegate to specialized workflows.
Huge, huge, huge, huge. Should probably use Sonnet more than Opus since you're on the $20 one.
Yeah. Oh, work won't give you nothing else.
You're actually in Greece or is that a joke? No, it's his real name.
It's his It's his real life name. Roman.
Roman meet Colin. Colin meet Roman. Um, what else I want to read?
Dreaming is available as limited research preview.
No.
Um I think also Rome in Greece is also a funny mistake there, but I think that's what he means.
Building self-improved agents with dreaming.
This is in preview for developers on API plans, so I can't even apply there. I'm on a Max plan. Deliver better outcomes with outcomes.
This is huge because this is part of the value that I get out of PAI and, um, Maestro: tell the agent what done looks like.
Doesn't Codex have like a goals thing? Codex /goal, that I've seen people use? Like, have you guys seen this? In Codex they have this /goal thing I've seen people start to use.
It's working on my project without interruptions because you define the goal like the end goal and then you tell it to go and it doesn't stop until until it gets there.
Yeah, unfortunately I can't use Codex as much as I would like because of OpenAI's [ __ ] plans and how they do everything.
I've got to start a brand new [ __ ] account in order to use the 20x Codex plan.
Man, they are launching [ __ ] right? What else have they just launched at this conference that we haven't talked about yet?
Yeah, of course I'm going to Defcon. So, this outcome thing is super interesting.
What else is in managed agents?
Agents do their best work when they know what good looks like. Structural framework, a presentation standard, or a set of requirements that need to be met. So yeah, this is super similar to Codex's goal thing. Handle complex tasks with multiple agents. This is very PAI- and Maestro-esque.
Multi-agent orchestration lets one agent coordinate with the others to complete complex work.
All agents share the same container and file system, but each agent runs in its own session thread.
Threads are persistent. Each agent uses its own config.
What to delegate? Multi-agent coordination is best suited for complex tasks.
Patterns that work well.
Parallelization. Fan out independent subtasks simultaneously.
Specialization and escalation.
Consult a more capable agent.
Super interesting.
Oh, and you can Holy [ __ ] You can Whoa.
So, per agent you can define a model.
Oh my god. So do you see where this is going? So you can define an outcome, right? And then the outcome could use a multi-agent orchestrator, and you could use the cheaper models in the multi-agent orchestrator that then have to report up to the better models that are more expensive in order to meet the outcome.
That's [ __ ] huge.
I know a lot of people that were trying to do this, but they were like really hacking together their own harnesses to be like, "Oh, Sonnet, you spend a bunch of time doing Sonnet things, and then we're going to call in Opus when we like really need to get it done." But it seems like this is built in now.
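Here's a minimal sketch of that "cheap model first, call in the expensive one only when needed" idea, with a rubric-style grader deciding when the outcome is met. To be clear, this is not the Managed Agents API. `Tier`, `solve_with_escalation`, the rubric, the costs, and the lambda "models" are all hypothetical stand-ins so the control flow can run on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tier:
    name: str                   # hypothetical label, e.g. a cheaper or stronger model
    cost: float                 # made-up cost units per attempt
    run: Callable[[str], str]   # stand-in for "ask this model to do the task"


def solve_with_escalation(
    task: str,
    tiers: list[Tier],
    grader: Callable[[str], bool],
) -> tuple[Optional[str], Optional[str], float]:
    """Try tiers from cheapest to priciest; stop at the first output the grader accepts."""
    spent = 0.0
    for tier in tiers:
        output = tier.run(task)
        spent += tier.cost
        if grader(output):
            return tier.name, output, spent
    # No tier met the rubric; report what was spent anyway.
    return None, None, spent


# Toy rubric: the output must mention every required item.
rubric = ["summary", "tests"]


def grader(output: str) -> bool:
    return all(item in output for item in rubric)


cheap = Tier("cheap", 1.0, lambda task: f"summary of {task}")
strong = Tier("strong", 10.0, lambda task: f"summary of {task} with tests")

result = solve_with_escalation("refactor parser", [cheap, strong], grader)
print(result)
```

In this toy run the cheap tier's output fails the rubric, so the work escalates to the strong tier, and the total spend is the sum of both attempts. The design choice worth noticing is that the grader is separate from the workers, so the same "what done looks like" check applies no matter which tier produced the output.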
Wow.
Wow. Wow. Wow.
Uh, PAI, you know, PAI. We've talked about PAI a bunch.
Personal AI Infrastructure, Dan's project. He just put out 5.0, and I've spent like a day and a half trying to set 5.0 up.
Um it's been a bit of a lift, but but yeah, this is part of their whole thing, right? It's like bring your kind of whole personal context stack to everything that you do. I just find that I get a better outcome because of all this.
But that's what I'm saying, like, Claude is doing very PAI things.
So you define the series of conditions for it to pass among various sub-agents, up to and including Opus. Yeah, that's what it sounds like.
I'm smashing two features that they just talked about together. Like, the outcome is one thing. You can define an outcome, but then in this it's like a multi-agent coordinator. So why couldn't you chain the two? You know, I feel like you probably can chain the two.
Um, so this is super super interesting.
Um, all right. We're in the docs now. Hold on. Let's go back to the announcement because I don't even know what's if I keep scrolling. I don't even know what's new, right?
So, manage orchestration lets a lead agent break the job into pieces.
A lead agent can run an investigation while sub agents fan out through deploy history, error logs, whatever. Right?
And so, if your specialized sub-agents could be on Sonnet, which has a whole different usage bucket, right?
What teams are building? Harvey is a um legal agent company.
They use Managed Agents to coordinate complex legal work like long-form drafting and document creation. With dreaming, their agents remember what they learned between sessions. Completion rates went up 6x.
Netflix's platform team built analysis agents that process logs from hundreds of builds across different sources.
I mean, it's just marketing at that point. Still, that's that seems super super cool. What else is on their blog that just came out?
Oh, yeah. Did you see the financial services announcement? They made finance agent templates.
I've seen a lot of the memes that are like tech guys handing finance bros cigarettes, like, "Oh, your first time? First time that something that you're good at just got built into Claude?"
They're just shipping at like kind of a legendary rate right now, huh?
Like a chain of command or leadership structure. Yeah, I kind of agree.
Be cool to fart around with this in parallel to you guys. Yeah, I think, I mean, Claude is probably one of the more important things to not ignore and, you know, to make sure you get experience with as this stuff starts to spit out.
Um, all right. I think that's about it for the day.
Thanks for hanging out, peeps. Subscribe if you haven't.
Um, venew.com as always if you haven't.
I appreciate you guys. Sorry that you got mopey Matt in the beginning of stream, about burning out on this [ __ ] Uh, but just keeping it real with everybody.
Um and uh yeah, we'll see you guys tomorrow morning.