This demonstration effectively exposes the fragile illusion of safety in current AI agent architectures, where rapid deployment often outpaces robust security. It highlights the critical need for a shift from superficial guardrails to foundational, defense-in-depth strategies.
Deep Dive
I Jailbroke Claude AI AGENTS With Antigravity, Cursor & MORE!
Oh my gosh, this is incredible.
It's worse than before. How could it be worse than before? Oh, I'm not even going to do this. So, yeah, this is actually insane. In this video, I'll be testing out all the best AI agent applications like Cursor, Windsurf, Antigravity, Claude AI, and even more to find out which AI agent is the best at hacking. Each AI agent will be ranked from best to worst, so you can find out the pros and cons of each agent. Along with that, I will even teach you how to protect yourself with an agent that does the opposite and helps build security.
Before we get into that, make sure to like and subscribe, and let's dive right into this. First, we'll be testing out Antigravity. Now, I demoed a bit of this in our previous video, but to quickly recap, this is an AI agent built by Google, which hosts an array of different AI models from Gemini to Claude. The platform seems to be the easiest to jailbreak, so let's see if that's still the case. Okay, so first, we're just going to go into customization. We have this prompt right here, but let's test this out to see if this works. I put FOC.
But it came up anyway. It knew what I was doing, so if I put game hacks, let's see what ends up coming up. As you can see, it's Opus 4.6 thinking.
And it's giving me everything. It's saying what game hacks, what kind of hack, internal, external, engine. It's telling me everything I need. We're going to be trying to switch it up, so we're going to do a new conversation.
This is 4.6 thinking. Let's try Sonnet this time. So, shout out Cook 45 for this one.
And we're just going to go and paste this into the rules right here. Now, my thinking, as I said in my previous video, if you didn't check it out, is that I think Antigravity has looser guardrails, and this is why you can just get away with crazy stuff that Claude wouldn't normally allow.
So, if I put pay cut game hacks, and there it goes, telling me everything I need again. But let's just say, for example, we want infinite ammo. And just like that, it's telling me everything I need about infinite ammo, with approach one being the fastest, and then going to a point of chain and so on and so forth. And you can even prompt this further to, you know, go to other platforms. What do I think of Antigravity? I think it's too powerful, and I think it needs to be patched. I don't know why Google just has the weakest security when it comes to this.
But, the next one's going to be interesting, so let's go into that.
Next, we'll be testing out Windsurf. Not the VPN Windscribe that people confuse it with, but an AI agent built on VS Code that runs Claude and ChatGPT. So, let's check it out. So, I've downloaded Windsurf, and I added the skills already for this. So, I was trying to work on a jailbreak. And long story short, it didn't really work out. But, I found out something really interesting, and I lost time. But, essentially, it works primarily the same way as Antigravity.
It's built the same way.
But, obviously, there's a vulnerability that it's not on Claude. So, I went on to something called workflows, and I pasted the jailbreak in, and it changed the whole game. If I go into /review, and I put diver, which is one of the instructions, and I put SQL injection code.
Now, the reason behind this is, I think, for some reason, there is a vulnerability. And as you can see, just straight away, it did it without thinking, without Claude rejecting it, without Claude trying to think about what it was. It did it because it was run through Windsurf's AI itself. And it's just doing this in full detail, going into it. But, still, regardless, this is quite interesting to look at. For Windsurf, I'm probably going to rank it close to impossible, because it's just basically under Claude's guardrails, if that makes sense. And I think Windsurf has no control over it, versus Antigravity, which I think just makes it vulnerable.
So, yeah, I think Windsurf is close to impossible, if that's just the case.
Now, we'll be looking at Cursor, which was originally built on VS Code. It supports models from OpenAI and Anthropic, and even xAI. Now, let's see how it works against some jailbreaks.
The next one we have Cursor and yes, I did get the pro for this. So, we would be doing this one. So, I think we just click on that.
And I think what I really need to do... Okay, there's a skill here. Let me just paste it in and see how this interacts with it. Okay, it seemed to work well, but there's a bit of a caveat, right?
Because as you can see down below, I think it doesn't have instructions for, like, enter black box.
And I think if I can check the fault, I don't think it mentioned it as well. So, I'm not going to go into adding enter the black box because before it rejected it, but let's just test out how this will work anyway in, like, the educational format. So, I'm going to just put in SQL injection.
It does seem like Claude can actually follow what you're saying if you actually put stuff for educational purposes. Like, this is for genuine educational purposes, right? But normally if I were to ask Claude to do it, it would tell me no, I will not be able to do it even for educational purposes. So, it does seem like it'll be able to do it. I also want to test it out with Opus 4.5 to see if older models have fewer restrictions with the same prompt. SQL injection code, right? It's more explicitly out there, but let's see if this will actually come out to produce it regardless.
So, it's telling me what not to do.
And then it's telling me what to do.
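The transcript does not show the generated output, so the following is only an illustrative sketch of the standard "what to do" side of SQL injection guidance: a parameterized query. It uses Python's built-in `sqlite3` module; the `users` table, its columns, and the `find_user` helper are hypothetical names made up for this example, not anything shown in the video.

```python
import sqlite3


def find_user(conn, username):
    """Look up a user by name with a parameterized query.

    The `?` placeholder makes the driver bind `username` as data,
    so quote characters in the input cannot change the SQL itself.
    """
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cur.fetchall()


# Hypothetical in-memory table, used only to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

print(find_user(conn, "alice"))        # [(1, 'alice')]
print(find_user(conn, "' OR '1'='1"))  # [] (the input is matched as a literal name)
```

The "what not to do" counterpart is building the same statement by pasting user input into the SQL string, which is why the placeholder form is the usual recommendation.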
And let's say for simulated website. It does seem like it still has done it. So, I think Opus 4.7 would have rejected this and said no, we won't be able to help you with a simulated website, but if you do ask it to give you injection code or prompt injection for a simulated website, it would probably produce this, and now this is telling you in full detail how it works, the vulnerabilities and even the defenses, which is pretty cool. Now, for the ranking, I have to put Claude in easier, I guess. I don't think it's too powerful. This is, yeah, pretty decent.
Next is Claude CLI, which is a command line interface for interacting with Anthropic's Claude large language model. But because it's directly in the terminal, I think it would be interesting to try and jailbreak Claude with Claude. So let's see how that ends up going. This isn't necessarily a jailbreaking prompt, and this is something that's actually contributed with Claude, but it is a web security testing markdown skill. Let's go through the whole process. So it's basically pretty simple. There might be some issues, but for the most part it was pretty simple to follow. And you even have an example website that you can test this with. So this is all we'll be doing. Now that we have that pasted, all you need to do is go on slash, and we now have FFuF web fuzzing. Now it comes up with the skill.
Okay, now that this is done, what I'm going to do is paste in the website and just say target. So yes, it did ask me if I have written permission to target the target.
And I'm going to say yes to see if it will vet it or not, and I put yes.
Now it's telling me in full detail what to do, and this is quite interesting. From what I'm seeing, it's telling me to replace the target with your real host. It's telling me the instructions. I guess if this is useful to you guys, you can test it out in more detail. I just wanted to show this little bit, to show you can use it in different ways. Now I'm going to put Claude as, I think, let's put too powerful, because that was a skill that Claude and others worked on, and it didn't really have that much vetting. Like, you could literally just pick any website and just say, "Hey, hack it." And just say, "I have authorization." And it would just give you step-by-step guides on how to do it, which is kind of insane.
Now let's look at Feces, an AI agent which is familiar on the channel, with my most viewed video being on how to jailbreak it with Claude. It seems as of now things are patched, but let's see if it could still work. Okay, so we're back on Feces, and the last time this basically worked. To be honest, it just worked straight up and you could just, like, use this for yourself. So, I'm just going to put anything here and it comes up with different models you can use.
So, I'm going to post it in the AI instruction now and it comes up with this and this is apparently what the instruction was before. We don't want that. And all we are going to do is go back to my old prompt. We're going to go back to this one.
Pasted it in. I'm going to add it in and I'm going to see if it does the same thing.
And it says the API usage has exceeded your current plan limits.
Okay. There we go. I'm back in Feces and it still works. I thought this was patched. What's going on? So, it's even giving me better detail than it was before.
So, I'm going to do give me a task, Jack and I'm going to see what type of task it's going to give me out of all of these things right here. Oh my gosh, this is incredible.
It's worse than before. How could it be worse than before? This isn't even like oh, whatever. This is just straight up bad. Like it built me a whole thing. I'm not even going to do this. So, yeah, this is actually insane. The fox one still works and again, you can find this in the Discord server in the description below.
But yeah, this is insane. I did not expect that to work like that. I thought they would have patched it. It would have been, you know, harder guardrails at least, but this is too powerful. I think this has to be on the top. I can't believe that worked.
But let's actually try and explore the other side. Now that I've demoed and ranked all AI agents, let's look at one that can build security, called Adaptive AI. So, we're currently on Adaptive AI and I've been testing it out. You won't be able to use this for jailbreaking, but this section's for security, and hear me out. I've got a really good idea. I was thinking about what to do and I used Claude for inspiration, and it came up with this really good idea: an encrypted code vault with access gatekeeping. So, it's a zero-knowledge encrypted vault that lets developers share source code with full control over who reads it and for how long, and proves if it was ever accessed without permission.
Meaning, you can gatekeep code. And this is actually kind of sick, right? So, I'm just going to paste this in. I'm going to make a new one. So, I'm just going to say make a new AI agent and we'll paste that in. It's also prompting me that I could actually try and attach this with Slack or webhooks.
But because this is just a demonstration, I will not be going into that in too much detail. But still, this is an insane idea, and let's see how Adaptive produces it. Took some time, but it finally generated this. An encrypted code vault with access gatekeeping. Now, that's just insane in itself, right? The idea of gatekeeping code. And you can put in source code right here. So, you put your own source code, you have the recipient email, when it expires, the view limit of who can view the code, and you encrypt it, and you basically have your key that you can share. And you can even see who has access to it. So, because there's a limit of three, you can only see through there, and you can revoke permissions. You can even go to the recipient access simulator and see how that is and how they would see it from their end, I guess. So, it says it uses the vault ID and key fragment from a share link. The server authorizes the ciphertext access and the browser decrypts locally. So, this is the vault ID that we need, the recipient. This is how it would be, and the URL, and this is what they get in return. Which is just crazy, right? I don't know why no one ever thought of this before. Maybe they have, and I don't know.
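The gatekeeping side of the vault described above (recipient, expiry, view limit, revocation, and a record of who accessed it) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code Adaptive generated: `VaultServer`, `Grant`, `Vault`, and `verify_log` are hypothetical names, the ciphertext is treated as opaque bytes because encryption and decryption are assumed to happen on the client, and the hash-chained log is just one assumed way to make access records tamper-evident.

```python
import hashlib
import json
import secrets
import time
from dataclasses import dataclass, field


@dataclass
class Grant:
    """One recipient's permission to fetch a vault's ciphertext."""
    recipient: str
    expires_at: float   # Unix timestamp after which access is denied
    views_left: int     # remaining permitted fetches
    revoked: bool = False


@dataclass
class Vault:
    ciphertext: bytes                           # opaque to the server; the key never reaches it
    grants: dict = field(default_factory=dict)  # recipient -> Grant
    log: list = field(default_factory=list)     # hash-chained access log


class VaultServer:
    """Hypothetical server that only authorizes access to ciphertext."""

    def __init__(self):
        self.vaults = {}

    def create(self, ciphertext):
        vault_id = secrets.token_urlsafe(8)
        self.vaults[vault_id] = Vault(ciphertext)
        return vault_id

    def share(self, vault_id, recipient, ttl_seconds, view_limit, now=None):
        now = time.time() if now is None else now
        grant = Grant(recipient, now + ttl_seconds, view_limit)
        self.vaults[vault_id].grants[recipient] = grant

    def revoke(self, vault_id, recipient):
        self.vaults[vault_id].grants[recipient].revoked = True

    def fetch(self, vault_id, recipient, now=None):
        """Return the ciphertext if the grant is valid, else None. Always logs."""
        now = time.time() if now is None else now
        vault = self.vaults[vault_id]
        grant = vault.grants.get(recipient)
        allowed = (grant is not None and not grant.revoked
                   and now < grant.expires_at and grant.views_left > 0)
        if allowed:
            grant.views_left -= 1
        self._append_log(vault, recipient, allowed, now)
        return vault.ciphertext if allowed else None

    def _append_log(self, vault, recipient, allowed, now):
        # Each entry commits to the previous entry's hash, so editing or
        # deleting an earlier record breaks the chain.
        prev = vault.log[-1]["hash"] if vault.log else "0" * 64
        entry = {"recipient": recipient, "allowed": allowed, "time": now, "prev": prev}
        entry["hash"] = hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()
        vault.log.append(entry)


def verify_log(log):
    """Recompute the hash chain; False means the log was altered."""
    prev = "0" * 64
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != digest:
            return False
        prev = entry["hash"]
    return True
```

With a view limit of three, as in the demo, a fourth `fetch` for the same recipient returns `None`, and every attempt, allowed or denied, lands in the log. A real implementation would also need authenticated recipients and client-side encryption from a vetted library, both of which this sketch leaves out.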
But still, it's pretty cool, and the capabilities of Adaptive can go into even more detail. So, if you guys want to check it out, make sure to click the link in the description below to access Adaptive for yourself. That's it for the video. If you enjoyed the video, make sure to like and subscribe.
And yeah, click the video on screen.
I've been TWAII. Catch you on the next one.