AI agents can significantly enhance application security workflows by automating code analysis, vulnerability testing, and security report generation. They still require human validation, because their outputs are non-deterministic, they can hallucinate, and they introduce security risks such as prompt injection. Effective implementation involves integrating AI tools with security platforms like ZAP through MCP servers, providing proper context, and maintaining human oversight for sensitive operations.
Deep Dive
From Manual to Automated: Building AI Workflows for Application Security
Okay, we are live. Thank you very much for joining, everyone. My name is Ali and I'm going to be taking you through this workshop. It's going to be quite hands-on.
We're going to be doing a few interesting things, but I expect you to follow along and let me know if you're having issues with any of the tools we're going to be using.
Again, this is just a starter. I'm going to be sharing ideas for things you can do beyond this session, projects you can experiment with. That's really the goal: to get you experimenting with different things. Here we're going to be doing mostly basic stuff.
Before we get started, I'm going to do a brief overview
for anyone who may be unfamiliar with some of the GenAI technologies. I'm trying to share my screen. Oh, actually, I can just share this slide directly. Just one second.
Okay. Is the slide visible now?
>> Yes.
>> Okay. Cool.
So today we're going to be looking at AI and agentic systems, some of the risks around AI security, and how to use it safely. We're also going to talk about the ethical side of things and how to use it as a professional, touch a little on cloud-based versus local AI models, and look at AI-assisted coding and workflow automation.
We're going to be using Vuln Bank to do some live security testing, and we're going to integrate AI tools into Burp Suite and ZAP.
We're going to look at what a workflow looks like for AI-powered security testing, vulnerability analysis, and so on, with practical use cases in application security.
We'll try to keep it as relevant to the real world as possible, so I'll be bringing a little of my own experience into play as well.
Before we go into this, I wanted to flag a difference, because we use the acronym AI a lot. Most of the time when somebody says AI nowadays, they actually mean generative AI. AI has been part of our lives for many years already, including in the cybersecurity world. Machine learning models have been part of our detection tools, behavior analysis, and fraud detection. Any good anti-malware solution has some form of AI integrated with it, because you can't rely only on signatures. Today we're going to be focusing on generative AI powered by large language models, but I just wanted to highlight this difference.
One of the benefits of AI-assisted coding is that it can speed up a lot of tasks. This is something I've seen in my day-to-day. You're able to do more, do things a bit faster, and automate some of the repetitive tasks. If you know what you're doing, you can prompt the AI agent to do things that would have taken you longer, which lets you focus on higher-value work, including some of the strategic aspects. This is not something that should replace your normal thought process or, frankly, your own intelligence. It should augment you, similar to how our existing tools augment our process.
When we talk about generative AI, there's some confusion around what an AI model is, what an agent is, and what a coding harness is. There are so many terms being thrown around under the umbrella term AI. Briefly, the model is the generative AI itself: the large language model. It has billions of parameters and has been trained on huge amounts of text, much of it from the open internet.
In some cases companies have been caught using pirated content, or content without permission, but by and large it's based on content that has already been published and made by humans. What the model does is take an input, analyze it based on its training data, and give you an output. That's pretty much it. It's like predictive typing, where your keyboard predicts the next word you're about to type based on the language and your own past activity. This is like that, but on steroids.
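The predictive-typing comparison can be made concrete with a toy next-word predictor. This is only a sketch of the idea: it counts word pairs in a tiny made-up corpus, whereas a real LLM uses a neural network with billions of parameters. The corpus and function names are illustrative assumptions.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count which word follows which in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Made-up training text, purely for illustration.
corpus = "the scan found a bug the scan found a flaw the scan finished"
model = train_bigrams(corpus)

print(predict_next(model, "scan"))  # "found" follows "scan" twice, "finished" once
print(predict_next(model, "zap"))   # never seen in the corpus, so None
```

The predictor can only echo patterns that were in its training text, which is the same reason the quality of an LLM's output depends so heavily on what it was trained on and what context it is given.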
ChatGPT came out and the world went crazy over the capabilities of large language models. But what makes them really useful for technical work or application security work is when an agent is wrapped around the LLM.
Look at coding, for example. If you wanted to use ChatGPT to write code, you would have to prompt it in the ChatGPT UI, copy and paste the code into your environment, and run it. If you were troubleshooting errors, you would have to go back and forth.
With AI agents, the agent sits within the environment it's being used in and connects to the large language model. It also connects to the tools the agent can use, and it orchestrates everything around the use of AI. You can also have automations integrated with these agents and tools.
Examples of agents are Claude Code, OpenAI Codex, OpenCode (which is open source), Cline (also open source), and GitHub Copilot.
These agents are built with writing code in mind, but they can also be very useful for application security, because a lot of what we do there has to do with analyzing code, making requests, analyzing attack paths, and things like that. You can use them for code review or to analyze findings. For instance, if you run an automated tool, you can use an agent to analyze the output. You can use it to orchestrate different scans, assist with reporting, and even help with offensive security tasks.
Let's also talk a little about the risks. One thing you need to understand about generative AI is that it's not deterministic: you can run the exact same model with the same parameters and the same prompt, and it can come back with different results every time. There are things you can do to reduce this variance in the outputs, but depending on your task that can also reduce the effectiveness of the model.
This is by design. Another thing you might have heard of is hallucination. The LLM itself has no understanding or intelligence to know what's actually right or wrong.
It's literally just taking in inputs and, based on the training data, giving you an output. That's an oversimplification, but that's pretty much how it works. You can't really force a deterministic output out of an LLM, which means it's somewhat unpredictable in what it does, and that creates a security risk.
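The non-determinism described above largely comes from sampling: the model assigns a score to each candidate token and then one is drawn at random. The sketch below imitates that with temperature-scaled softmax sampling. The token list and scores are made up, and real inference stacks add other sources of variance, so treat it as an illustration, not as how any particular provider works.

```python
import math
import random

def softmax(scores, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, scores, temperature, rng):
    """Pick one token: greedy when temperature is 0, random draw otherwise."""
    if temperature <= 0:
        return tokens[scores.index(max(scores))]
    weights = softmax(scores, temperature)
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical severity labels and scores a model might assign to one finding.
tokens = ["critical", "high", "medium", "low"]
scores = [2.0, 1.8, 1.0, 0.2]
rng = random.Random()

print([sample_token(tokens, scores, 1.0, rng) for _ in range(5)])  # varies run to run
print([sample_token(tokens, scores, 0, rng) for _ in range(5)])    # always the top score
```

Lowering the temperature is one of the ways to reduce variance, at the cost of always taking the single most likely continuation, which is the trade-off mentioned above.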
Take prompt injection, for instance. Even if an AI already has a system prompt with lots of instructions on what it's supposed to do and not do, you can still pretty much trick it into doing something that goes against those instructions. Being too trusting of the generative AI's output is also a risk, because you can make wrong decisions based on wrong information.
Then there are insecure AI integrations. If you integrate a tool with your AI agent that's malicious or compromised, that could be a big disaster as well.
There are also concerns about using cloud models.
Depending on the organization you work for, or the kind of work you're doing, you might be violating some privacy guidance or compliance requirement by using that AI model, because the data is going into the cloud. You have no idea what they're doing with that information. They could be using it to train their future AI models,
or it could get leaked somewhere else.
You need to be careful with how you're using it.
With personal projects it's probably mostly fine, but in a corporate environment make sure there's a proper policy around it and that you're following that policy. Providers usually have different terms and conditions for what they do with the data depending on whether it's a personal account or a corporate enterprise account.
So we need to make sure we're validating the outputs generated by AI and that we have human supervision. When it's calling sensitive tools or running sensitive commands, make sure you're actually reviewing them and not just pressing allow. And understand the limitations of the model:
what it's good at and what it's not good at. Try to use tools and environments that you actually trust. A good resource is the OWASP GenAI Security Project. It's worth reading through to understand the risks around generative AI. I've mentioned some of this before: we have different use cases in threat modeling, vulnerability analysis, security testing, reports, generating custom scripts, and so on. One of the tools we could use today is the Burp Suite MCP server, but that requires you to have Burp Suite Pro. So most likely we'll focus on the ZAP version, because ZAP is free and open source. ZAP also has an MCP server that they've developed, which you can use with AI agents.
You can also integrate it with any CLI tool.
I talked a little about this already, but I need to emphasize it here: when you're using any AI tools, or doing anything really, make sure you're respecting the scope and authorization.
With AI models, part of the work is going to be done in the background.
You need to have enough guardrails in place to make sure no exploitation is being done outside the scope of whatever engagement you're doing. We're going to look at how to enforce this as well.
What we said about customer data and organization data still applies here, and so does validating issues. One thing generative AI tends to do is blow things out of proportion. If you ask it to do a particular task and it finds something that even remotely looks like a potential vulnerability, it might report that as a critical. Or it might be a true positive but not as bad as it claims. Or it might have the wrong reproduction steps. There are so many things that can go wrong here. Just make sure you're actually validating the results yourself before reporting anything.
So let's do some hands-on demonstration.
I'm going to share my screen.
Give me one second. Actually, are there any questions regarding anything we've discussed so far?
>> No, no one has asked any questions yet.
I'll ask them to start dropping their questions if they have any.
>> All right, no worries.
Cool. Is my screen visible?
>> Yes, it is. Yes.
>> Okay, cool. We're going hands-on now. One of the things I want you to do, if you haven't already, is create an OpenRouter account.
You can use any provider actually, but OpenRouter is... Oh, okay. Let me see if I can increase the size.
>> Yeah. Someone said it's too tiny. They can't see the...
>> Yeah. I'll try to increase the size now.
>> Okay.
I hope the text size is much better now.
Cool.
>> Let's hear from them.
>> Okay.
Okay. I think it should be better now, but let me know if it needs to be increased again.
>> Okay. Okay.
>> All right. If you haven't created an OpenRouter account yet, you can create it now and get your API key. OpenRouter has a few free models that you can use.
Let's go to the pricing page. I'll show you.
Yeah, so there are some 25 free models that you can use.
Then let's also install OpenCode. I'll send the link. I'm trying to send a comment now, but it's not working. I'll send it to you privately, so maybe you can send it to them.
>> Okay.
So OpenRouter just allows you to use different AI models. You can sign up for Anthropic's Claude Code if you want, but this gives you more flexibility in the AI models you can use, and it offers some free models as well. Then there's OpenCode, which is a free and open source alternative to Claude Code, Codex, Copilot, or whatever the case may be. It's not locked down to any specific provider. Claude Code is kind of locked down to Anthropic, although there's a way to make it access other models using the configuration file. OpenCode just straight up allows you to use whatever you want, including OpenRouter.
You can download the terminal app, the desktop app, or the extension for whatever coding tool you're using.
I thought I had already installed it, but I haven't actually. I'm just going to install the desktop version now.
Has anybody made progress with setting up OpenRouter and installing OpenCode?
>> Okay, no comment yet. I'll let you know if anyone has.
>> Okay.
All right. While the desktop version of OpenCode is installing, I already have the CLI version set up,
so I can actually open it in VS Code now.
>> Okay. Someone said they have OpenRouter set up and they're currently on OpenCode.
>> Okay. All right. Nice.
I'm assuming a lot of you already have ZAP installed, but if you don't, try to install it as well, because we'll be using the... Yeah, it's fine. You can download even the beta version of the desktop app; it still works.
I've used it before. Or you can use the terminal version as well. That's also fine.
If you don't have ZAP, install it. If you already have it installed, you can go to the Manage Add-ons screen
and then install the MCP integration. If you go to the Marketplace and search for MCP, you can select and install it. I think I can't select it because I already installed it.
Let me see.
Actually, I haven't.
I'll just restart ZAP.
Go back to add-ons.
Yeah, the MCP integration has been installed.
If you haven't used any MCP tools before, MCP stands for Model Context Protocol. It basically allows you to expose certain functionality from any tool to an AI agent.
So if I go to OpenCode, for instance, and start chatting, I can ask it to show me all the available MCP servers.
I'm not sure if any of them has been enabled, but let's see.
It can find all the MCP servers you have available. None has been configured just yet, I think. Through an MCP server the agent can grab information, run commands, and things like that. The ZAP MCP server allows us to use different ZAP functionality, like the history, the context, triggering a scan, and so on.
So instead of coming into the UI or using Docker to manually trigger a scan, you can go straight into whatever AI tool you're using and call the ZAP MCP server. The ZAP MCP server is still in alpha, I believe, but for the purposes of this demo we can still use it.
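Under the hood, MCP messages are JSON-RPC 2.0 requests: an agent typically asks a server which tools it exposes (`tools/list`) and then invokes one (`tools/call`). The sketch below only builds those messages so their shape is visible; it does not talk to ZAP. The tool name `start_spider` and its `url` argument are hypothetical placeholders, not confirmed names from the ZAP MCP add-on.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id

def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request message as a plain dict."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message

# Ask the server what tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Call one tool. The tool name and arguments here are hypothetical.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "start_spider", "arguments": {"url": "http://localhost:5000"}},
)

print(json.dumps(list_tools, indent=2))
print(json.dumps(call_tool, indent=2))
```

In practice the agent handles this exchange for you, which is why asking it to list the available MCP servers and tools is a quick way to see what it can actually do.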
What's the progress?
Has anyone got OpenCode running yet?
Another benefit of OpenCode is that it has free models included. They won't be as powerful as any of the paid models, but you can definitely start using it with the free models that are included.
In the interest of time, I'm going to get started, and if you get to any point where you need support, we can definitely go back.
I'm going to open Vuln Bank in VS Code. I believe if I open OpenCode here, it's going to go straight to... Yeah, it goes straight to Vuln Bank. One of the things we can do here is analyze the codebase. But instead of analyzing the entire codebase, which would take up a lot of the context, we can look at a specific functionality within Vuln Bank.
So let's say: analyze the transfer functionality.
After installing OpenCode, you can run it within any folder. To make it easy, if you have VS Code or another editor, you can run it from the extension, which is what I've done. Does everybody have ZAP installed? Okay, I'll send you the link to install ZAP.
Okay, it's not going through from my end again.
For those who already have ZAP installed, try to install the ZAP MCP server.
And if you don't have ZAP installed, try to install it.
So again, I just sent a simple prompt at the beginning: analyze the transfer functionality within Vuln Bank for potential vulnerabilities. That gives it something specific to look at instead of asking it to look at the entire codebase, although sometimes that might be what you want to do.
Justin Boss 4131, are you using the default model that comes with OpenCode or something else?
And is it the installation that's taking time or something else?
It might be an internet issue.
With any of these agents you can always go back and look at the quote-unquote thought process and see how it decided some of these things.
Here it has found a critical vulnerability: negative amount transactions, or balance inversion. This is something I've tested with the actual Vuln Bank application, and I know it's true. You can send a transfer with a negative amount, and instead of deducting money it increases your balance, because the code only checks whether your balance is greater than or equal to the amount before performing the transaction.
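The flaw is easy to reproduce in isolation. The sketch below is not Vuln Bank's actual code; it is a hypothetical in-memory version of the same logic, with made-up account names, showing why a balance check alone lets a negative amount through and what a minimal fix looks like.

```python
def transfer_vulnerable(balances, sender, recipient, amount):
    """Only checks balance >= amount, so a negative amount passes the check."""
    if balances[sender] >= amount:
        balances[sender] -= amount      # subtracting a negative adds money
        balances[recipient] += amount   # adding a negative takes money away
        return True
    return False

def transfer_checked(balances, sender, recipient, amount):
    """Same logic, but rejects non-positive amounts before touching balances."""
    if amount <= 0:
        raise ValueError("transfer amount must be positive")
    if balances[sender] < amount:
        return False
    balances[sender] -= amount
    balances[recipient] += amount
    return True

balances = {"attacker": 100, "victim": 100}
transfer_vulnerable(balances, "attacker", "victim", -50)
print(balances)  # {'attacker': 150, 'victim': 50}

try:
    transfer_checked({"attacker": 100, "victim": 100}, "attacker", "victim", -50)
except ValueError as error:
    print("rejected:", error)
```

Reproducing a finding in a few lines like this is a cheap way to confirm that the agent's explanation of the bug actually holds before reporting it.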
It says it has found a race condition as well.
The good thing about installing the OpenCode extension instead of just using the CLI is that you can open the specific file, and it takes you to the specific line it has flagged, instead of you having to look for the file manually.
So you can easily use that to validate the findings as well.
It has found a bunch of stuff, because the transaction function has so many things to find. So that's one use case: you can use it to analyze specific parts of the codebase. You can also plug in tools like Snyk or Semgrep if you want to run deterministic static analysis before running it with AI, because you still want that baseline of predictable results before you add the AI analysis on top.
Has anybody been able to install the ZAP MCP server?
Let me try to enable it from my end so that we can test it.
Let me do a quick search, because I can't find it here.
Options, MCP integration. Okay.
Yes, if you opened it as a folder in OpenCode, you can do that. It will have the same effect.
What I've done is different because I didn't have the desktop version installed yet. It was taking time, and it's done now. I already had the CLI version installed, so I installed the OpenCode extension in VS Code so that I could just open it here. Because I have Vuln Bank open in VS Code, it automatically uses that as the current folder. If you've opened it as a project in OpenCode, that also works. I think I can even do that now, since my OpenCode desktop app has installed.
Regardless of where you open it, it still remembers the context from what you've done elsewhere.
Okay. For the ZAP MCP server, go to ZAP and open Manage Add-ons. Is there a way to increase the icon sizes here? I'm not sure, but it's this icon.
Then go to the Marketplace, search for MCP, and install it from there.
I've already installed it, so it's not showing up in the Marketplace. It's showing up under Installed.
If you want a text-based guide to installing and using the ZAP MCP server, I'll share a link to that as well.
Cool. Let's go back to ZAP and try to enable this. I haven't done this before, by the way. I've done it with the Burp MCP server, but I've never used the ZAP one, so we're learning together.
Was it Tools or Options?
It said, yeah, install.
Okay. Options, then MCP integration.
So if we go to Tools, Options, and search for MCP, there's MCP Integration. So we can enable the MCP server.
We can define which port it uses.
By default, it's set to 8282.
You can define or generate a security key as well.
You can also say it should require the security key in the Authorization header, and you can set it to secure only, so it rejects plain HTTP connections. I'm going to disable both of these because I'm running all this locally within a virtual machine, so I don't see a need for this authorization. But if you have ZAP running as a service across multiple devices or across a network, then you definitely should use the security key.
That's why I've disabled it.
Now let's see how we add an MCP server to OpenCode.
There's an OpenCode config file, opencode.json, and under the MCP section you can add the server.
You know what? I think I opened it in the desktop version, so let's use that.
You can try just using the AI model itself to add an MCP server. Let's see if that works.
What system are you using? Windows, Linux, or macOS? Yeah, you do need to install Java. Sorry about that.
I'm using Kali Linux in a virtual machine, so it already comes with ZAP installed.
If you're using Ubuntu, installing Java should be straightforward. You can search for how to install Java on Ubuntu.
I think it was 8282, right?
Yes.
Now it's asking me for permission because it's accessing a file outside the project directory. This is one of the security features of these AI agents. I'll say allow once, because I don't want it accessing this file without my permission.
For those of you who already have ZAP installed, have you been able to add the MCP server?
Here we can review the changes it has made as well. It has just added the ZAP MCP server here, which is a localhost server, and that's it.
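For reference, the entry the agent added can be sketched as a small script that prints the JSON. Everything specific here is an assumption to check against the OpenCode and ZAP documentation: the key names (`mcp`, `type`, `url`, `enabled`, `headers`), the server label `zap`, and the URL, whose exact path on port 8282 was not shown in the session.

```python
import json

def build_opencode_config(port=8282, security_key=None):
    """Return a config dict with one remote MCP server entry (assumed schema)."""
    server = {
        "type": "remote",                   # assumed: server is reached over HTTP
        "url": f"http://localhost:{port}",  # assumed URL; the real path may differ
        "enabled": True,
    }
    if security_key:
        # Only relevant when ZAP is set to require the key in the Authorization header.
        server["headers"] = {"Authorization": security_key}
    return {"mcp": {"zap": server}}

# Local VM setup from the session: default port, no security key.
print(json.dumps(build_opencode_config(), indent=2))
```

Writing the entry by hand, or at least reviewing the diff as done here, keeps you aware of exactly which servers the agent is allowed to reach.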
Let's go back to ZAP now. Before you start using the ZAP MCP server to do all sorts of scans, one thing that's definitely recommended is to run some kind of reconnaissance.
Just because there's AI involved doesn't mean you can skip steps. Reconnaissance and gathering context are still very important. So let's run Vuln Bank locally, and since we're already in OpenCode, why not do it from there. By the way, if you haven't joined yet, I would recommend joining the TAN WhatsApp community.
Lots of good stuff is happening there, including more sessions that the TAN team is going to be organizing. We can also continue the conversation there
if there's anything we're not able to touch on within the two hours we have today.
We've been able to run Vuln Bank just by typing "run Vuln Bank", even though the installation steps are actually quite straightforward.
We can go to the application here (always use dark mode), and we can also go to the API documentation, which is very important. One of the things we'll do is provide the API documentation to the model.
But in terms of reconnaissance, you would still do some manual exploration.
So: localhost:5000, and launch the browser.
This is the built-in browser within ZAP, which automatically routes all your requests through ZAP. You can come here and explore the target.
Actually, while we're doing that, let's edit the context to include our scope.
We can include localhost:5000. I'm not sure which port the API is using, but I can just try to log in. Oh.
So it has made a GET request to login.
Okay, it's the same port. If we go back to the context and include localhost:5000.*, it will include everything here in the context. This is very important, because you don't want to be doing anything that's outside of scope, so use this to define what the scope is. You also need to use every single functionality so that ZAP can generate an accurate map of the target. Without this you might miss functionality that has very important vulnerabilities to try to exploit. So we're going to create an account.
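The same scope idea can be enforced in your own scripts or agent wrappers before any request goes out. The sketch below compares the include pattern from the session, written as a regex with an assumed `http://` scheme, against a stricter host-and-port check. One thing it makes visible is that a pattern ending in `5000.*` also matches port 50000, so an explicit allowlist is the safer guardrail. The function and constant names are illustrative.

```python
import re
from urllib.parse import urlsplit

# The include pattern from the session, as a regex (scheme is an assumption).
LOOSE_SCOPE = re.compile(r"http://localhost:5000.*")

# Stricter alternative: explicit (host, port) pairs that are in scope.
ALLOWED_TARGETS = {("localhost", 5000)}

def in_scope(url, allowed=ALLOWED_TARGETS):
    """Return True only if the URL's host and port are explicitly allowed."""
    try:
        parts = urlsplit(url)
        target = (parts.hostname, parts.port)  # .port raises ValueError if malformed
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and target in allowed

for url in (
    "http://localhost:5000/login",
    "http://localhost:50000/admin",
    "http://example.com/",
):
    print(url, "| regex:", bool(LOOSE_SCOPE.fullmatch(url)), "| strict:", in_scope(url))
```

A check like this can sit in front of whatever the agent uses to send requests, so an out-of-scope URL is refused even if the model suggests it.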
We've created an account and logged in.
We'll also try to use as many functionalities as possible: update the bio, do some transfers and loan requests,
create a virtual card, make bill payments.
I think we've used pretty much all the available functionality within the user dashboard except the chatbot.
For now we're going to ignore the chatbot, because that's a different thing entirely.
Now we go back to ZAP.
It should have built a pretty good picture of the target: all the requests being sent to the back end, the transactions. Even if you have the API specification, it's not always enough, because there might be requests that are not properly documented, which could be really good targets to exploit.
Now that we have this, let's go back to OpenCode and ask it to do some analysis.
We're asking it to analyze the target using the API documentation, which is available here, and the ZAP history, which it will request from ZAP through the MCP server.
We can also ask it to create a security testing plan.
What we're doing here is just using the raw agent prompt to do our activities. If we wanted, we could also create custom skills with much more detail and guidance on what it should do in each scenario.
But before doing that, I would recommend exploring multiple AI tools without so much additional prompting. See what it does right and what it does wrong. Over time you're going to build up more knowledge that you can then use to create your own custom skills and tools.
So it's been able to fetch the ZAP history.
No, it has not.
It's only been able to get the ZAP history summary, I think.
Oh, okay. It got all the endpoints, I think. So it's doing an analysis on the authentication endpoints, and it's telling us we should test for BOPLA (broken object property level authorization), or mass assignment, which is very valid.
We should test for SQL injection.
It's actually done something I didn't ask it to do. It's looked at the ZAP sites tree, which is good. It's looked at the OpenAPI specification, which I asked it to do, but it's also looked at the alerts that ZAP has raised. If we go back to ZAP and look at Alerts, we'll see it has reported some interesting findings that the agent can use to explore further.
So it's using that information as well.
At the same time we have the code here, so it's looking at the source code as well. It's gathered all that context and it's telling us these are the vulnerabilities we should test for. One thing, though: I don't think the ZAP MCP server allows us to make custom requests.
Let's look at the tools: info, create context, start spider, stop spider, start active scan. Which is fine.
But what we want is to give OpenCode the power to run specific tests. To do that, it needs to be able to send requests. So I'm just going to ask it to do it. Most likely it's going to use curl in the background.
Let's ask it to test the registration endpoint for mass assignment, or BOPLA.
Another thing that makes this interesting is that it knows about this is_admin field. Either it looked at a response in the ZAP history or it looked at the schema within the API specification. But no, it's not documented here, so most likely it looked at the history.
That's why you can't rely only on the documentation. Usually what's missing from the documentation is super valuable when it comes to testing. So we've asked it to test the registration endpoint for mass assignment/BOPLA, and we'll see what it comes up with. By the way, how's it going? Where have you been able to get to? Do you need any support, or are you just spectating at this point?
Okay. So we have a true finding. Basically, it went and made a curl request to the registration endpoint, setting this is_admin parameter, and it analyzed the response and saw that, okay, it's actually true. So using this we can validate that this is a correct finding. It also tested overriding the starting balance for the account, successfully, and basically everything here works according to the AI. If you want to verify further for yourself, we can also go back to ZAP, go to this registration endpoint, send it to the Requester tab, and then modify this request to say something else. Where is it?
This is not the request I'm looking for.
This is a GET. So, the POST request: username and password, send request.
So let's say sunman one, is_admin is equal to true.
So, it's coming back with a 401.
Oh, sorry. I went to login instead of register.
So, we register. We can see that it worked, and it's validated to us that the vulnerability is present. So basically the AI agent was able to exploit BOPLA, and it has given us enough information to validate the vulnerability ourselves. But again, we did the initial reconnaissance, we asked it to look at potential vulnerabilities to test for, and then we asked it to test this specific one. One thing that can also be done is to ask it to carry out the entire testing plan, and it will go step by step, exploiting these vulnerabilities and testing them out.
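The mass assignment check described above can be sketched in a few lines of Python. This is a minimal sketch, not what the agent actually ran: the `/register` path, the port, the JSON body shape, and the idea that the server echoes the stored fields back in its response are all assumptions for illustration. The request is only built here, not sent, so it is safe to run anywhere; only send it to a lab target you own.

```python
import json
import urllib.request


def build_register_request(base_url, username, password, extra_fields):
    """Build (but do not send) a registration request that carries extra,
    undocumented fields alongside the documented ones."""
    body = {"username": username, "password": password}
    body.update(extra_fields)  # e.g. {"is_admin": True}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/register",  # hypothetical path
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def accepted_fields(extra_fields, response_json):
    """Return the extra fields that the response echoes back with the value
    we sent. Assumes the API returns the stored fields in its response."""
    return {k: v for k, v in extra_fields.items() if response_json.get(k) == v}


if __name__ == "__main__":
    # Hypothetical local lab address; adjust to your own target.
    req = build_register_request(
        "http://localhost:5000", "tester1", "Passw0rd!", {"is_admin": True}
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

If `accepted_fields` returns a non-empty dictionary for a real response, that is a hint worth confirming by hand, for example by logging in as the new user and checking whether the privilege actually applies.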
I haven't gotten any feedback. How is it going? Do we have any challenges?
Has anyone been able to do any tests?
Okay, while we're waiting for some feedback, I'll ask it to continue the testing.
So if you're not able to set up ZAP successfully, you can skip that part for now and just use the API specification and ask it to do the same thing: create a testing plan.
So as an example, one of the things it's trying to test for is JWT bypass.
And the way it essentially does this is by writing Python scripts on the fly to run JWT operations, like manipulating tokens, generating tokens, and so on, which is interesting.
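To give a feel for the kind of throwaway script an agent might write for this, here is a minimal sketch using only the Python standard library. It signs and verifies HS256 tokens, tries a short list of candidate secrets against a token, and re-signs modified claims once a secret is found. The claim names, the secret, and the candidate list are made up for illustration and are not taken from the session.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: str) -> str:
    """Create an HS256-signed JWT for the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(claims, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = ".".join(parts)
    sig = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)


def verify_hs256(token: str, secret: str) -> bool:
    """Check whether the token's signature matches the given secret."""
    signing_input, _, sig = token.rpartition(".")
    expected = hmac.new(secret.encode("utf-8"), signing_input.encode("utf-8"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected).encode("ascii"), sig.encode("utf-8"))


def read_claims(token: str) -> dict:
    """Decode the payload without verifying it."""
    return json.loads(b64url_decode(token.split(".")[1]))


def guess_secret(token: str, candidates):
    """Return the first candidate secret that verifies the token, else None."""
    for candidate in candidates:
        if verify_hs256(token, candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical token issued by a lab application with a weak secret.
    token = sign_hs256({"user": "alice", "is_admin": False}, "secret123")
    found = guess_secret(token, ["password", "changeme", "secret123"])
    if found is not None:
        forged = sign_hs256({**read_claims(token), "is_admin": True}, found)
        print("secret found; forged token verifies:", verify_hs256(forged, found))
```

A script like this is deterministic, so once the agent has written it you can rerun it yourself to confirm the result rather than trusting the agent's summary.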
So, it's completed the tests and created a result summary.
But I'm also going to ask it to generate a full report so that we understand what it actually tested, what it tried, the reproduction steps, and everything.
So, Justin, no, I didn't use a security key.
Let me go back to ZAP. I'll show you what I did.
Under Tools, Options, and MCP integration, I removed the requirement for a security key, because I'm running this in a virtual machine locally on my device, right? There's no network connectivity to this ZAP instance from outside this VM.
So for that purpose I don't need the security key.
So Justin, what's the update from your end? Have you been able to set it up now?
Cool. So now we have the security test report that it's generated.
We can open the file. I think in VS Code you can also open it in preview mode.
But I don't think OpenCode has that right now, so I can't open the Markdown file in preview mode, but that's fine.
Right. So my first prompt to OpenCode was to add the ZAP MCP server running on port 8282.
And it looked for the OpenCode configuration file and added the ZAP MCP server configuration, so that you can use the ZAP MCP server with OpenCode.
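What that configuration change amounts to can be sketched as a small Python helper that adds a remote MCP server entry to a configuration dictionary. Treat this as a sketch under stated assumptions: the `mcp` key and the `type`, `url`, and `enabled` fields reflect my understanding of OpenCode's configuration schema and should be checked against the current OpenCode documentation, and the exact URL path of the ZAP MCP endpoint was not given in the session, only the port 8282.

```python
import copy
import json


def add_remote_mcp_server(config: dict, name: str, url: str) -> dict:
    """Return a copy of `config` with a remote MCP server entry added.

    The key names used here ("mcp", "type", "url", "enabled") are an
    assumption about the OpenCode config schema, not confirmed by the session.
    """
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    servers = updated.setdefault("mcp", {})
    servers[name] = {"type": "remote", "url": url, "enabled": True}
    return updated


if __name__ == "__main__":
    # Port 8282 is from the session; host and path are assumptions.
    new_config = add_remote_mcp_server({}, "zap", "http://localhost:8282")
    print(json.dumps(new_config, indent=2))
```

Reviewing the printed JSON before writing it to disk is a small version of the same habit recommended throughout the session: look at what is about to change before approving it.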
Yeah. So, it's going to ask you for permission because it's accessing a file that's outside the project.
So, if you're comfortable with the command or the file that it's about to access, then you can allow it. I'm guessing it's asking to access this file or this folder, which is what you asked it to do. So if that's what it's doing, you can allow it.
Nice. So, what did I do next? Yeah, the next thing I did was ask it to run VulnBank so that we can access VulnBank locally.
But if you can't run VulnBank locally, maybe due to limited resources, or maybe you don't have Docker installed or something, you can use vulnbank.org, which is a hosted version of VulnBank. I'm using it locally, but if you want to use VulnBank without running it locally, you can just go to vulnbank.org.
So that's your choice.
What did you run? It's still thinking?
Oh. Do you have Docker? Okay.
So, yeah, it might be taking a while because it has to pull the Docker images.
Is it showing you what it's doing, or is it just thinking?
So it will pull the Docker images, run them, and then verify that it's running. But if that's taking too long, then maybe just use the hosted version.
I'm going to ask it to create an HTML report to make it a bit friendlier to look at.
Cool. Let me know when that's done, Justin. If you're able to run it locally or access the hosted version, then the next step is to add it to your ZAP context.
Yeah, add it to your ZAP context and then do a manual explore with ZAP's built-in browser so that you can build a site tree of the target and build up the HTTP history.
So yeah, what's next is to run manual explore within ZAP.
I'm going to close this and start again. When you run manual explore within ZAP, you can provide it with the URL that you want it to access.
Then explore. Try to use as much of the functionality as possible so that ZAP can build a good picture of the target.
So let me know when you've done that so that we can take it to the next step. So, are you running ZAP in a container, or is the AI hallucinating?
I thought you were using the ZAP UI.
How did you install ZAP?
Oh, I see. I don't know. Maybe it has something to do with Snap.
Maybe the Snap installation doesn't allow you to run it.
So I think you should skip ZAP for now and just use the API documentation for the testing plan.
I've sent you the prompt that works without the ZAP history.
Okay, that means you have VulnBank running in Docker, don't you? Okay. Yeah. Then just do it without ZAP.
So, okay, we're going to take a short five-minute break. Come back at 3:35 GMT+1, and we're going to look at the report that's been generated and then talk about other potential use cases that you can experiment with.
So, you can use this break to catch up if you haven't gotten to this stage yet.
Again, what we've done so far is add the ZAP MCP server. I did a manual explore, but if you don't have ZAP running, you can skip the manual explore and just create a testing plan using the API documentation, without the ZAP history.
And then you can ask it to test any specific endpoint, or you can just ask it to run the entire testing plan.
So yeah, we'll reconvene in five minutes.
Okay, we're back from our short break.
Justin, before we went on break, you were trying to run the testing plan, right? How's that going?
Okay. Is it working out fine? I think that should be quite fast.
Okay. Were you able to troubleshoot that then?
Is it working now?
Okay. Yeah. Nice. So, you're almost where I'm at. You can ask it to run a specific test, you can choose any one, or you can just ask it to run the entire test plan. What I did was ask it to run this BOPLA check first, and then I asked it to go ahead and do the entire thing. But I think you can just ask it to do the entire test.
So, I've asked it to generate an HTML report for the test. And yeah, this is the report.
Again, you can manually validate any of the findings by using ZAP to run those requests.
Again, it's something that you can easily validate because it's showing you how it arrived at that conclusion, right? So with this BOPLA finding, it's showing us the specific request that it used. You can copy the raw HTTP request and paste it into the Requester tab in... where's Requester? Yeah.
In the Requester tab in ZAP.
Yeah, here we go.
Okay. I think it's not in the format that ZAP expects. Oh, I see. Yeah, you can try out any of the exploits to manually verify by yourself, which is what you should be doing anyway.
And yes, it can. So with this BOPLA, for instance, the exploit or PoC is basically adding these values to the request. That's pretty much it.
If it was a different type of vulnerability, for instance this SQL injection, the PoC is just this request, right? So it can run requests using curl as well. If we're using the Burp MCP server, that also allows it to run requests, and you can see the history as well.
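When a raw request copied from a report does not paste cleanly into a proxy tool, one option is to convert it into a curl command and replay it that way. The following is a minimal sketch of that conversion in Python, assuming a simple HTTP/1.1 request with an origin-form path and a `Host` header. The sample request in the demo is made up, not copied from the session's report.

```python
import shlex


def parse_raw_request(raw: str) -> dict:
    """Split a raw HTTP/1.1 request into method, target, headers, and body."""
    head, _, body = raw.replace("\r\n", "\n").partition("\n\n")
    lines = head.strip("\n").split("\n")
    method, target, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {"method": method, "target": target, "headers": headers, "body": body}


def to_curl(parsed: dict, scheme: str = "http") -> str:
    """Render the parsed request as a shell-quoted curl command."""
    host = next(v for k, v in parsed["headers"].items() if k.lower() == "host")
    cmd = ["curl", "-i", "-X", parsed["method"], f"{scheme}://{host}{parsed['target']}"]
    for name, value in parsed["headers"].items():
        # curl sets Host and Content-Length itself.
        if name.lower() in ("host", "content-length"):
            continue
        cmd += ["-H", f"{name}: {value}"]
    if parsed["body"]:
        cmd += ["--data-raw", parsed["body"]]
    return shlex.join(cmd)


if __name__ == "__main__":
    # Hypothetical request; host, path, and fields are illustrative only.
    raw = (
        "POST /register HTTP/1.1\r\n"
        "Host: localhost:5000\r\n"
        "Content-Type: application/json\r\n"
        "\r\n"
        '{"username": "tester1", "password": "Passw0rd!", "is_admin": true}'
    )
    print(to_curl(parse_raw_request(raw)))
```

Reading the generated command before running it keeps the manual verification step in your hands rather than the agent's.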
So yeah, depending on the type of... What kind of permission denied error are you getting? Can you send more details about the error you're getting, Justin?
So something interesting that it's done here is chaining multiple vulnerabilities, right? It's able to use the fact that there's a weak JWT secret and an SSRF vulnerability to extract the secret. There's an internal endpoint for secrets, but it's reachable through the upload profile picture URL. By forging a request to this endpoint through here, it's able to get the secret and then use the secret to take over pretty much any account.
So, really interesting. How's it going, Justin?
Another interesting thing the agent did is it attempted prompt injection through the AI agent that the application has, and it was able to use it to get sensitive information that it's not supposed to be able to get.
So we're about 15 minutes to time. I'm going to stop this particular demo and go back to the slides. But before I do that, there are lots of other tools that you can integrate, right? There's something called Socket, which basically does software composition analysis: it finds out if you have vulnerable dependencies and so on. They have a free MCP server that you can add as well to analyze any of the packages that you have.
So yeah, it's quite straightforward to install, and you can use it to check the security of any of the dependencies that you have.
There's also Ghost Security, which is another really interesting tool. By default it's designed to work with Claude Code, but I'm pretty sure there's a way to make it work with OpenCode. That could be another project you can experiment with. It basically combines multiple tools. It has a secret scanner that's a real binary that runs a deterministic secret scan. It also has a dependency scanner.
From my experience, I still prefer using Snyk or Socket for the secret scanning. So if you're building a workflow that uses Ghost Security, for instance, you can swap out any of these tools. I think the most impressive part of it is that it has different skills and tools, and it has an orchestration skill as well that can orchestrate an entire security review. So this is an interesting one to check out and test as well. There's also BugTrace AI. Ah, I think I... Okay, it's BugTrace-AI.
But it's been archived.
So, there's a new version of it that's been developed.
But yeah, this is another offensive security testing tool that you can try out. There are so many tools out there that I definitely want to encourage you to try, but the most important thing is to try them out in a safe environment. Check out the code and make sure that it's an actual, legitimate project, right? Check the reputation.
See if there are any legitimate websites that talk about the project.
Don't just run whatever, unless you're running it in a sandbox.
Yes, it's definitely interesting trying out all these different tools, seeing how they work, seeing their capabilities. The more of them you test, I think the better you'll be able to understand the capabilities and what's possible, and from there you can start using your creativity to do more.
Another thing: with something like OpenCode or Claude Code or any of these agentic applications, you don't have to link it with an MCP server. If you have a tool that's just a command-line tool, you can still run it, because these agents have terminal access that they can use to run some of these tools.
So you can definitely try that out and see how that works.
So, I'm going to stop sharing my screen and go back to the rest of the slides.
Thanks.
So, yeah, we've done some hands-on demonstration.
It didn't go perfectly, but I think there are some lessons that we can take from it for our next workshop or demo.
And we've been able to see how powerful this kind of tool is when it comes to automating tasks or doing specific tests. If you know what to look out for, you can guide the tool into doing some of the heavy lifting.
And yeah, these are just some ideas for projects that you can experiment with.
You can build a threat modeling assistant that's relevant to a specific organization's environment.
You can find the part of your workflow that's repetitive and automate it. You can test different agentic security tools, like I mentioned earlier. At some point you can get to where you can build a Slack bot to automate an AppSec process.
I think what matters at the end of the day is that some part of it has to be based on deterministic tools, tools that have repeatable outcomes.
The AI part of it is helpful with automating defined tasks, right, when you have specific things that you want to do. It's also good at generating ideas. We generated the test plan there with ideas for different types of tests or attacks that we can run on different endpoints. So there's a lot that you can do with it, and the more you test it out, the more you'll be able to do, because you can try out different things.
So yeah, you shouldn't end up in a situation where you're just asking AI to do stuff without understanding what's going on in the background, right? You still need to have that background understanding.
It shouldn't replace your thinking, right? You should still be the one in control and guiding the tool, instead of it driving everything.
And we mentioned this earlier as well: you still need to validate your findings. You still need to provide it with context; it can only be as good as the context that you provide. So try to provide it with as much context as possible, and make sure you're still reviewing it. Don't go into YOLO mode and allow it to do whatever it wants. Try to make sure you're checking everything that it's doing and approving any kind of sensitive action. So, yeah, if you have any questions, feel free to ask before we close out the session.
Yeah, thanks Justin. I think you've been the most active participant here.
Abigail, I think someone asked for the WhatsApp link. I don't know if you can provide that, or maybe the people that registered can get the link. That would be nice.
So yes, that's one of the most sensitive or most dangerous parts of these agentic AI tools: the fact that it can do things on your machine, and the decisions on what to do are sometimes made by the AI. The commands that it runs are created by the LLM as well. So if it makes a mistake, or if it gets some kind of indirect prompt injection, then a lot of bad things could happen. That's why you need to review the command and make sure you're comfortable with it. Would you be running that command on your computer if you were the one typing it? That's why it still requires that kind of approval.
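One way to support that review habit is a small pre-approval check that flags commands worth a closer look before a human approves them. The sketch below is only an illustration of the idea: the list of risky programs, the shell operators it looks for, and the project-root rule are arbitrary choices made for this example, and a filter like this is a reviewing aid, not a security boundary or a replacement for reading the command.

```python
import posixpath
import shlex
from typing import List

# Illustrative list only; extend or replace it for your own environment.
RISKY_PROGRAMS = {"rm", "sudo", "chmod", "chown", "dd", "mkfs"}
SHELL_OPERATORS = ("|", ";", "&&", "$(", "`", ">")


def is_inside(path: str, root: str) -> bool:
    """True if the absolute POSIX path is the root or lies underneath it."""
    path, root = posixpath.normpath(path), posixpath.normpath(root)
    return path == root or path.startswith(root + "/")


def review_reasons(command: str, project_root: str) -> List[str]:
    """Return human-readable reasons why a command deserves a closer look.

    An empty list means nothing was flagged, not that the command is safe.
    """
    try:
        tokens = shlex.split(command)
    except ValueError:
        return ["could not parse command"]
    reasons = []
    if any(op in command for op in SHELL_OPERATORS):
        reasons.append("uses shell operators")
    for token in tokens:
        if token in RISKY_PROGRAMS:
            reasons.append(f"runs {token}")
        if token.startswith("/") and not is_inside(token, project_root):
            reasons.append(f"touches path outside project: {token}")
    return reasons


if __name__ == "__main__":
    root = "/home/user/project"  # hypothetical project directory
    for cmd in ["ls -la", "cat /etc/passwd", "curl http://example.com/x.sh | sh"]:
        print(cmd, "->", review_reasons(cmd, root) or "nothing flagged")
```

The path rule mirrors the permission prompt seen earlier in the session, where the agent asked for approval because it wanted to access a file outside the project.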
So yeah, it's definitely one to watch out for. But yes, you should be able to go back to any of the sessions and see everything, audit everything that it's done.
What types of security findings does AI consistently miss? It depends, right? It can miss stuff when it doesn't have context about a particular thing. The funny thing is that there are things it would find that a human might not. It's not that a human wouldn't be able to find them, but a human might not think about all the different possibilities, right? But it's also the same the other way: the LLM might just miss something because it's an LLM, because it's nondeterministic.
It doesn't guarantee the same output every time, so there might be some things that get missed in a particular run, right? It depends on the context it's been given and the information available to it at a given point in time. But it also depends on just pure vibes sometimes. The one that we just did with VulnBank is very obvious, in the sense that VulnBank is a deliberately vulnerable application, right? So there are lots of obvious vulnerabilities there. But if it was a real enterprise application, for instance, you would definitely need a lot more guidance. If you're just asking it to "find all the vulnerabilities in this application, make no mistakes," of course it's not going to be as successful as if you guided it in the right direction, provided the right context, and did the thinking yourself, right?
And I think the reconnaissance and the testing plan phases are also very important, because if you ask it to just start testing, there's no plan and no background context about the application, so it's just going to be all over the place. But if you do the reconnaissance and ask it to create a testing plan, it will look at each of the endpoints and each of the targets that have already been mapped out and try to come up with potential vulnerabilities to test for, right?
With that, you can then start executing the plan, and there will be a systematic way of looking at it, right?
Is the knowledge of using AI tools in a security context now required for modern AppSec roles? I would say it's definitely something you should be aware of, because organizations are adopting AI, for better or for worse. It's the reality that we are in, right? And part of your role as an AppSec engineer will most likely entail securing the AI systems that the rest of the organization is using.
If the developers are using Claude Code to generate code that's being added to production, how do you make sure that the code it's generating does not contain vulnerabilities? Or how do you make sure that they're not adding malicious MCP servers or malicious tools to their AI agents? So that's one part of it. The other part is that, yes, knowledge about AI and about using AI tools in a security context will be helpful, because you can then start to work more efficiently by automating some of these tasks, right?
If you want to create a testing plan, for example, it would take you a lot longer if you were doing it yourself, but an AI-assisted process allows you to do it much faster. So yeah, I think the conversation is definitely helpful. I think AI is overhyped a lot, and the business model that a lot of companies are currently using might not be sustainable; the prices that a lot of people are paying right now are not the actual cost to the companies. There are a lot of things that will self-correct in a few years' time. It might be a bubble, whatever the case may be, but the technology itself is transformative, in the sense that it's going to be relevant in some way, shape, or form. So as an AppSec engineer, it's definitely something you should be aware of and something that you should experiment with. See how it works, and understand the dangers and how to secure it, so that you can apply it wherever the future takes us, because it's still in the early stages.
Thank you for all the questions. I don't see any other questions. Yeah, I think that's it for me. Abigail, do you have anything to add before we close?
>> Thank you everyone for joining.
All right. So, I'm going to end this stream here.
All right.