This project effectively demonstrates how LLMs can democratize FinOps by translating complex cloud configurations into clear, actionable cost-saving strategies. It’s a pragmatic application of AI that addresses real-world operational waste beyond the usual chatbot hype.
Deep Dive
Prerequisite Knowledge
- No data available.
Where to go next
- No data available.
Deep Dive
AI DevOps Project #2 - AI Cloud Cost Detective (Reduces cloud billing)Added:
Hello everyone, my name is Abhishing and welcome back to my channel AI cloud cost detective. Yes, we are going to build an AI cloud cost detective in today's video. Just imagine this, you have an AI utility. You build that on your own. It connects to your cloud platform, looks at the resources on the cloud, looks at the configuration of the resources and instantly tells you how you can modify those resources to bring your cloud cost significantly down. A lot of times you misconfigure the resources or you overprovision the resources. But this utility can tell you what changes you can make and bring down the cloud cost.
Wouldn't that be awesome? Now you might say but Abishek there are some opensource projects like Carpenter. Yes these projects are good but they are restricted to certain cloud services.
For example Carpenter works well with Kubernetes clusters but with the help of AI what we are going to build today is something that works with most of the cloud services and it works with your virtual machines. It works with your S3 buckets, Kubernetes clusters, databases, volumes, anything. Most importantly, you are building this on your own. Let's say there is an open source project tomorrow. But still, because you made it on your own, tomorrow if there is a requirement within your organization, you just provide a prompt, modify the tool accordingly, and it is ready in minutes. Right? By the end of this video, I will make sure you have everything that is needed. And after watching the video, you can go back and build this by your own. Just like any other project on the channel, I've created a GitHub repository for this as well. The GitHub repository has architecture diagram. It has the request flow and most importantly the chrons that we are going to use in building the project. We will do this in five simple steps, right? Just to make sure everyone understands the project clearly. I'll break down this into five steps. First step, problem statement. I'll explain you what is the requirement. Why are we building this project? What problem are we solving? Second, I'll explain the tooling that we are going to use, what tools are we going to use to build this project. We are mostly going to use opensource tools. Then I'll walk you through the architecture diagram, a detailed architecture diagram so that you understand how each component is connected to other. Fourth, I'll walk you through the request flow so that you understand how a user's request flows within the platform. Finally, implementation.
Then I'll head to my AI coding tool.
I'll start building this application.
You can use your GitHub copilot. You can use cloud code. You can use cursor anything that you have access to. End of the video as I told you you will be building the project. Perfect. Now without any delay let's get started.
Let's start with the problem statement.
Abishek why are we building this AI cloud cost detective? Is it really important? Very valid question. Let me explain.
Majority of the organizations today, it can be startups, it can be midscale companies, large scale organizations, they are deploying their workloads on the cloud platforms today. From containers to virtual machines to S3 buckets, databases, volumes, everything is on cloud today. Cloud is definitely good. There are so many advantages of cloud. Cloud is very reliable. It's durable and most importantly cloud resources can reduce the management for you. But on the downside there is also a downside. If the cloud resources are not provisioned well it can significantly cost your organization.
Unfortunately this is what is happening with majority of the organizations today.
They expect their cloud billing to be let's say $20,000 by the end of the month. But what they see is $50,000.
Why is this happening? Maybe the junior DevOps engineer has misconfigured the cloud resources. Maybe did not follow the best cost optimization practices. Or maybe the junior DevOps engineer has overprovisioned the cloud resources. Provisioned more than required.
End of the day it will cost a lot for your organization.
Abisha what organizations do today they can't do much they just pay to the cloud platform whatever the cloud platform is charging them they pay to the cloud platform and as an afterthought maybe end of the quarter end of the year they just go to the DevOps team they ask the DevOps team to prepare a report which is usually called cloud visibility report and in the report they want to see which team is responsible for how much cloud cost in the organization and maybe as a future project they will introduce cloud cost optimization.
This is what happens in majority of the companies today. Now is it good? Kind of good because at least they are taking afterthought measurements. They are getting the report and in future they have steps to optimize the cloud cost. So it's kind of good. But what we will build today is an AI tool that will not act as an afterthought, but it can prevent organizations from huge cloud costing.
Instead of waiting till the end of the month and paying that $50,000 using this AI tool that we are going to build, organizations can reduce that to maybe $30,000 or maybe even $20,000 as per their expectation. Sounds good, right?
Abishek, how is this going to work? I'll just give you some very high level information now because we anyways are going to talk about architecture in step three. But how this works? The AI tool that we're going to build, it connects to the cloud platform. In today's video, we will use Azure cloud. Lot of you are asking for Azure. So, I decided to go with Azure cloud in the video. But it doesn't matter if you want to use the same project for GCP or AWS. It works as is only in the prompts that you're using. Change Azure to AWS or GCP. You don't have to change anything else.
So your AI platform connects to the cloud platform. It looks at the resources, all the resources within a resource group. It gets the configuration of the resources because it is AI large language models. It gets the information of the best practices.
So it compares your existing configuration with the best practices configuration and then it provides you a detailed summary like for S3 buckets this is a change that you can make for virtual machines this is a change that you can make in fact it can also provide you some commands if you need so this will act as a prevention mechanism and this is how it is going to work we will make it enterprise grade we will add authentication to it we will add storage to it we will add a very good database to it so that you can also see the investigation history. Perfect. So I hope at this step problem statement is clear. So the problem is that companies are paying a lot to the cloud cost uh cloud today because of misconfiguration or overprovisioning the resources. Our AI cloud cost detective can prevent this by connecting to the cloud by fetching the resources by fetching the configuration and comparing the configuration with the best configuration from the Azure documentation or from other resources on the cloud. It can also suggest the changes that the DevOps engineers need to make. Cool. Now before going to the architecture, let me quickly talk about the tool set. I know a lot of you are interested in this because you want to understand can I actually build this project or not. You can build it. For front end we are going to build a full stack application an agent application that has front end back end database everything using AI. For front end we will use React.
React is userfriendly. Users can get to the uh platform they can use it in a very easy way. If we build it using React then for the back end or for the programming language we will use fast API fast API is basically Python.
So using fast API we will fetch the resources from Azure.
Then for the database we will go with Postgress database. Postgress is very popular open-source database and it also comes for very less pricing. So we will go with posgress in the video. In one of our previous projects we used ins. If you want you can try ins as well that is also good.
Then for the authentication we want this platform to be enterprise ready. So we want to implement authentication as well. Even for authentication we will go with postgress. So we will store user information in Postgress.
And then we need model gateway. What is model gateway? We have to send like back end collects the information of Azure resources. We have to send that Azure resources to let's say to large language models. So we will be using open AI in this case. Don't worry if you're confused at this point. Wait for the architecture. Once we head to the architecture everything will be clear.
That's it. These are the tools that we are going to use. Of course for cloud I'm using Azure and for fetching the Azure resources that is for the data I'm going to use Azure CLI. Abishek why not go with Azure APIs. You can go with Azure APIs as well but the problem with the APIs is that APIs keep changing. You have to make sure you're using the right version you are using uh the right input to the APIs because it keeps changing down the line your maintenance with the application that you are building will also be more we want to keep the maintenance low so that's why I'll go with Azure CLI so this is the tool set that we are going to use in the video now let's get to the interesting part architecture explain the detailed architecture of the project that we are going to build. Why not? Let's talk about it. Firstly, users. Who are the users? It can be developers, DevOps inserts or your management. Anyone who wants to use this platform, they will just head to the user interface of the platform or we can say they will head to the front end of the platform.
within the front end the first thing that they see they will not see connection to Azure but they will first see the authentication we don't want everyone to use the platform we want to restrict who can use the platform maybe a junior developer should not use it or maybe a junior tester should not use it someone in the operations team maybe they should not use it so we will restrict the authentication to Once the authentication is successful, let's say the authentication is successful, then user would see an option to provide the Azure resource group. So either they can select the Azure resource group from the user interface or they will provide the ID of the Azure resource group.
Now once this is provided, this is the key part. This is where our back end comes into picture. That is our Python comes into picture. What our back end does? Back end connects to the Azure CLA. It will make use of Azure CLA and making the use of Azure CLA.
It will fetch the resources from the cloud. Port all resources are available in the Azure resource group. And once the information is fetched, it has to fetch a lot of information. So we will use multiple Azure CLA commands here.
Good part about the CLA commands over the period of time they don't change. If you look at Azure CLA command for fetching virtual machines today and maybe 3 years before the command is almost the same because CLA always maintains the compatibility 99%. I cannot say 100%. Okay. Once this resources is fetched because this is huge information this is sent to AI.
This is where we are going to use open AI API key.
If you want to reduce the cost, you can also go with open router API key. But I also I mean but I already have open AI API key. So I'll go with it. It doesn't matter. If you have anthropic API key, go with it. If you have open router API key, go with it.
What AI does? It performs the analysis.
Right? the critical part of our entire architecture. AI performs the analysis.
AI looks at all the cloud resources. It looks at the best configuration for those cloud resources. And once the analysis is done, it provides us with a summary. Like it tells the user, hey, these are the steps that you could have done better. You have configured Kubernetes cluster with six nodes.
Probably you only need three nodes. You have created virtual machines with a particular instance type. You don't need that instance type probably you can go with a lesser instance type and along with the summary it can also provide the steps and this is shown to the user back on the front end or on the user interface again the same path information is stored within the database. This analysis is stored within the database where we are using the postgress database and from the database back and fetch the information and shows it back to the user. This is what happens end to end in the application agent that we are going to build or the agent application that we are going to build. Let me quickly explain this one more time. This is the important part. Please try to understand. First users talk to the front end. Front end users cannot connect to the cluster or to their Azure cloud platform. There is an authentication in place. We are using Postgress for this authentication.
If the user is authenticated, user can provide the resource group.
This resource group name of the resource group is taken by the back end where we are using Python or fast API.
Back end makes use of Azure CLI. It connects to the Azure cloud platform and it will fetch information from the Azure cloud platform.
This information is sent to AI where I'm using open AI API key but if you want you can go with open router API key as well. Then this information is sent to analysis.
This entire analysis is summarized and shown back to the user. But to keep the history of the information, to keep the history of the investigation, maybe I ran this investigation on 29th of May.
Someone else also want to see it instead of running it again. If the platform provides investigation history, they can just go to the history board or they can just go to the uh previous investigation and see what has happened. So they don't have to run this entire thing again.
And yeah, you can find the same information in our GitHub repository as well. If you head to the readmi file, you will find architecture section where you can find the same thing that I explained on the whiteboard. Probably little more detailed. Along with that, you can also find request flow section.
Abishek, why do I have to write this in the readme file? There is an advantage.
See when we are planning to build an application using AI assistance it can be cursor it can be GitHub copilot cloud code or even open cloud with local models I mean using Olama when you provide architecture diagram and when you provide request flow there are chances it can build your application in the very first run when you provide prompts of course I'll walk you through the prompts we are going to provide very detailed prompts as well. But still there is a chance that your AI coding assistant might get your requirement wrong. But when you put the information in this way in the architecture diagram in the request flow, there are very less chances that it can go wrong. That's why I've documented this information in the GitHub repository.
Perfect. So now we understood the problem statement. We understood the tool stack, the text stack that we are going to use. We are clear with the architecture diagram. We are also clear with the request flow. So what's next?
The next step is implementation.
But before we implement, there are two prerequisites in this case. One, you need an Azure account. In my case, I have created an account and I've created two resource groups here. Let me increase the font.
So you can see this is my Azure account.
I have created two accounts here. I mean two resource groups here. The payments request uh payments resource group where I have bunch of resources in here. For example in the payments resource group I have a engineext virtual machine. Then I have networking related to the engineext virtual machine. I have some premium disk. I have orphan data disk. Along with that I also have a storage account.
Similarly even in other resource group I have created bunch of resources. If you look at the middleware resource group I've created storage account here as well. I've created a bunch of virtual machines the networking of the virtual machine and I've created some orphan discs. Basically some disk that are not used by the virtual machines.
So our Azure account is ready. You can also set up something like this. You can ask, let's say you're using AWS, you can ask chat GPT to give you some AWS CLA commands and create some infrastructure on AWS. Or if you have expertise, go ahead create some resources, create some Kubernetes clusters, make your cloud platform ready because end of the day, we want to play with the cloud environment. AI cloud cost detective. It works better when you have more resources on the cloud. Only then it gets the opportunity to look at the resource configuration see what you have configured wrong and then it can detect and give you the suggestions. Perfect.
So our Azure cloud environment in my case is ready. Now the second thing we need open AI API key. Even I have that thing ready as well.
I use OpenAI API key. So I have it ready. All that you have to do like let's say you want to use OpenAI API key. You have to get to platform.openai.com.
Then you have to get to the API keys.
Click here. Create a new secret API key.
Provide a name. I'll say um let's say let me call it detective.
Choose a default project or you can create a project. Then click on create secret key. Once you have this key, copy it and paste this key within the env file of the project. Basically, you can clone this GitHub repository within this GitHub repository. You can create av file and paste your OpenAI key there.
Perfect. I'll create a dummy.env file here so that you understand where exactly to copy the Open AI API key.
Abishek I don't have open API key it's okay you can go with anthropic API key or open router API key which is much cheaper in the previous project I used open router API key that's why this time I'm using open AI API key that's it there is no other specific criteria cool so both our prerequisites are also ready now let's head to cursor cool so I have a cursor subscription So I'll be using it. I'll go to the agent section and I will start with the first prompt here. I've cloned the repository.
I've created file as I told you. This has my open secret. I'll start with the very first prompt. Go to the GitHub repository. Can you guess what will be the first prompt? The first prompt is to create the Python back end. Right? What is the core piece of our entire architecture?
If you look at this entire architecture, the core piece is the Python back end.
So, Python back end will take the resource group as input. Then it will make use of Azure CLA. It will fetch the resources from the Azure cloud. So, if you get this piece working, right? If you can get this piece working, then rest of the things are little easier.
Okay?
Let me show you how I wrote this prompt.
So first of all, I explained create a Python fast API back end in backend folder. So I'm asking it to create a folder in the cloud cost detective project.
Okay, that's fine. A fast API server with post/analyze input endpoint that accepts resource group. So this is what I told you the Python back end should basically accept resource group as input. User will provide this information from the front end. Then I said get/ API resource group endpoint that returns list of Azure resource groups. Okay, this is fine. Then use the Python subprocess module to run Azure CLA commands. You understood why I wrote this here in the prompt? because we want to use Azure CLI. If you don't put these lines here from here to here, if you don't put these lines in the prompt, by default, your AI coding assistant that is cursor would go with Azure APIs and Azure APIs can constantly change. Maybe two months down the line, somebody will report that your project is not working. CLA is more compatible.
Then the basic things like enable course uh include requirements.txt and this has to be the structure of it. This is the line that plays the critical role.
Abishek why did you add architecture.md and requestflow.mmd? I just explained even in the prompt I'm explaining within the prompt refer to architecture.md and requestflow.md.
This is why I added these md files.
Imagine if I don't provide this and if I provide this entire prompt there is 10 to 20% chance that it might create the Python back end for you but not as per your requirement. When you provide this then your chances of efficiency will increase by 10 to 15%. Cool. Okay. So I'll take the raw mode. I'll copy this entire prompt and paste it on the cursor. I'm using Opus 4.6 high. If you want, you can switch the model. Composer 2.5 is also good enough. It will consume less amount of tokens. Let me switch to that. Okay. Now, I'll provide this prompt and let wait for 2 to 3 minutes because this is going to take time. It has to install Python. Then it has to download the required dependencies. It has to find Azure CLA on my machine. So let's wait for 2 to 3 minutes here.
Nice. So this is done. In fact, it took less than 2 minutes. It created the back end where it created main.py, azure scanner.py. It also created cost detector.py.
Let me quickly walk you through these files. main.py the primary entry point.
Then you have the cost detector.py. This is the actual file. I mean the core file. So what it does it lists out the resources and things that our cloud cost detective is looking at or detecting.
What are the things that it is detecting? One oversized virtual machines. So it is trying to see if there are any Azure virtual machines on your cloud platform that are oversized or overprovisioned. Maybe requirement is only two CPU but it is created with 5 CPU.
Orphan disc that are not connected to the virtual machines.
Stale hot blobs. Blobs are I mean if you are aware of Azure blobs are like S3 geo redundant storage redundant storage accounts unused static public IPS. Wow.
This is something that even I did not expect. Probably if I'm writing something like this I wouldn't have thought of unused static public IP. This is a good one. Why? because lot of times you create public IPs uh especially if you're working on enterprise and sometimes you delete the virtual machine or you move to Kubernetes you forget about that public IP that you have created of course it does not charge a lot but still in AWS Azure and GCP elastic IP or the static IP they are charged so even the one that we are building is looking at any stale or unused static public IP load balancers without back end. Good permissive NSDS oversized OS disc same thing like just like oversized virtual machines oversized tests unused file shares like file shares are like EFS on AWS Abishek what if I want to add more okay these are 10 good things but I want something like app services no worries the logic is here what you have to do just create a new line oh it also I mean uh a also picked up app services. very surprising but anyways like if you want to add app services just add the flag here then follow the implementation for example you can just follow this particular implementation uh detect permissive energies see how is this function written or definition written you can add the same thing for app services as well I'm not good at python then don't add this flag instead just head to cursor and add a prompt here enhance the agent application to also detect stale app services, right? Or issues with app services. Similarly, if you want to add uh storage accounts or anything more, you can go ahead and do it. It's that simple. And this is the flexibility I was talking about. If you go with an open source project, if you're looking for a feature, they might not have it. And if you want them to add the feature, you have to create a support ticket. You have to wait for them. But in this case you can add any feature in matter of minutes and the complete control is with you. Good.
Okay. Then in the Azure scanner I think you might guess it. What is this doing?
This is the file that is connecting to your Azure console using Azure CLI and it is getting the resources. Basically this is getting the resources on the Azure cloud platform and the cost detector is trying to detect some information but end of the day okay this is the complete Python back end but end of the day without AI analysis is in incomplete now let's add the analysis part how do we add the analysis part go back to the repo in the repo just go back to the prompts folder you will find open AI analysis go to that open AAI API integration for cost analysis.
Build on top of existing fast API back end. By the time you're running prompt 2, the prompt one has created the fast API back end. So that's why build on top of existing fast API back end. Add AI powered cost analysis using open API directly.
Create AI analyzer.py module. Take the list of Azure resources that we get from Azure scanner.py. As I explained you here, Azure scanner.py gets the resources within the resource group. Now AI will scan those resources.
Build the prompt asking AI to analyze the resources for overprovisioning.
Yeah, the same thing. Whatever we have in the detection file, detect py call the open API chat. Open AI chat. we are asking it to use the GP40 model. If you want you can go with uh GPT 5.5 or any other model. In fact, you can ask it to use the uh different models. So, OSSGP model as well depending upon the tokens that you want to consume.
Then the basic information again even if you look at the prompt end of the prompt I have provided the same thing refer to architecture.m MD and request load.md. If you look at any file, you will find I mean any prompt you will look at this line at the end.
That is the reason why the efficiency is more in my case. In the very first prompt, cursor was able to generate the back end as per my expectation because of this line. The same thing will happen with prompt two as well. Okay. Now let's go back to cursor.
Either we can create a new agent or we can continue with the existing one.
anything works. Okay, I'll close these agents. I'll click on a new agent. And here I'll say just paste the prompt.
Now this is also going to take another 2 to 3 minutes. Uh this time it has to just collect my or understand where is my open AI key, the API key. Then it has to uh frame the prompt. Once it is ready, I'll walk you through the Python file as well.
This is also done. So I can see AI analyzer.py is created for us. This is getting the information the resource group information.
It is creating a system prompt and it's sending the resources to the GPT4 model as expected because we provided GPT4 in the prompt. If you want to change change here. So till now what did we implement?
We created the fast API back end. So fast API back end takes the resource group name. It detects the resources using detect cost flags the 10 flags that I shown you. It sends the information for uh to the open AI models for analysis. This is what we implemented till now. What are the missing pieces? Let's go back to our architecture. So here we implemented these things. the back end we implemented open a analysis before we head to the front end uh before we head to the end to end integration one piece that's left the posgress implementation posgress in this case is performing very interesting things for us first see we don't want users to run this platform very often like let's say we hosted this AI cloud cost detective 5:00 Somebody uh someone provided a resource group and they got the information then 5:15 if other person runs for the same resource group what's the point nothing is going to change in 15 minutes or nothing is going to change in 1 hour of time but this will only waste AI tokens for your organization or AI is going to charge your organization. So what we will do instead of this we will maintain the database which stores investigation history or it will store the past detections. So at 5:15 if someone wants to run they can just go to the history see if someone executed it today for a specific resource group. If they have executed the detection for the specific resource group today, they can just view at the history.
Along with that, whenever user runs the cloud cost detective, they should see the realtime progress. Right? For example, this is taking 2 minutes of time. You should show something on the screen that it is currently detecting virtual machines, currently detecting stale blobs, currently detecting the static IPs. So user has to see some progress on the dashboard on the user interface and finally authentication. Authentication as well as authorization. This is also important. Right now we will only implement authentication.
All these three things will be done by Postgress.
So this is the pro prompt for Postgress.
If you go through prompt three, Azure Postgress websocket empty. Let's see what is in it.
Build on top of existing fast API packet. The same thing whatever we have till now. Add Azure managed postress for storing user analyzing history and fast API websocket for live progress updates.
Same thing I just explained but we are using Azure managed Postgress service.
What is the database to use? I mean I'm providing little more information. Go for Azure managed Postress connect to this. Then I'm asking it to store the DB connection string in ENV where I'm already storing open AI API key. Then I'm asking it to create two tables. I'm providing the table information. Abishek in the last project we did not provide table information. We did not provide database schema. Why are we doing it in this project? So if you remember in the last project we used ins. Ins is basically a native back end but posgress is not designed as an a native back end.
So we have to provide more information here. That is what I'm doing here. And also I'm explaining in term of realtime progress I'm telling AI I should see something like this. When user runs my cloud cost detective they should see what is the current status. For example fetching resource group, scanning resources, analyzing cost, storing results. If it is going to take 2 minutes, user is engaged in the meanwhile and user knows that my agent application is not struck.
Just provided an example of env.ample and finally refer to architecture.md and requestflow.mm.
Let's go to the raw mode. I'll copy this entire thing and I will paste it here.
Okay. Now, let me just put that information here.
As expected it took little more time but you can see prompt three implementation is also complete. It created a db.py file progress.py and o.py file individual python files. In the db.py you can see uh there is a table. This is the analysis table. This is the users table and the logic to connect to the database. The postgress database. The post plus connection string is as part of the envample file or the env file.
Okay. So this part is also complete. So we have our postless database also set up. So all this information is complete.
Now we are only left with front end implementation and front end connection with the back end. So two different prompts. In the first prompt we will create front end. Let's see if we like the front end, we will use it in the first go. If it does not develop the front end, well, if you're going with AI agents like Lovable or other solutions, you can be rest assured front end will look good. But because we are going with cursor, I doubt in the first go if the front end is good, we'll use it or otherwise we will modify the front end and in the next prompt we will establish the connection between front end and back end. that will complete our agent application. Let's do that. So we will go to prompt four.
So prompt four is to create react front end. Okay.
Create the react front end in the front end folder just like the back end folder and make use of JWT as we discussed right. Authentication will be implemented using Postgress and that will be using Java web tokens. And here we are saying set up using white. White is the server for running front end react and typescript. For CSS we are asking you to use the tailwind authentication specifically using custom Java web tokens. I provided little information on how to use the authentication. For encryption I mean the end points for authentication like signup page, login page and for encryption I'm asking you to use brypt.
And these are the different pages on the front end like the login page, dashboard page, analysis report and the history whatever we discussed. These are the four pages on the front end. If you want you can modify it like if you feel like abishek I want everything in one single page. I don't like the multi-pages then you can change it like put everything under one page. It's up to you. And finally refer to the architecture.md.
Cool. Let me copy this and paste it in the cursor. Front end will not take a lot of time. Probably in 2 minutes we should have the front end developed for this application. Let's see once it is done we will track the progress.
So the front end is also complete. You can see JWT is also set up. I mean authentication is also complete. But I doubt it will work at this point because although we created the front end for AWT to work, I mean JWT to work the Java web tokens and authentication, it needs a connection with the Post database. So we still have to implement front end integration with back end. So my bad it might not work. I mean front end is created I'm sure but front end will not be able to uh connect to the database at this moment. So let's also go ahead and run our prompt five. Integrate front end and back end.
We cannot review front end at this point. So let's complete this. Integrate front end and back end. Basically end to end. What are we saying here? Connect react front end with fast API back end via websocket for live progress authentication for all the routes and display the history. more information here like the API integration information, websocket history information and I'm also asking it to perform a final test so that we don't have to go see something is wrong.
Instead it can perform the final test and see if everything is looking good.
Cool. So let me run this and once this is complete then instead of just reviewing the front end let's review the end to end application. Cool. And again this is a very simple thing because for front end and back end integration it just has to use an existing SDK and with the SDK is it will buy up the things postbased database is already available looks like it is done I can also see on the terminal so there are two terminals I can see the server is running I hope it is done let's see so we'll go back to the same thing I'll provide again the same uh email address. Let me provide the password for it. Okay. Now we have the front end. Front end looks very minimalistic. Uh here is the email address log out option dashboard. It also has the history and on the dashboard it provides a toggle button where I can or drop-down where I can see different resource groups. Same thing on my Azure cloud. Right now I just have three resource groups. Let's select one of the resource group. We will go with the payments resource group and see.
I'll quickly run the analysis.
Okay. Websocket connection failed.
Ensure that the back end is running on port 8,000.
Is the back end running? Oh no. Okay. So we can clearly see here. We'll resume when background work finishes. Looks like there is some issue with back end.
Even here I can see the error. We'll just paste this error. Uh we don't have to debug this. I'll paste it here or the error that I have here on run analysis.
Let's see if AI is going to fix this for me.
I hope it should be a straightforward issue.
It fixed the issue for us. It says uh the API is now running at uh 8,000.
Let's see one more time. We will test it. I'll refresh.
I'll select the resource group again.
Payments a run analysis. Perfect. So it started running the analysis. Analysis is in progress. Fetching the resource groups. Scanning the resources.
Analyzing the cost with AI. Storing the results and analysis is completed. So these are the uh things at the moment.
Okay, it took like 10 15 seconds but I can see this is the analysis report. Uh it shared the resource group resources scanned at 10 issues founder six estimated monthly savings is $200 to $500. See this is a big thing. I just created 10 resources on my Azure cloud.
It scanned the 10 resources. It figured out there are six issues. It provided a summary as well. The payments acquire resource group has several high se high high severity cost issues including oversized VMs orphendes and multiple things. Here is the summary of each issue and you can see there is a command and there is estimated saving as well.
So here this is the first one oversized virtual machine which is on payments engineix virtual machine. It is saying right now we are using standard D4 ASV7 payments EngineX virtual machine runs on okay this one with four uh vCPUs but appear suited to a lightweight workload engine simple API and web proxy so just for engineix and simple API we created four CPUs it is saying you don't need it you can just go with standard B2S let's see if it is true so I'll go to my Azure cloud platform Let me put this here.
Okay. Let's see what is it. What it is saying is true. I'll go to payments aqua uh resource group. I will go to the virtual machine that is talking about payments engineext VM.
Payments enginex VM. Okay, this is the one. What is the configuration of it?
So, its current configuration is D4 ASV7. Exactly what it mentioned. And it says we are underutilizing the resources. Let's see. I'll go to monitor.
What is the CPU utilization?
Oh yes. So you can see the CPU utilization is very very less.
Percentage of CPU that is used and even the memory utilization is very less. So that's why it recommended us to move to standard B2S. I think this comes with one CPU. Perfect. Very good option. Uh very good cloud detection, the cost detection. Then it says orphan disk even with payments orphaned uh disk 2 you are going with premium LRS but it says you can just go with data disc 2 you don't have to go with the premium t and here you can save 20 to$50 US let's say again if it is true payments orphan disc 2 I'll go back in the same resource group I should see uh payments are dis two okay this is the one And it is using the tire. Let's look at the tire. Premium SSD LRS. What it mentioned is true. It is of size 256GB, but our utilization is almost zero. So, it is saying you don't need this premium LRS. Very good. Again, a good suggestion. Similarly, uh it talks about unused static public IP addresses. It says you have this static public IP address payments unused pip. You can just remove it. Okay. And this will save $5 per month.
Overlay permissive NSG. And it also provided this is high severity. This is medium severity. Additional resources.
It also provided some additional resources that we can go through. So our analysis report and our AI cloud cost detective is working as expected. And I'm happy with the user interface as well. I thought a cursor will be generating something which does not look good. It happened to us last time but this time it generated pretty well and I'm happy with it. Now let's see one overall like end to end because we tested this functionality in multiple levels and we gave it multiple prompts. Now that our agent application is ready, let's view it in one end to end go. I will sign out of this application and now let's see end to end flow.
So on my Azure cloud platform I have two resource groups and in each resource group I created multiple resources which would cost me around 50,000 rupees but most of these resources are misconfigured. I mean these resources are configured in cost inoptimized way.
Most of these resources are overprovisioned.
Let's see we build this AI cloud cost detective if it will detect the overprovision resources and it will help us in saving the money. So I just signed in this is a cloud cost detective dashboard. It already looked at different resource groups I have. It figured out I'll select one resource group. Click on run analysis. It is fetching the resource group information.
It is scanning the resources. It is analyzing the cost with AI. It is storing the results in the Postgress database. And finally, the analysis is complete. It will provide us with the detailed report. This is the analysis report. So, it says we have 10 resources in the resource group. It found six issues and it can save $200 to $500 just if we follow the steps that it provided. First of all, it identified the virtual machine is overprovisioned.
It says run this command so that you can switch to standard B2S. Then it says there is an orphan disk. Delete the disk. It says there is another orphan disk. Go ahead and delete it. It says there is one static public IP address that is unused. Also delete the static public IP. It also provides information about overprovisioned NSG and oversized OS risk. It also has the information where if I click on the history, these are the previous analysis reports. If someone wants, they can go here and look at the previous analysis.
Cool. So this is our AI cloud cost detector. I hope you liked it. You can build this for AWS, Azure, GCP.
Everything remains the same architecture and request flow as well. So go through the GitHub repository, go through the prompts that I have provided, just modify it accordingly for AWS or GCP. If you like this video, please share it. If you have any questions, do let me know in the comment section. Even if you find it good, let me know in the comment section. See you all in the next video.
Take care.
Related Videos
Agentforce NOW AMA: Build with React and Salesforce Multi-Framework
SalesforceDevs
490 views•2026-05-28
How agent o11y differs from traditional o11y — Phil Hetzel, Braintrust
aiDotEngineer
450 views•2026-05-28
WEB TECHNOLOGIES UNIT-2 | Degree 4th sem BCOM Computers web technologies unit-2 full explanation💯✅
LearnwithSahera
1K views•2026-05-29
More tests are always better? How to use AI to identify tests that bring little value
Alliance4Qualification
335 views•2026-05-29
Search Algorithms Explained in 60 Seconds! 🤖💨
samarthtuliofficial
218 views•2026-06-01
People of Game of Thrones using JavaScript DOM
AltCampus
296 views•2026-05-30
Introduction to Problem Solving Part - 1 | Lecture 1 | Intermediate DSA
ascensionix
107 views•2026-05-29
So What's Odin Lang Even Good For
TechOverTea
131 views•2026-06-01











