This systematic framework effectively mitigates the risk of subtle AI hallucinations by transforming blind trust into a structured auditing process. It is a pragmatic necessity for high-stakes tasks where the illusion of correctness is far more dangerous than an obvious error.
Deep Dive
The Most Dangerous AI Answer Is the One That's Almost Right
The most dangerous AI answer isn't the one that's completely wrong. That one's easy to catch. The dangerous one is the answer that's almost right. The writing is clean and most of the facts are correct, but some of the claims aren't true.
If you're new here, I'm Dylan. I run an AI consultancy and most of my coaching clients deal with this exact problem.
So I'm going to walk you through a four-step process for catching those hidden errors, because simply asking AI, "Are you sure?" doesn't actually work. So let's get into it.
Now here's a simple example of what my clients experience when they run this process to identify where the AI may have slightly lied or misrepresented something from a source.
So in this example, you can see the AI is extremely compelling in what it wrote. The company grew revenue by 18% last year. Growth was mainly driven by enterprise customers. Churn sits at 6% for the period. The sales team became more efficient and the cash position remains healthy at quarter end.
Now all this sounds good. Even the percentages are correct. But the issue here is when you look a little bit closer and you run this process, you realize that some of these claims, the ones that are highlighted, are either incorrect or misrepresented.
And the only way you can know this is if you're extremely close to that specific activity, so you can spot the problems yourself, or if you run this process with the assistance of AI to find those hidden claims that have no proof.
And this summarizes the entire process. Instead of eyeballing the answer yourself, you can have the AI extract all the claims one by one, have it validate them against the source, and then spot check afterwards.
Before we get into the four-step process, I do want to call out that you don't have to use this all the time. 90% of use cases don't need this detailed audit process. The reason is that it's really optimized for tasks that are high stakes, whether financially, legally, or otherwise. Here are a few examples that I commonly see. Maybe you're reviewing a contract and you want to make sure that what the AI pulled out and the advice it gave are valid.
Another one is you're doing due diligence on a vendor or a company you want to invest in, or there's a vendor proposal that you're drafting or reviewing and you want to make sure it's valid based on the comparisons the AI has done. All of these have significant financial or legal stakes, maybe even brand reputation for your company, and you want to make sure that whatever the AI gives you back is valid. So the moral of the story is don't run this for all your use cases, because that would just be a waste of time.
Quick pause in your regular programming. This video is brought to you by me, as always. Two quick things. First off, below is a 30-day AI insight series, completely free. You'll get 30 insights in your inbox so you can apply AI to your business and your work. The second thing is if you'd like to work with me, below are a series of offerings to see if there's a good fit between the two of us. Now, let's get back to the video.
Now what is the four-step process? I'm going to give you a quick overview here, and then we're going to walk through each one of these steps in detail. The first step of the process is simply finishing whatever you're doing with the AI. So you're trying to create something with the AI, either a document, an Excel sheet, a PowerPoint, or whatever else.
Finish that and get it to the point where you're happy with it and would send it off. Once you've completed that, you go to the next steps, and every one of them is going to be a fresh conversation with a new AI. The reason is that we don't want bias leaking from one step into the next, because the context of the previous conversation can sway the AI's judgment when it's splitting, checking, and so on. So make sure you're starting a new conversation each time.
In the second step, we're going to split out the claims. In the third step, we're going to check the claims against the source information. And finally, we're going to rewrite the piece with that understanding.
One thing I'll call out here: for really advanced use cases where the stakes are extremely high, you can use a different model for each step. So maybe for the finish step you use Claude Opus 4.7, for the split step GPT 5.5, for the check step Gemini 3.1 Pro, and then for the rewrite step you go back to Claude Opus 4.7. The reason that could be useful for these really high-stakes tasks is that each AI has its own strengths, weaknesses, and biases. So if different AIs do different steps, there's a higher likelihood you'll catch things that a single AI wouldn't. But again, this is for the extreme use cases, so like the 1%.
All right, let's move on to the first step, which is finishing the write-up, or creating whatever it is you want the AI to create for you.
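The overview above can be sketched in a few lines of Python. This is only an illustration under assumptions: `ask_fresh` is a hypothetical stand-in for whatever client you use to open a brand-new conversation, and the prompt strings are shortened placeholders rather than the full prompts described in this video. The point it shows is that each step only sees what you explicitly hand it.

```python
from typing import Callable, List

# Hypothetical: takes a prompt, opens a NEW conversation, returns the reply.
AskFresh = Callable[[str], str]


def run_audit(write_up: str, source: str, ask_fresh: AskFresh) -> str:
    """Run split -> check -> rewrite on an already finished write-up."""
    # Step 2 (split): this conversation sees only the write-up.
    claims = ask_fresh(
        "Break this write-up into small factual claims.\n\n"
        f"WRITE-UP:\n{write_up}"
    )
    # Step 3 (check): this conversation sees only the claims and the source.
    audit = ask_fresh(
        "Check each claim against the source material.\n\n"
        f"CLAIMS:\n{claims}\n\nSOURCE:\n{source}"
    )
    # Step 4 (rewrite): this conversation sees the write-up and the audit.
    return ask_fresh(
        "Rewrite the original write-up using the audit results below.\n\n"
        f"WRITE-UP:\n{write_up}\n\nAUDIT:\n{audit}"
    )


# Stand-in model that records what each conversation was shown.
seen: List[str] = []


def fake_ask(prompt: str) -> str:
    seen.append(prompt)
    return f"reply {len(seen)}"


final = run_audit("Revenue grew 18%.", "Annual report text", fake_ask)
print(len(seen), "separate conversations")
```

If you wanted a different model per step, one option would be to pass three callables instead of one, but that is a design choice this sketch leaves out.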
And this is simple. You're going to go through the process of creating that specific artifact. It could be an Excel sheet, a PowerPoint, a document, whatever it is. Through this process of creation, you're going to iterate until you're happy with the output. Once you've gotten to the point where you're happy with the output and you want to ship it to somebody, pause and ask yourself: is this a high-stakes task? If so, should I run this through the audit process? If the answer is yes, then and only then do you run the audit process, not before, not after.
Which takes us to the next step, which is taking that output from the AI and breaking it down into the foundational claims. And here's a simple example of what that looks like. In this sentence, we're saying revenue grew 18% year over year, mainly because the company added more enterprise customers. These are two foundational claims, so we're going to split those out into their core components.
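To make the split concrete, here is a small sketch of that one sentence broken into two separately checkable claims. The `Claim` type and the exact wording of each claim are illustrative assumptions, not output from any particular model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    number: int
    text: str  # one fact that can be checked on its own


original_sentence = (
    "Revenue grew 18% year over year, mainly because the company "
    "added more enterprise customers."
)

# One sentence, two facts: the size of the growth and the cause of it.
claims = [
    Claim(1, "Revenue grew 18% year over year."),
    Claim(2, "Revenue growth was mainly driven by adding enterprise customers."),
]

for claim in claims:
    print(f"{claim.number}. {claim.text}")
```

Splitting matters because the two claims can get different verdicts: the number may be in the source while the stated cause is not.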
That's what this AI is doing. And this here is the prompt that you can copy and paste and use yourself to do this because the AI is going to take the claims out for you.
So remember, this is going to be a fresh conversation after we finish the first step. And this prompt is very straightforward. You can add complexity as you see fit, but the way that I set it up is at the very beginning, we're telling the AI, "I want you to break this write-up into small factual claims."
A claim is one fact that can be checked.
I then mention it's important that you list out all the factual claims only.
Include advice, tone, and wording choices only where they contain a factual claim. And then down here, I tell it what the table should look like. The output format is a table with three columns: the claim number, the exact claim, and the source it was pulled from that can prove it. And then at the very end, all you have to do is copy and paste or attach the output from the last conversation for the AI to process.
With this step, and probably with all the steps, you want to use a high-end model. So use the highest-intelligence model you have access to, such as Opus 4.7, GPT 5.5, etc.
Once we've split out all the claims, we move on to the third step, which again is a new conversation. We're going to take those claims and have the AI check them against the source. And when it does the checking, it's going to sort each claim into one of four categories.
The first category is supported. If the source contains the claim and proves it, you label it as supported, and the associated action is that we keep it.
The next label is conflicts.
If the extracted claim conflicts directly with what's in the source, we're likely going to replace it with the source information.
After that, we have no proof. If the AI finds that the source document doesn't include this claim, we're likely going to remove it. And then finally, we have needs human judgment.
This one's important, and it's often overlooked. We want the AI to flag claims that could be a prediction, or that depend on context in the marketplace that the source document doesn't have but the human likely does.
It needs to flag these because you as a human need to decide whether to keep the claim, remove it, or update it. And those are the four categories.
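The four labels and their default actions can be written down as a small lookup, which is handy if you ever post-process an audit report yourself. This is a minimal sketch: the `Label` enum, the `ACTIONS` table, and `action_for` are names made up for illustration, and the label spellings are assumptions about how your audit output is worded.

```python
from enum import Enum


class Label(Enum):
    SUPPORTED = "supported"
    CONFLICTS = "conflicts"
    NO_PROOF = "no proof"
    NEEDS_HUMAN = "needs human judgment"


# Default action for each label, as described above.
ACTIONS = {
    Label.SUPPORTED: "keep",
    Label.CONFLICTS: "replace with what the source says",
    Label.NO_PROOF: "remove",
    Label.NEEDS_HUMAN: "flag for a person to decide",
}


def action_for(label_text: str) -> str:
    """Map a label string from the audit report to its default action."""
    # Label(value) looks the member up by value; unknown text raises ValueError.
    return ACTIONS[Label(label_text.strip().lower())]


print(action_for("Conflicts"))
```

Raising on an unknown label is deliberate in this sketch: a label outside the four categories is a sign the audit output needs a human look rather than a silent default.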
And this is what it looks like inside the actual prompt. Again, you can copy and paste this and use it for yourself in this phase. It's going to be in a new chat with a high-end model. At the very beginning of the prompt, we're simply telling the AI that its goal is to check all the claims against the source material that I provide. I want it to treat that source material as the full evidence and use nothing else. When it does this, it's going to use the four labels we just covered to tag each claim against the source information. And then below this, we tell the AI what we want back from it in the audit report.
We're stating that each claim needs to have its label, the exact source line or short quote that supports that label, and a one-sentence reason for why it was given that label, so I can quickly spot check it myself. After that, all you have to do is provide the claims and the source, either as attachments or pasted in directly, depending on how big they are.
After step three, we have an audit report that includes all the claims, the associated labels, and the source information. Now it's time to take that audit report and rewrite our original piece with this information. So again, we're going to start a new conversation and copy and paste this prompt into it so the AI can rewrite the piece with these new findings. At the very top of this prompt, I'm simply telling the AI, "I want you to rewrite the original write-up using the audit results below."
So this means that we have both the write-up and the audit results. In this statement here, we're grounding the AI. We're telling the AI, "I want you to only use the original write-up as the base and nothing else. So don't make up anything in your head or search online. Only use this document." I then make it explicit that I want it to keep the same structure and style as the original, because I liked it and I wanted to send it. I don't want it to change that. I just want it to change the factual claims. And then I give it very specific rules for what to do with each label in the audit report. If the claim is labeled supported, you just keep it. If the claim is labeled conflicts, then you need to figure out what the source actually states and reword the claim so it's valid.
After that, if the claim has no proof, you either remove it or soften it. This is really up to you as the user. If you want unproven claims removed completely, drop the softening option from the prompt. If you're okay with them being softened, keep it.
After that, for the human piece, we're simply telling the AI to treat claims labeled needs human judgment as uncertain. The AI is going to leave these in place for you so you can decide whether to keep, remove, or update them. And finally, at the bottom of this, we're asking the AI to again ground itself in the original write-up and the audit so it doesn't pull information from the internet or its own knowledge base. Then all you need to do is provide the two files. You either attach your write-up and your audit, or you copy and paste them directly into the prompt, however you'd like to do it. And this is a simple example of what that could look like. Before, we had a passage that looked like this: the company grew revenue by 18% last year, driven mainly by enterprise customers.
Churn is low and the sales team became more efficient. After the process, the audit report showed that the claim about enterprise customers driving growth was the AI overreaching and trying to fill in the gaps when it shouldn't have. And the claim that the sales team became more efficient conflicts directly with what's in the source.
So after we've rewritten it with the AI, the text now states that we're uncertain as to why revenue grew.
We're not sure if it's enterprise customers or something else. And for the sales team claim, we updated it with the actual source information.
Now, let's run a quick recap of the most important things we talked about. First off, you do not have to run this audit process if it's not necessary. In 90% of use cases, it's not needed.
But for the 10% of use cases where the stakes are high, either financially, legally, or associated to your brand, you may want to consider this process.
If you're convinced you want to do it, then you start with the first step, which is simply finishing the output.
So, get the output with AI to the point where you're happy to send it off to somebody. And before you send it, stop and ask yourself, should I run this audit process? If the answer is yes, you move to the next step, which is opening a new conversation with a new AI and having it split out all the claims from that document. After it has split out the claims for you, you move to the next step, where the AI checks each claim against the source and labels it with one of the four categories we mentioned previously.
At the end of that, you get back your audit report. You then take the audit report and the previously created output from the first step and put them both into a new AI in a fresh conversation.
Then you have the AI rewrite that output with the findings from the audit report. And that's it.
So, as a quick reminder, two things. First off, below is the 30-day AI Insight Series, completely free. You'll get 30 insights in your inbox so you can apply AI to your business and your work. The second thing is if you'd like to work with me, below are a series of offerings to see if there's a good fit between the two of us.
There's a quiet line in your AI workflow. On one side, the browser is fine, think ChatGPT and Claude. But on the other side, the work gets too heavy and you need a desktop agent, things like Claude Co-work or Codex from OpenAI. Most people miss it. I found seven signs that tell you that you've hit that line, and it's right here in this video. So, go ahead, click it.
I'll see you next time, internet.