Ahrefs provides a lucid technical framework that demystifies AI search by connecting traditional SEO authority with the mechanics of RAG and query expansion. It successfully moves the AEO conversation from vague speculation to actionable, data-driven strategy.
Deep Dive
How AI Search Engines Work | 1.1. AEO Course by Ahrefs
Hey, it's Sam Oh, and welcome to the first module, which is on how AI search actually works. In this lesson, I'm going to walk you through the three things you need to understand about how AI search engines find, evaluate, and cite content. Because here's the thing: if you don't understand how AI search works under the hood, every optimization tip you hear is just going to feel like a random list of tactics. And I don't want that for you. I want you to understand the why behind every strategy we cover in this course. Let's get started.
So where does AI actually get its information? This is probably the most important distinction to understand. AI search engines have two sources of information, and they work very differently. The first is training data. This is the massive collection of text that the AI was originally trained on: books, websites, PDFs, social media, YouTube transcripts, basically a snapshot of the internet. So when you ask ChatGPT, "Who is the CEO of Apple?" and it instantly says Tim Cook without searching anything, that's coming from training data. It already learned that pattern. But here's the problem with training data: it's static. It gets updated maybe every 6 months or so. So if you launched your product last week, the AI doesn't know about it yet, not from training data at least.
And that's where the second source comes in: real-time retrieval. This is where RAG, or retrieval-augmented generation, comes into play. It sounds complicated, but Patrick Stox said it best: "Then you've got the retrieved pages, which is like a secondary process. So you've got the trained LLM data, and then you've got the data where it goes out and fetches a bunch of relevant pages, and those come with other probabilities." For example, when ChatGPT or Google's AI Mode needs fresh information, or when the question is too specific for training data alone, it goes out, searches the web using APIs, pulls back a bunch of pages, reads through them, and then generates a response based on what it found.
Now, why does this matter for you? Because it means there are two ways to influence what AI says about your brand. The first is to be mentioned so widely across the web that you're baked into the training data itself. The second is to make sure your content shows up when the AI searches the web in real time. And guess what? We already know how to do that.
That's SEO. The skills you already have from traditional SEO, ranking in Google, earning backlinks, creating quality content, those directly influence whether AI picks up your pages during real-time retrieval. Now, here's where it gets interesting. The AI doesn't just search for the exact thing you typed in.
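As a rough illustration of the two information sources described above, here is a minimal Python sketch. Everything in it is hypothetical: `TRAINING_DATA`, `WEB_INDEX`, the example.com URLs, and the word-overlap retriever are stand-ins chosen for this example. Real systems use a trained model and a search API, and the "generation" step here simply joins the retrieved text.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str


# Hypothetical stand-in for patterns the model memorized during training.
TRAINING_DATA = {"who is the ceo of apple": "Tim Cook"}

# Hypothetical stand-in for pages a live web search could return.
WEB_INDEX = [
    Page("https://example.com/launch", "Acme launched the Rocket Skates product last week"),
    Page("https://example.com/about", "Acme is a company that makes gadgets"),
    Page("https://example.com/recipes", "How to bake sourdough bread at home"),
]


def normalize(question: str) -> str:
    """Lowercase the question and trim spaces and trailing punctuation."""
    return question.lower().strip(" ?!.")


def search_web(query: str, top_k: int = 2) -> list[Page]:
    """Toy retriever: rank pages by how many query words they share."""
    terms = set(normalize(query).split())
    scored = [(len(terms & set(p.text.lower().split())), p) for p in WEB_INDEX]
    matches = [s for s in scored if s[0] > 0]
    ranked = sorted(matches, key=lambda s: s[0], reverse=True)
    return [page for _, page in ranked[:top_k]]


def answer(question: str) -> dict:
    """Answer from training data when possible, otherwise retrieve and cite."""
    key = normalize(question)
    if key in TRAINING_DATA:
        return {"source": "training_data", "answer": TRAINING_DATA[key], "citations": []}
    pages = search_web(question)
    # A real system would have the model write a response from these pages.
    context = " ".join(p.text for p in pages)
    return {"source": "retrieval", "answer": context, "citations": [p.url for p in pages]}
```

Calling `answer("Who is the CEO of Apple?")` takes the training-data path and cites nothing, while `answer("What did Acme launch last week?")` falls through to retrieval and returns the URLs it drew from. Only the second path gives a page the chance to be cited.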
Let me explain. Search engines used to work one-to-one: one query, one set of results. Then they evolved to many-to-one, where different queries like "Sydney plumber" and "plumbing service in Sydney" could return the same results. But AI search has flipped the model to one-to-many. One search gets expanded into many, and this technique is called query fan-out. For example, when someone enters a prompt like "Plan me a 5-day trip to Japan in November," the AI fans it out into dozens of smaller long-tail subqueries, things like "best neighborhoods to stay in Tokyo," "November weather in Kyoto," and "Japan Rail Pass worth it?", all running simultaneously behind the scenes. It then pulls information from multiple sources across the web and combines it into one complete answer. In fact, research from Seer Interactive and Nectiv found that the average prompt triggers 9 to 11 fan-out queries, with some going as high as 28. And ChatGPT's deep research mode ran 420 searches for a single query about buying a red phone case. So, if your content ranks for those niche, specific queries, your brand has a much better chance of being included in the AI's final response. And this is a huge shift from traditional SEO, where you could optimize one page for one target keyword and call it a day. In AI search, you need to be relevant across an entire topic, and I could even argue across an entire niche.
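To make the one-to-many idea concrete, here is a small Python sketch of fan-out and topic coverage. The prompt, the `FAN_OUT` list, the two example.com pages, and the word-matching `search` function are all assumptions made up for illustration. In a real system the model generates the subqueries on the fly and a search API does the retrieval.

```python
from concurrent.futures import ThreadPoolExecutor

PROMPT = "how to start a podcast"

# Hypothetical fan-out; real fan-outs are generated by the model and vary per run.
FAN_OUT = [
    "best podcast microphone for beginners",
    "podcast hosting platforms compared",
    "how to promote a new podcast",
]

# Hypothetical pages: one covers only the basics, one covers the wider topic.
PAGES = {
    "https://example.com/basics": "how to start a podcast pick a name and record an episode",
    "https://example.com/full-guide": (
        "how to start a podcast with the right microphone hosting plan and ways to promote it"
    ),
}


def search(subquery: str) -> set[str]:
    """Toy retriever: match pages on words the subquery adds beyond the prompt."""
    distinctive = set(subquery.split()) - set(PROMPT.split())
    return {url for url, text in PAGES.items() if distinctive & set(text.split())}


def coverage(fan_out: list[str]) -> dict[str, float]:
    """Run all subqueries concurrently and report each page's share of matches."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(search, fan_out))
    return {
        url: sum(url in hits for hits in results) / len(fan_out)
        for url in PAGES
    }


if __name__ == "__main__":
    for url, share in coverage(FAN_OUT).items():
        print(f"{url}: matched {share:.0%} of fan-out queries")
```

In this toy setup the basics-only page matches none of the subqueries, while the broader guide matches all of them. That is the sense in which a single page needs to be relevant across the topic, not just for the original prompt.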
Because if your page about how to start a podcast only covers the basics but doesn't mention equipment, hosting, or promotion, the AI is going to find someone else's page that does. Now you might be wondering, can I see these fan-out queries? You sure can. In the AI Responses report in Ahrefs Brand Radar, you can see the fan-out queries for ChatGPT and Perplexity prompts. But there's an important caveat. Despina from Ahrefs wrote in her guide on query fan-out that these aren't like traditional long-tail queries. They're synthetic, generated by AI in the moment. They're inconsistent.
The same prompt can trigger different fan-outs every time, and over 95% of them have zero search volume because real humans would never type them. So don't think of fan-out queries as a new keyword list to optimize for. Think of them as a window into what topics the AI considers important for a given question. We'll get into exactly how to use this strategically in module 2, because for now we need to talk about how AI decides who to actually cite. In traditional search, rankings are relatively stable.
Like if you're number three for a keyword today, you're probably going to be somewhere around there tomorrow. But AI citations are probabilistic. Patrick Stox explained this really well. He said, "AI outputs are built on probabilities on top of probabilities on top of probabilities. The training data creates patterns. The retrieved pages add their own signals, and then there's a temperature setting that introduces randomness so the AI doesn't generate the exact same answer every time." Now, what that means in practice is that if you ask the same question five times, you might get cited three out of five.
Or the AI might mention your competitor twice and you twice and someone else once. There's no fixed position to rank for. This is why we talk about AI visibility rather than AI rankings. It's more like a probability distribution than a leaderboard. Now, that said, there are patterns in what gets cited more often, based on the data we've studied at Ahrefs.
Consensus matters. If multiple sources on the web say the same thing about your brand, AI is more likely to repeat it. And the more places your brand is mentioned in a consistent way, the higher the probability the AI picks it up.
Freshness matters. AI-cited content tends to be about 25% fresher than what you'd see in a traditional SERP. The AI is actively looking for recent information, especially for topics that change.
Authority still matters. Pages that rank well in traditional search have a major head start. Our data shows that 76% of AI Overview citations come from pages already in the top 10 of Google. So, I'll hit that gong one more time: SEO is the foundation of AEO. But it's not only about Google rankings. 14% of pages cited in AI Overviews don't rank in Google's top 100 at all. And for platforms like ChatGPT, the overlap with Google's results is even lower. So there's real opportunity for brands that aren't dominant in traditional search to still show up in AI.
So let's tie all of this together. AI search engines pull from training data and real-time web search. They expand your one query into dozens of subqueries through fan-out.
They merge and score results from all of those searches, and then they generate a response that's probabilistic, not fixed, based on patterns, consensus, freshness, and authority. Understanding this process is what makes every other lesson in this course make sense. So, when I tell you to earn more brand mentions, it's because of how consensus drives probability. When I talk about topic coverage, it's because of query fan-out. When I say that SEO is the foundation, again, it's because 76% of citations come from pages already ranking well. Now that you understand the mechanics, the obvious next question is, do all AI platforms work the same way? And the answer is >> not even close.
>> Not even close.
>> In the next lesson, we're going to compare AI Overviews, ChatGPT, Perplexity, and Google's AI Mode side by side. And the data on how different they are is pretty surprising. I'll see you in the next lesson.
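The point from this lesson that citations behave like a probability distribution rather than a leaderboard can be illustrated with a short simulation. This is a deliberately simplified sketch: the source names and scores are invented, and it applies a temperature-scaled softmax to whole sources, whereas a real model applies temperature when sampling individual tokens.

```python
import math
import random
from collections import Counter

# Hypothetical relevance scores for three sources (not real data).
SCORES = {"yourbrand.com": 2.0, "competitor.com": 1.8, "other.com": 1.0}


def softmax(scores: dict[str, float], temperature: float) -> dict[str, float]:
    """Turn scores into probabilities; lower temperature sharpens the distribution."""
    scaled = {name: score / temperature for name, score in scores.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {name: math.exp(value - peak) for name, value in scaled.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def simulate_citations(
    scores: dict[str, float], temperature: float, runs: int, seed: int = 0
) -> Counter:
    """Ask the 'same question' several times and count which source gets cited."""
    rng = random.Random(seed)
    probs = softmax(scores, temperature)
    names = list(probs)
    picks = rng.choices(names, weights=[probs[n] for n in names], k=runs)
    return Counter(picks)


if __name__ == "__main__":
    print(softmax(SCORES, temperature=1.0))
    print(simulate_citations(SCORES, temperature=1.0, runs=5))
```

With a temperature near 1, the highest-scoring source is the most likely pick but not a guaranteed one, so five runs can split across sources. As the temperature approaches zero, the top source wins almost every time, which is closer to how a fixed ranking behaves.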