How AI Search Actually Works — Seven Years of RAG Evolution and a Unified SEO·GEO Strategy
From the four-stage pipeline ChatGPT uses to find answers, to the foundational papers behind it (Transformer, REALM, DPR, RAG), to the new techniques since 2024 (HyDE, Self-RAG, GraphRAG, Agentic RAG). A look at how AI answers work — drawing on OpenAI's own documentation — and what marketers should do now.
Villion Team

Key takeaways
- ChatGPT Search was made available to all users on December 16, 2024 and auto-triggers web search to compose answers with sources.
- Today's AI answers descend directly from the 2017 Transformer and the 2020 REALM, DPR and RAG papers, with quality jumping further from 2024 to 2026 via HyDE, Self-RAG, GraphRAG and Agentic RAG.
- OpenAI runs GPTBot (training) and OAI-SearchBot (search indexing) separately. If you want to be cited in search, you must allow OAI-SearchBot.
- The strongest signal for AI citation isn't raw backlink count but corroboration: multiple independent authoritative sources stating the same thing consistently.
- SEO is the foundation of GEO, not its substitute. Don't run the two axes separately; tie them into one operational flow for efficiency.
1. The moment AI search became everyday
December 16, 2024 is the date OpenAI opened ChatGPT Search to every logged-in user, free tier included; access without signing in followed in February 2025. ChatGPT decides for itself whether a question needs the web and surfaces sources alongside the answer. The Sources button under the response expands a sidebar of cited sites, and referral clicks carry utm_source=chatgpt.com automatically.
Around the same time, Google's AI Overviews established itself as a feature used by hundreds of millions of people daily in the U.S., and Perplexity carved out a market position by refusing to answer without sources at all.
For marketers, the implication is clear. Users no longer compare ten blue links one by one. They decide based on the three or four brands AI shortlisted. Search has moved one layer inside, and inside that layer, “who AI cites” has become the new visibility frontier.
Below we trace how AI finds answers, going all the way back to OpenAI and Google's official docs and the original RAG papers. Knowing the mechanics is what lets you stop running SEO and GEO as separate programs and pull them into one operational flow.
2. How AI answers actually work — seven years of RAG evolution
Most GEO content stops at “AI works through RAG.” But RAG didn't drop from the sky in 2020, and ChatGPT Search today doesn't quite move like the 2020 RAG paper. Trace the seven-year arc and you can see where the answers you're looking at came from and where they're heading.

2-1. Where it all started — the 2017–2020 academic base
Transformer
Vaswani et al., NeurIPS 2017
Attention Is All You Need
Self-attention lets every word attend to every other word at once. GPT, Claude, Gemini and Llama all run on variants of this design.
REALM
Guu et al., ICML 2020
Retrieval-Augmented Language Model Pre-Training
Pulled retrieval into the pre-training stage. The model learns to look up passages from a corpus such as Wikipedia while answering, and the paper reports gains of 4–16 points of absolute accuracy on open-domain QA over earlier methods.
DPR
Karpukhin et al., EMNLP 2020
Dense Passage Retrieval for Open-Domain Question Answering
Replaced keyword matching (BM25) with a dual-encoder, embedding-based approach. Top-20 retrieval accuracy improved by 9–19 percentage points over BM25. This is the direct ancestor of today's vector DBs.
RAG
Lewis et al., NeurIPS 2020
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Tied parametric memory (model weights) and non-parametric memory (external documents) into one inference flow. The paper introduced the name 'RAG' along with the RAG-Sequence and RAG-Token variants.
Spring 2020 is the truer “year zero” of RAG. Transformer had laid the base architecture; REALM dragged retrieval into the training loop; DPR redrew retrieval itself around embeddings; the RAG paper packed all of it into one inference flow. Today's vector databases (Pinecone, Weaviate, Milvus) and RAG frameworks (LangChain) are direct descendants of those four papers.
2-2. The modern four-stage RAG pipeline
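The way ChatGPT Search or Google AI Overviews compose answers today builds on what the OpenAI Cookbook calls the “Search-Ask” method: search a library of text for relevant passages, then ask the model with those passages in context. Expanded into four steps: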
Embed
Place the user's question and candidate documents in the same vector space. On the OpenAI side, models like text-embedding-3-large convert text into numeric vectors with thousands of dimensions (3,072 by default for that model).
Retrieve
Pull document vectors closest to the question vector by cosine similarity. Typically the top 10–20 are taken as the first-pass shortlist.
Rerank
Re-score the shortlist with a more precise model and keep only the top 3–5. Which document gets cited first in the answer is effectively decided here.
Generate
Feed the shortlisted documents as context and let the model compose the answer. A mapping of which sentence came from which document is generated alongside — the citations and sources you see are that mapping surfaced to the user.
The Cookbook makes a point of saying that fine-tuning is a poor fit for teaching facts: it suits specialized tasks and styles and is less reliable for factual recall. Analogizing model weights to long-term memory and the message context to short-term memory, it recommends RAG when you want accurate answers. ChatGPT Search itself is that principle implemented as product.
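The four steps above can be sketched end to end in a few lines. This is a toy illustration, not how ChatGPT Search is implemented: the word-count "embedding", the term-coverage reranker, the stub `generate` function and the sample documents are all assumptions standing in for a real embedding model, a real reranking model and a real LLM call.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (a real system calls an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, docs, k):
    """First-pass shortlist: the k documents closest to the question vector."""
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(docs[d])), reverse=True)[:k]

def rerank(question, docs, shortlist, k):
    """Stand-in reranker: share of distinct question terms each document covers."""
    q_terms = set(embed(question))
    def coverage(doc_id):
        return len(q_terms & set(embed(docs[doc_id]))) / max(len(q_terms), 1)
    return sorted(shortlist, key=coverage, reverse=True)[:k]

def generate(question, docs, top):
    """Stub generator: builds the numbered-source prompt a model would receive."""
    context = "\n".join(f"[{i}] {docs[d]}" for i, d in enumerate(top, 1))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {question}"

# Hypothetical mini-corpus for illustration only.
DOCS = {
    "crawler-guide": "OAI-SearchBot indexes pages for ChatGPT Search citations",
    "training-faq": "GPTBot collects pages for model training data",
    "recipe": "Slow roast the tomatoes with garlic and olive oil",
}
QUESTION = "Which crawler indexes pages for ChatGPT Search"

shortlist = retrieve(QUESTION, DOCS, k=2)
top = rerank(QUESTION, DOCS, shortlist, k=1)
prompt = generate(QUESTION, DOCS, top)
print(prompt)
```

The numbered `[1]` marker in the prompt is what makes a sentence-to-source mapping possible at the generate step: the model can be asked to cite by number, and the numbers resolve back to documents.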
2-3. Recent evolution — techniques that pushed answer quality up (2024–2026)
The basic four steps aren't enough. When questions are vague, when multi-hop reasoning is needed, or when information is scattered across documents, plain vector search often misses. Techniques that were proposed from late 2022 onward and reached production systems from 2024 filled those gaps.
HyDE
Hypothetical Document Embeddings
Instead of searching with the question directly, the model first writes a hypothetical answer and searches using the embedding of that answer. A draft answer tends to sit closer in embedding space to the documents that contain the real answer than the question itself does, and accuracy on technical Q&A is reported to improve noticeably.
Self-RAG
Self-Reflective RAG
After composing an answer, the model critically re-examines its own response. If evidence is weak or contradictions appear, it runs retrieval again: a self-verification loop embedded inside the answer flow.
GraphRAG
Graph-based RAG (Microsoft Research)
Instead of chunking documents into flat passages, it builds them into a knowledge graph. Entities and relations are explicit, enabling multi-hop reasoning. LinkedIn reported MRR up 77.6% and median per-issue resolution time down 28.6% after deploying a knowledge-graph-based RAG system for customer service.
Agentic RAG
Planning + reflection-based RAG
On receiving a question, the system first plans the steps to the answer. It then retrieves multiple times as needed, calls external tools (SQL, APIs) and revisits intermediate results. Closer to a small agent doing the job than a single retrieval pass.
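Of the techniques above, HyDE is the easiest to show in miniature. The sketch below is an assumption-laden toy: the embedding is a plain word count, the two documents are invented, and `draft_answer` is a hard-coded stand-in for the LLM call that would write the hypothetical answer. It only illustrates why searching with a draft answer can beat searching with the question.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def plain_search(question, docs):
    """Baseline: search with the embedding of the question itself."""
    probe = embed(question)
    return max(docs, key=lambda d: cosine(probe, embed(docs[d])))

def hyde_search(question, docs, write_hypothetical_answer):
    """HyDE: search with the embedding of a hypothetical answer to the question."""
    probe = embed(write_hypothetical_answer(question))
    return max(docs, key=lambda d: cosine(probe, embed(docs[d])))

def draft_answer(question):
    """Hypothetical stub; a real system would ask an LLM to write this draft."""
    return ("The page is probably not cited because robots.txt "
            "does not allow OAI-SearchBot to index it")

# Invented documents: the relevant one shares no words with the question.
DOCS = {
    "robots": "Allow OAI-SearchBot in robots.txt so ChatGPT Search can index and cite the page",
    "decoy": "Why is my site slow and why is my server missing logs",
}
QUESTION = "Why is my site never cited?"

print(plain_search(QUESTION, DOCS))               # matches on question wording
print(hyde_search(QUESTION, DOCS, draft_answer))  # matches on answer wording
```

The question shares vocabulary with the decoy ("why", "is", "my", "site") and none with the relevant document, so the baseline picks the decoy. The draft answer uses the vocabulary of the fix, so it lands on the right page even if parts of the draft are wrong.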
2-4. What ChatGPT Search really does
OpenAI hasn't published ChatGPT Search's exact internals. But combining observed behavior with the official help docs suggests the techniques above are stitched together. When a question arrives, the model first decides whether it needs web search. If so, it appears to combine content pre-indexed by OAI-SearchBot with real-time web results and to assemble the answer through multi-step reasoning. The Sources sidebar then lists the documents the answer drew on, which is conceptually close to what Google calls grounding supports in its Vertex AI documentation.
3. Which sites ChatGPT cites
OpenAI addresses this directly in its Publishers and Developers FAQ: “Ranking in ChatGPT Search is based on multiple factors to provide users with trustworthy, relevant information. There is no way to guarantee top placement.” It's not the kind of system where you reverse-engineer the algorithm to take #1 the way SEO once worked. But the preconditions for being cited are quite clear.
3-1. Understand the three crawlers separately
OpenAI runs three separate crawlers.
- GPTBot: training-data collection. Blocking it doesn't affect ChatGPT Search citations.
- OAI-SearchBot: ChatGPT Search indexing. Block it and you can't be cited.
- ChatGPT-User: explicit real-time fetches triggered by the user.
The SearchGPT announcement spells this out: search is separate from training OpenAI's foundation models, and a site can appear in search results even if it opts out of training. If training-data inclusion bothers you, block GPTBot only and keep OAI-SearchBot open.
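A quick way to confirm that a robots.txt really blocks GPTBot while leaving OAI-SearchBot open is to run it through Python's standard-library parser. The robots.txt content and the example.com URL below are illustrative assumptions, and the standard-library matcher is only an approximation of how any given crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Illustrative policy: opt out of training, stay visible in search.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Allow: /
"""

def crawler_access(robots_txt, agents, url):
    """Return {agent: allowed?} for one URL under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

access = crawler_access(
    ROBOTS_TXT,
    ["GPTBot", "OAI-SearchBot", "ChatGPT-User"],
    "https://example.com/guide",  # placeholder URL
)
print(access)
```

Running the same check against the live file (for instance in a deploy pipeline) catches the common accident where a blanket AI-bot block also shuts out the search crawler.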
3-2. The conditions that actually increase citations
Conditions called out directly in the Publishers FAQ:
- Allow OAI-SearchBot to crawl
- Make sure your site host or CDN doesn't block traffic from OpenAI's published IP ranges
- Keep X-Robots-Tag headers and meta noindex tags off pages you want cited; OpenAI respects them in addition to robots.txt
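Because a stray noindex can silently remove a page from consideration, it can help to check both places it may hide. The sketch below is a simplified assumption of such a check: it takes response headers and HTML that you have already fetched, looks only at the generic `robots` meta name, and ignores user-agent-scoped forms of X-Robots-Tag.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            self.directives += [d.strip().lower() for d in attr.get("content", "").split(",")]

def is_noindex(headers, html):
    """True if either the X-Robots-Tag header or a robots meta tag says noindex."""
    header_value = ""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_value = value
    header_directives = [d.strip().lower() for d in header_value.split(",")]

    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in header_directives or "noindex" in parser.directives

# Example inputs (made up for illustration).
print(is_noindex({"X-Robots-Tag": "noindex, nofollow"}, "<html></html>"))
print(is_noindex({}, '<head><meta name="robots" content="index, follow"></head>'))
```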
Signals observed in the field in addition:
- External citations from authoritative domains (corroboration)
- Structured data such as Schema.org
- Answer-first writing that states the conclusion before the supporting detail
- Freshness (signals validated in communities and news)
The pattern that recurs across these observations is that raw backlink count matters less than corroboration: a fact becomes a citable signal only when multiple independent authoritative sources state it consistently.
3-3. Traffic is trackable
Every referral from ChatGPT Search automatically carries utm_source=chatgpt.com. Filter on it in Google Analytics or any analytics tool to isolate traffic coming in through ChatGPT citations.
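If your analytics tool lets you export landing-page URLs, the same filter can be applied offline. The snippet below is a minimal sketch under that assumption; the sample URLs are invented, and `chatgpt_share` is a hypothetical helper name.

```python
from urllib.parse import urlparse, parse_qs

def is_chatgpt_referral(landing_url):
    """True if the landing URL carries utm_source=chatgpt.com."""
    params = parse_qs(urlparse(landing_url).query)
    return "chatgpt.com" in params.get("utm_source", [])

def chatgpt_share(landing_urls):
    """Fraction of landing URLs that arrived via a ChatGPT citation."""
    if not landing_urls:
        return 0.0
    hits = sum(1 for url in landing_urls if is_chatgpt_referral(url))
    return hits / len(landing_urls)

# Invented sample of exported landing URLs.
SAMPLE = [
    "https://example.com/guide?utm_source=chatgpt.com",
    "https://example.com/pricing?utm_source=newsletter&utm_medium=email",
    "https://example.com/guide",
    "https://example.com/blog/rag?ref=home&utm_source=chatgpt.com",
]

print(chatgpt_share(SAMPLE))
```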
4. Search intent (SEO) and answer intent (GEO) — the same and different
The most fundamental concept in SEO is search intent. Google's Search Quality Rater Guidelines split user intent into four: Know (want to know), Do (want to act), Website (want a specific site), Visit-in-person (want to go somewhere). The industry taxonomy you may know better (informational, navigational, commercial and transactional) overlaps with these: Know maps to informational, Website to navigational, and Do covers transactional, while commercial (comparing before buying) sits between Know and Do. Nearly all searches fall into one of these categories.
The same intents carry over to AI search. What changes is what the answer looks like.
| Search intent | Traditional search result | AI answer form |
|---|---|---|
| Know (information) | Links to Wikipedia, blog articles | Definition or summary + cited sources |
| Do (action) | Tutorial and how-to sites | Step-by-step explanation + cited sources |
| Website (navigation) | A single official site | Direct domain pointer |
| Commercial (compare) | Comparison content, review sites | 3–5 recommendations + comparison summary |
| Transactional (buy) | Product pages | A site you can buy from + price |
The biggest shift is in Commercial and Know. In classical SEO the formula was “informational content pulls traffic, transactional pages convert.” In AI search the recommendation sits inside the informational answer. A user searching for “GEO tool recommendations” gets three or four picked by AI. If your comparison content isn't in the source list of that answer, the brand isn't visible to the user at all.
This is why GEO has to be treated as a separate axis, not just “new SEO.” SEO is “visibility to be discovered”; GEO is “visibility to be chosen.” The two share the foundation but move very differently at the result stage.
5. Google AI Overviews vs ChatGPT Search
Same RAG principle, quite different implementations.
Google AI Overviews
Uses Gemini-based multi-step reasoning. Even when a question packs in multiple sub-queries, the model splits them itself. Vertex AI's grounding mechanism maps each answer segment to a source chunk; Google calls these grounding chunks and grounding supports. Google has stated publicly that links included in AI Overviews get more clicks than the same pages would as traditional web listings for that query.
ChatGPT Search
Uses OAI-SearchBot's pre-indexed content together with real-time web results. The Sources sidebar under the answer makes citations visible. The differentiator from Google is that explicit grounding data isn't exposed to the user — there's no precise visible mapping of which sentence came from which source.
To be cited in both systems at once, you ultimately stand on the same foundation: authoritative domains, Schema.org structured data, answer-first writing, and facts validated consistently across multiple independent sources. Both systems treat corroboration as the strongest signal.
6. Seven SEO·GEO unified actions
With the mechanics laid out, here's what to actually do.
Separate pages by search intent
Don't cram Know, Do and Commercial into a single page. Splitting informational and comparison content lets AI clearly read the page's intent. Stuff everything into one page and you get cited for none of them.
Paragraph-level self-containment
The unit GPT-class models lift into an answer is the paragraph, not the page. So each paragraph needs to make sense on its own with no outside context. Phrases like 'as explained above' or 'as we'll see below' break the meaning the moment the paragraph gets excerpted.
JSON-LD structured data
Schema.org JSON-LD works for SEO and GEO at the same time. Google uses it for Rich Results, and AI search systems can use it for entity recognition and fact extraction. At minimum, the Article, Organization, Product and FAQPage types should be in place.
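As a rough illustration of the JSON-LD action above, the snippet below generates an Article block from a few fields. The function name, the field selection and the sample values (including the date) are placeholders, not a recommended minimum; check the Schema.org type definitions and Google's structured-data guidelines for the properties your pages actually need.

```python
import json

SCRIPT_OPEN = '<script type="application/ld+json">'
SCRIPT_CLOSE = "</script>"

def article_jsonld(headline, author_name, date_modified):
    """Return a JSON-LD <script> tag describing an Article (illustrative fields only)."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Organization", "name": author_name},
        "dateModified": date_modified,
    }
    # Escape "</" so a value can never close the script tag early.
    payload = json.dumps(data, ensure_ascii=False).replace("</", "<\\/")
    return SCRIPT_OPEN + payload + SCRIPT_CLOSE

# Placeholder values for illustration.
snippet = article_jsonld("Example headline", "Example Team", "2025-01-01")
print(snippet)
```

Generating the block from the same fields that render the visible byline and updated date keeps the markup and the page content from drifting apart.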
E-E-A-T signals
Experience, Expertise, Authoritativeness, Trustworthiness — the same four count for AI citations. Author info (name, affiliation, credentials), source citations, last-updated dates and external media mentions are the core signals. Anonymous content and unsourced claims aren't trusted by any system.
Crawler policy hygiene
Allow OAI-SearchBot in robots.txt, and verify your CDN doesn't block OpenAI's IP traffic. Blocking GPTBot only affects training opt-out, not search citation. Manage Googlebot, OAI-SearchBot, ClaudeBot and PerplexityBot policies separately.
Multi-source corroboration
Make the same fact appear consistently across multiple independent authoritative sources. When your own site, Wikipedia, industry media, review platforms (G2, Capterra) and organic mentions on Reddit or Stack Overflow all line up, AI treats that fact as citable.
Citability measurement and monitoring
Even with all six actions above in place, you can't improve without measuring which queries cite you and in which answers. Measure your brand and competitors on the same query set weekly and track citation rate, citation position and mention context.
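The two numeric metrics named above, citation rate and citation position, can be computed from a simple log of sampled answers. The data structure and the sample rows below are assumptions for illustration; how you collect the answers (manually or through a monitoring tool) is outside this sketch, and mention context still needs a human or a separate classifier.

```python
from dataclasses import dataclass, field

@dataclass
class AnswerSample:
    """One sampled AI answer: the query and its cited domains, in display order."""
    query: str
    cited_domains: list = field(default_factory=list)

def citation_metrics(samples, domain):
    """Citation rate and mean 1-based position of `domain` across sampled answers."""
    positions = [
        s.cited_domains.index(domain) + 1
        for s in samples
        if domain in s.cited_domains
    ]
    rate = len(positions) / len(samples) if samples else 0.0
    mean_position = sum(positions) / len(positions) if positions else None
    return {"citation_rate": rate, "mean_position": mean_position}

# Invented weekly sample for illustration.
WEEK = [
    AnswerSample("geo tool recommendations", ["example.com", "rival.com"]),
    AnswerSample("what is geo", ["wiki.example.org", "rival.com", "example.com"]),
    AnswerSample("geo vs seo", ["rival.com"]),
    AnswerSample("ai search crawler setup", []),
]

print(citation_metrics(WEEK, "example.com"))
print(citation_metrics(WEEK, "rival.com"))
```

Keeping the query set fixed from week to week is what makes the numbers comparable; changing the queries and the site at the same time makes it impossible to tell which one moved the rate.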
7. Extra variables for the Korean market
Korean marketers have two more variables to manage.
First, Naver still carries a large share of search. Naver maintains its own corpora from Knowledge-iN and blog indexes, and AI search systems frequently pull from Naver content when answering Korea-specific questions. Skip Naver SEO and visibility on Korea-related questions in AI answers drops with it.
Second, Korean-language LLM citations have their own tone and vocabulary. Whether you write a foreign proper noun in the source script or in Hangul, whether you use formal honorifics or plain register, whether you cite sources in parentheses or footnotes or inline — each of these choices affects citability. Translating an English article verbatim and writing in Korean from scratch produce very different citation shapes.
Skip these two and a GEO·SEO unified strategy stops at half-effectiveness in the Korean market.
8. Conclusion — SEO is the foundation of GEO
The whole story in one sentence: SEO is the foundation of GEO, not its substitute. The seven-year arc — from the 2017 Transformer through RAG and DPR to ChatGPT Search — never removed search. Search just moved one layer in, from “discovery” into “selection.” The same site assets do work at both stages.
Villion is an integrated platform that handles GEO, AEO and SEO in one solution. Diagnosis, content production, site-signal hardening and citation-rate measurement are tied into a single flow. The differentiator is that the seven actions above can be run by both the SEO and GEO sides of the team inside one tool, instead of as parallel programs.
Primary sources
- Vaswani et al., Attention Is All You Need (NeurIPS 2017)
- Guu et al., REALM: Retrieval-Augmented Language Model Pre-Training (ICML 2020)
- Karpukhin et al., Dense Passage Retrieval for Open-Domain Question Answering (EMNLP 2020)
- Lewis et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (NeurIPS 2020)
- OpenAI, Introducing ChatGPT search (2024-12-16)
- OpenAI, SearchGPT prototype announcement
- OpenAI, ChatGPT Search Help Center
- OpenAI, Publishers and Developers FAQ
- OpenAI, Overview of OpenAI Crawlers
- OpenAI Cookbook, Question Answering Using Embeddings
- Google, How Generative AI is improving Search
- Google Cloud, Grounding gen AI in enterprise truth (Vertex AI)
- Microsoft Research, GraphRAG