ClaudeBot
In one line
ClaudeBot is Anthropic's web crawler that collects public web content for training Claude, and site owners can allow or block it via robots.txt.
Going deeper
ClaudeBot is Anthropic's official crawler for collecting web content that may be used to train Claude. To control it explicitly, add a group that begins with 'User-agent: ClaudeBot' to robots.txt.
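As a minimal sketch of what such a rule looks like and how it is interpreted, the snippet below parses a hypothetical robots.txt with Python's standard urllib.robotparser. The /private/ path, the example.com URLs and the is_allowed helper are assumptions made for illustration, and the standard-library parser only approximates how any particular crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep ClaudeBot out of /private/, leave the rest open.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /private/

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The ClaudeBot group blocks /private/ but says nothing about /blog/.
print(is_allowed(ROBOTS_TXT, "ClaudeBot", "https://example.com/private/report.html"))
print(is_allowed(ROBOTS_TXT, "ClaudeBot", "https://example.com/blog/post.html"))
```

Running a check like this before deploying a robots.txt change is a cheap way to confirm the rule matches the paths you intended.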
Claude is embedded across consumer chat, enterprise assistants and developer tools, so being in its citation pool reaches well beyond a single chatbot surface.
Anthropic publishes bot policies relatively openly. The currently documented identifiers are ClaudeBot, Claude-User and Claude-SearchBot — covering training, live user responses and search indexing respectively. Read the docs once and apply a consistent policy across all three.
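One way to keep the policy consistent is to generate a single robots.txt group that names all three identifiers and shares one set of rules. The sketch below assumes a hypothetical /drafts/ path and a hypothetical build_policy helper, and it checks the result with Python's standard urllib.robotparser rather than with Anthropic's own tooling.

```python
from urllib.robotparser import RobotFileParser

# The three identifiers named in the text above.
ANTHROPIC_AGENTS = ("ClaudeBot", "Claude-User", "Claude-SearchBot")


def build_policy(agents, disallow_paths):
    """Build one robots.txt group: every agent line first, then the shared rules."""
    lines = [f"User-agent: {agent}" for agent in agents]
    lines += [f"Disallow: {path}" for path in disallow_paths]
    return "\n".join(lines) + "\n"


# Hypothetical policy: keep all three agents out of /drafts/.
policy = build_policy(ANTHROPIC_AGENTS, ["/drafts/"])
print(policy)

# Verify that each identifier gets the same answer for the same URL.
parser = RobotFileParser()
parser.parse(policy.splitlines())
for agent in ANTHROPIC_AGENTS:
    print(agent, parser.can_fetch(agent, "https://example.com/drafts/a.html"))
```

If you later decide to treat training differently from live retrieval or search, split the identifiers into separate groups instead of sharing one.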
Related terms
GPTBot
GPTBot is OpenAI's official web crawler for collecting training data for its models; search indexing is handled by the separate OAI-SearchBot. Both are controllable via robots.txt.
PerplexityBot
PerplexityBot is the web crawler Perplexity uses to gather sources for its answer engine — controllable separately via robots.txt.
Google-Extended
Google-Extended is a standalone robots.txt product token, not a separate crawler user-agent string, that lets site owners control whether their content is used for training Gemini and Vertex AI models, independently of regular search indexing.
llms.txt
llms.txt is a proposed text file placed at the site root that tells large language models where the most important content lives — think 'sitemap, but written for LLMs'.
GEO
GEO (Generative Engine Optimization) is the practice of optimizing content and data so that a brand gets cited and recommended inside generative AI search answers like ChatGPT, Perplexity and Google AI Overviews.