Prompt Caching
In one line
Prompt caching reuses the computation done on a repeated system prompt or document, so subsequent calls that share it are cheaper and faster. That makes it a direct lever on operating costs for repetitive workloads like GEO monitoring.
Going deeper
Prompt caching saves the internal state computed for a repeated prompt prefix (system prompt, fixed documents, long context) so that subsequent calls do not have to recompute it. OpenAI, Anthropic and Google each offer their own flavour. On a cache hit, the cost of the cached input tokens typically drops by 50–90%, and time-to-first-token shrinks because the model skips reprocessing the cached portion.
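To make the cost effect concrete, here is a minimal arithmetic sketch. The price, token counts, call volume and 90% discount are illustrative assumptions, not any provider's actual rates. The helper `input_cost` is hypothetical and ignores cache-write surcharges and cache expiry, which some providers apply.

```python
def input_cost(calls, cached_tokens, variable_tokens, price_per_mtok, cache_discount):
    """Return (uncached_cost, cached_cost) for the input side of a workload.

    Simplifying assumptions: the first call pays full price to populate the
    cache, every later call is a cache hit, and there is no write surcharge.
    """
    per_token = price_per_mtok / 1_000_000
    # Cost of one call when nothing is cached.
    full_call = (cached_tokens + variable_tokens) * per_token
    # Cost of one call when the fixed prefix is served from the cache.
    hit_call = (cached_tokens * (1 - cache_discount) + variable_tokens) * per_token
    uncached = calls * full_call
    cached = full_call + (calls - 1) * hit_call
    return uncached, cached


# Hypothetical monitoring run: 300 calls sharing an 8,000-token fixed prefix,
# each followed by a 200-token variable question.
uncached, cached = input_cost(
    calls=300,
    cached_tokens=8_000,
    variable_tokens=200,
    price_per_mtok=3.00,   # assumed price per million input tokens
    cache_discount=0.90,   # assumed discount on cached tokens
)
print(f"without caching: ${uncached:.2f}")
print(f"with caching:    ${cached:.2f}")
print(f"saving:          {1 - cached / uncached:.0%}")
```

Under these assumptions the overall saving lands slightly below the headline discount, because the variable tokens and the first call are still billed at full price.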
For marketers it is a direct lever on AI operating costs. Workloads with heavy repetition — daily GEO monitoring runs that fire the same system prompt hundreds of times, in-house RAG that reuses the same document set every call — are exactly the cases where caching pays off.
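Practical caveat: caches usually match on the leading portion of the prompt, which must be identical byte-for-byte across calls. Anything that varies between calls, such as a timestamp or the user's input, ends the cacheable portion at that point. The standard template is therefore 'system prompt → fixed reference docs → variable user input', in that order, to maximise hit rate.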
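The ordering rule can be illustrated with a small sketch that measures how many leading bytes two prompts share. All names here (`SYSTEM_PROMPT`, `REFERENCE_DOCS`, `build_prompt`, `build_prompt_bad`, `shared_prefix_bytes`) are hypothetical, and the byte comparison is only a proxy: real providers generally match on tokens and may require a minimum prompt length before caching applies.

```python
SYSTEM_PROMPT = "You are a brand-visibility analyst. Answer concisely."
REFERENCE_DOCS = "Brand guide v1: ...long, unchanging reference text..."


def build_prompt(user_input: str) -> str:
    # Stable content first, variable content last.
    return "\n\n".join([SYSTEM_PROMPT, REFERENCE_DOCS, user_input])


def build_prompt_bad(user_input: str, timestamp: str) -> str:
    # Anti-pattern: a value that changes every run sits at the very front.
    return "\n\n".join([f"Run at {timestamp}", SYSTEM_PROMPT, REFERENCE_DOCS, user_input])


def shared_prefix_bytes(a: str, b: str) -> int:
    """Count the leading UTF-8 bytes that two prompts have in common."""
    count = 0
    for x, y in zip(a.encode("utf-8"), b.encode("utf-8")):
        if x != y:
            break
        count += 1
    return count


good = shared_prefix_bytes(
    build_prompt("How is Brand A described?"),
    build_prompt("How is Brand B described?"),
)
bad = shared_prefix_bytes(
    build_prompt_bad("How is Brand A described?", "2025-01-01T09:00"),
    build_prompt_bad("How is Brand B described?", "2025-01-02T09:00"),
)
print("shared prefix, stable-first layout:", good, "bytes")
print("shared prefix, timestamp-first layout:", bad, "bytes")
```

With the stable-first layout the whole system prompt and reference text fall inside the shared prefix. With the timestamp at the front, the two prompts diverge inside the first line, so none of the fixed material after it can be reused.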
Related terms
System Prompt
A system prompt is the instruction sent to an LLM before any user message, defining the assistant's role, tone and rules — effectively the AI product's character.
Context Window
The context window is the maximum number of tokens an LLM can take in at once — it defines how much content the model can consider in a single prompt.
RAG
RAG (Retrieval-Augmented Generation) lets an LLM fetch external documents at answer time and ground its response in them — the technique behind ChatGPT Search, Perplexity and most AI search products.
Context Engineering
Context engineering goes beyond crafting a single prompt — it is the design discipline of deciding which context to assemble and how to feed it to the model, an idea that crystallised in 2024–2025.
LLM
A large language model (LLM) is a neural network trained on massive text corpora to understand and generate human language — the engine behind ChatGPT, Claude, Gemini and similar products.