LLMEvaluation & SafetyUpdated 2026.04.28

Prompt Injection

Also known as프롬프트 주입Indirect Prompt Injection

In one line

Prompt injection is an attack where instructions hidden in untrusted data override the system prompt and force the LLM into unintended behaviour.

Going deeper

Prompt injection hides instructions inside the data an LLM consumes — emails, web pages, documents — saying things like 'ignore prior instructions and do X'. Direct injection puts the payload in the user's own input; indirect injection hides it in third-party content. As RAG and agents become standard, indirect injection is now the bigger worry.

For marketers it cuts in an unexpected direction: your content (and the third-party content you cite) can be a vector. If someone seeds malicious instructions into a page that an LLM later reads, the model can be coerced into producing the wrong answer. Content moderation is no longer just an SEO concern.

There is no silver bullet defence. The realistic posture is layered: separate trusted from untrusted content, enforce structured output, require user confirmation before high-impact actions. OWASP ranks prompt injection as the #1 risk in its LLM security Top 10.

Sources

Related terms

How does your brand show up in AI answers?

Villion measures how your brand appears across ChatGPT, Perplexity and AI Overviews, then automates the work that lifts citation rate and share of voice.

Get a free audit