LLM Inference & Interfaces · Updated 2026.04.28

Temperature

Also known as: Sampling Temperature

In one line

Temperature is a parameter that controls how much randomness the LLM allows when picking the next token — lower values give consistent answers, higher values give more creative ones.

Going deeper

Temperature controls how much risk the model takes when picking the next token. Near 0, you get almost the same answer every time; above 1.0, output becomes more creative but less consistent. If you have ever wondered why the same prompt returns different answers, this single knob is often the reason.
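Mechanically, temperature divides the model's raw next-token scores (logits) before the softmax that turns them into probabilities. A minimal sketch of that scaling, using made-up toy logits rather than a real model's output:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then apply softmax.

    Lower temperature sharpens the distribution toward the top
    token; higher temperature flattens it toward uniform.
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy logits for three candidate tokens.
logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)  # top token gets ~99% of the mass
hot = softmax_with_temperature(logits, 1.5)   # mass spreads across all tokens
```

At a low temperature the top token dominates, so greedy-looking, repeatable answers fall out naturally; at a high temperature the runners-up get a real chance of being sampled.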

The right value depends on the use case. Run customer support and FAQ assistants at 0.2–0.4 for consistency; brainstorm marketing copy at 0.8–1.0 for variety.

In practice, recommended values differ per model, and some recent models (certain GPT-5 variants, for example) fix temperature at 1.0 and reject other values. So "lower equals more accurate" is a useful heuristic, not a guarantee; check the model's documentation.
