Temperature
In one line
Temperature is a parameter that controls how much randomness the LLM allows when picking the next token — lower values give consistent answers, higher values give more creative ones.
Going deeper
Temperature controls how much risk the model takes when picking the next token. Near 0, you get almost the same answer every time; above 1.0, output gets creative but inconsistent. This one knob explains much of the classic "why do I get a different answer every time?" confusion.
The right value depends on the use case. Run customer support and FAQ assistants at 0.2–0.4 for consistency; brainstorm marketing copy at 0.8–1.0 for variety.
In practice, recommended values differ per model, and some recent models (certain GPT-5 variants, for example) fix temperature at 1.0 and reject other values. So "lower equals more accurate" is a useful heuristic, not a guarantee; check the docs for the model you use.
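The mechanism behind the knob can be sketched in a few lines: temperature divides the model's raw next-token scores (logits) before the softmax that turns them into probabilities. A low temperature sharpens the distribution toward the top token; a high one flattens it. The logit values below are made up for illustration and don't come from any real model:

```python
import math

def next_token_probs(logits, temperature):
    # Softmax with temperature: divide each logit by T before normalizing.
    # Low T sharpens the distribution (near-deterministic picks);
    # high T flattens it (more random, more "creative" picks).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens

cold = next_token_probs(logits, temperature=0.2)  # strongly favors token 0
hot = next_token_probs(logits, temperature=1.5)   # spreads probability out
```

With these numbers, the top token gets roughly 99% of the probability mass at T=0.2 but only around 55% at T=1.5, which is exactly why low settings feel repeatable and high settings feel varied.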
Related terms
LLM
A large language model (LLM) is a neural network trained on massive text corpora to understand and generate human language — the engine behind ChatGPT, Claude, Gemini and similar products.
Token
A token is the basic unit an LLM reads and writes — usually a word or piece of a word. LLM pricing and context limits are all measured in tokens.
Prompt Engineering
Prompt engineering is the practice of crafting inputs that steer an LLM toward better outputs — a way to dramatically change result quality without retraining the model.
Structured Output
Structured output forces an LLM to reply in a predefined JSON or schema shape instead of free text — essential when you need to plug AI reliably into other systems.
Function Calling
Function calling is the interface that lets an LLM invoke predefined functions or APIs instead of just replying in natural language — the core mechanism behind AI agents.