
Token

Also known as: Tokenization (토큰화)

In one line

A token is the basic unit an LLM reads and writes — usually a word or piece of a word. LLM pricing and context limits are both measured in tokens.

Going deeper

A token is the smallest unit an LLM operates on. It is not the same as a character or a word, and every model uses its own tokenizer. In English, one word is typically 1 to 2 tokens; in Korean, a single character can take 1 to 3 tokens. Frequent words compress into fewer tokens, while rare strings get split into many. Because LLM pricing, context limits and throughput are all measured in tokens, this is the first unit you have to internalise to run anything in production.
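As a rough illustration, here is a minimal sketch using OpenAI's open-source tiktoken library with its cl100k_base encoding. The choice of encoding is an assumption: every model family ships its own tokenizer, so counts will differ.

```python
# pip install tiktoken
import tiktoken

# cl100k_base is one widely used encoding; other models use other tokenizers.
enc = tiktoken.get_encoding("cl100k_base")

text = "Tokenization is not the same as splitting on spaces."
tokens = enc.encode(text)

print(len(tokens), "tokens for", len(text), "characters")
# Decode token by token to see where the boundaries actually fall.
print([enc.decode([t]) for t in tokens])
```

Decoding one token at a time makes the point visible: frequent words come back whole, while rarer strings come back in fragments.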

Technically, tokenizers usually rely on Byte Pair Encoding (BPE) or SentencePiece, a library that implements BPE and a unigram model. These algorithms merge frequent character sequences into single tokens and split rare ones into smaller pieces. 'ChatGPT' might be a single token; an obscure neologism could be four or five. Korean costs more tokens than English mostly because it appears far less often in tokenizer training data, so the algorithm has fewer chances to compress it.
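To make the merge idea concrete, here is a toy BPE training loop in Python. It is a simplified sketch of the algorithm, not any production tokenizer; the corpus and merge count are invented for illustration.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across the corpus, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(words, pair):
    """Replace every occurrence of the chosen pair with one merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: each word as a tuple of characters, mapped to its frequency.
corpus = {tuple("low"): 7, tuple("lower"): 5, tuple("lowest"): 2}

for _ in range(4):  # learn 4 merges
    pair = most_frequent_pair(corpus)
    if pair is None:
        break
    corpus = merge_pair(corpus, pair)
    print("merged", pair)
```

Because 'low' dominates this tiny corpus, its characters get merged first, which is exactly why frequent words end up as single tokens while rare ones stay split.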

Two things matter for marketers. First, AI sees your copy as tokens, not as words. Compound words may stay glued together as one token or get split into several, and that subtly affects how the model picks up meaning. Second, since tokens map directly to cost, 'expressing the same idea in fewer tokens' is a real operational lever — concise system prompts and tighter content reduce bills meaningfully without changing what the user sees.
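A back-of-envelope sketch of that second lever, using hypothetical per-token prices (real rates vary by provider and model):

```python
# Assumed prices for illustration only; check your provider's actual rates.
PRICE_PER_1K_INPUT = 0.0025   # USD per 1,000 input tokens
PRICE_PER_1K_OUTPUT = 0.0100  # USD per 1,000 output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate one request's cost from its token counts."""
    return (input_tokens / 1000) * PRICE_PER_1K_INPUT \
         + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT

# Trimming a 1,200-token system prompt to 400 tokens saves on every call.
calls_per_day = 50_000
saving_per_call = estimate_cost(1200, 0) - estimate_cost(400, 0)
print(f"daily saving: ${saving_per_call * calls_per_day:,.2f}")
```

At these assumed rates, an 800-token trim on 50,000 daily calls is on the order of a hundred dollars a day, with zero change to what the user sees.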

A common misread is to treat tokens and words as interchangeable. Punctuation, whitespace, emoji, numbers and even line breaks all consume tokens. Heavy formatting in an LLM response inflates token count, which inflates cost. Another misread is to assume token limits translate cleanly into character limits — a 'one million token context' holds roughly half as much effective Korean as effective English, so back-of-envelope math by characters consistently underestimates real usage.
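You can check both effects empirically by counting tokens rather than characters. This sketch again assumes tiktoken's cl100k_base encoding; the exact ratios will differ by tokenizer, and the sample strings are invented.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # assumed encoding; ratios vary

samples = {
    "plain English": "Our new plan includes unlimited data.",
    "Korean":        "새 요금제는 무제한 데이터를 포함합니다.",
    "formatted":     "## Plan\n- **Unlimited** data\n- 24/7 support 🎉",
}
for label, text in samples.items():
    # Character count and token count diverge, and not uniformly per language.
    print(f"{label:14} chars={len(text):3} tokens={len(enc.encode(text))}")
```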

Tokens are the single biggest cost variable for the Korean market. The same content costs 1.5 to 3 times more tokens in Korean than in English, which means a Korean RAG, chatbot or summariser can produce a bill two to three times higher than its US equivalent at the same traffic level. Korean LLM teams routinely shorten system prompts, tighten chunks, summarise upstream and cache aggressively to keep costs sane — and the same discipline applies to any production GEO system serving Korean users.
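As one example of that discipline, here is a minimal sketch of exact-match response caching. Real systems layer on provider-side prompt caching and semantic caches; the `generate` callback is a stand-in for whatever function calls your LLM provider.

```python
import hashlib
from typing import Callable

_cache: dict[str, str] = {}

def cached_answer(prompt: str, generate: Callable[[str], str]) -> str:
    """Serve repeated prompts from memory instead of paying for fresh tokens.

    `generate` stands in for the function that actually calls your provider.
    """
    key = hashlib.sha256(prompt.strip().encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = generate(prompt)
    return _cache[key]
```

Even this naive exact-match cache eliminates token spend on duplicate queries, which FAQ-style chatbot traffic tends to have in abundance.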

