Fundamentals

Temperature

A sampling parameter that controls the randomness and creativity of AI model outputs; lower values make responses more deterministic, higher values more varied.

What is temperature in AI?

Temperature is a parameter that controls how random or creative an AI model's outputs are. It affects how the model chooses from possible next tokens when generating text.

Low temperature (0.0 - 0.3):

  • More deterministic, focused responses
  • Picks the most likely tokens
  • Good for factual questions, coding, data extraction
  • Same prompt often gives same output

High temperature (0.7 - 1.0+):

  • More random, creative responses
  • Explores less likely tokens
  • Good for creative writing, brainstorming
  • Same prompt gives varied outputs

Think of it like adjusting a dial between "play it safe" (low) and "take risks" (high).

How does temperature work?

When generating text, LLMs predict probability distributions over possible next tokens. Temperature adjusts these probabilities before sampling.

Technical explanation:

Without temperature (or temperature=1):

  • Token "the" has 60% probability
  • Token "a" has 30% probability
  • Token "some" has 10% probability

With low temperature (0.3):

  • Probabilities become more extreme
  • "the" might rise to roughly 90%
  • Model almost always picks "the"

With high temperature (1.5):

  • Probabilities become more uniform
  • Each option has more similar chances
  • Model might pick any of them

The formula: each probability is raised to the power 1/temperature and then renormalized so the distribution sums to 1. Equivalently, the model's logits are divided by the temperature before the softmax.

At temperature 0, the model always picks the highest probability token (greedy decoding).
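The adjustment described above can be sketched in a few lines of Python. The 60/30/10 distribution is the toy example from this article, not real model output:

```python
def apply_temperature(probs, temperature):
    """Rescale a probability distribution by a sampling temperature.

    Raising each probability to 1/temperature and renormalizing is
    equivalent to dividing the model's logits by the temperature
    before the softmax.
    """
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]

# The example distribution from above: "the", "a", "some"
probs = [0.60, 0.30, 0.10]

low = apply_temperature(probs, 0.3)   # sharper: "the" climbs to ~90%
high = apply_temperature(probs, 1.5)  # flatter: options move closer together

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

Running this shows the effect numerically: at 0.3 the top token dominates almost completely, while at 1.5 the three options end up much closer together.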

How to choose the right temperature

Use low temperature (0.0 - 0.3) for:

  • Factual Q&A
  • Code generation
  • Data extraction
  • Classification tasks
  • Math problems
  • Anything with a "correct" answer

Use medium temperature (0.4 - 0.7) for:

  • General conversation
  • Summarization
  • Translation
  • Most business applications
  • Balanced creativity and accuracy

Use high temperature (0.8 - 1.0+) for:

  • Creative writing
  • Brainstorming
  • Poetry and fiction
  • Generating diverse options
  • When you want surprises

Temperature 0: Greedy decoding, as deterministic as sampling gets. Use for testing, debugging, or when reproducibility matters. (Some inference stacks can still show small run-to-run variations at temperature 0 due to hardware and implementation details.)
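The guidelines above can be captured in a small helper. The task names and exact values here are illustrative defaults drawn from this article's ranges, not an official standard:

```python
def pick_temperature(task: str) -> float:
    """Suggest a starting temperature for a task category.

    The categories and values mirror the guidelines above; treat them
    as starting points to tune, not fixed rules.
    """
    low = {"factual_qa", "code_generation", "data_extraction",
           "classification", "math"}
    medium = {"conversation", "summarization", "translation"}
    high = {"creative_writing", "brainstorming", "poetry"}

    if task in low:
        return 0.2
    if task in medium:
        return 0.6
    if task in high:
        return 0.9
    return 0.7  # reasonable default for an unrecognized task

print(pick_temperature("code_generation"))  # 0.2
print(pick_temperature("brainstorming"))    # 0.9
```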

Practical temperature tips

Start in the middle: Begin with 0.7 and adjust based on results. Too robotic? Increase. Too random? Decrease.

Consider your use case: Customer support chatbots should be consistent (low temp). Marketing copy generators might benefit from variety (higher temp).

Combine with other parameters: Temperature works with top_p, top_k, and other sampling parameters. Experiment with combinations.

Test with real examples: Run the same prompt multiple times at different temperatures. See which gives the best balance for your needs.
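You can see this repeat-and-compare effect without calling a real model by repeatedly sampling from a fixed toy distribution at several temperatures (the seeded generator just makes the demo repeatable):

```python
import random
from collections import Counter

def sample_token(probs, temperature, rng):
    """Sample one token index from a temperature-adjusted distribution."""
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    adjusted = [s / total for s in scaled]
    return rng.choices(range(len(probs)), weights=adjusted, k=1)[0]

tokens = ["the", "a", "some"]
probs = [0.60, 0.30, 0.10]
rng = random.Random(0)  # fixed seed so the demo is repeatable

for temp in (0.2, 0.7, 1.5):
    picks = Counter(sample_token(probs, temp, rng) for _ in range(1000))
    spread = {tokens[i]: picks[i] for i in range(len(tokens))}
    print(f"T={temp}: {spread}")
```

At 0.2 nearly every draw is "the"; at 1.5 the counts spread out across all three tokens, which is exactly the consistency-versus-variety trade-off you are tuning.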

Production vs development: You might use higher temperatures during development to explore possibilities, then lower temperatures in production for consistency.

Don't go too high: Temperature above 1.0 can produce incoherent or nonsensical outputs. Use extreme values carefully.