Skip to content

Determinism (design notes)

Determinism is a design constraint for auditability, not a promise that two LLM runs will always yield byte-identical text.

What “deterministic” means here

A run is structurally deterministic when:

  • the pipeline phase order and allowed transitions are fixed,
  • the run configuration is captured and fingerprinted,
  • all trace-critical metadata is recorded,
  • the system can classify replayability honestly.

The system treats LLM sampling as inherently unstable unless explicitly constrained.

Replayability classification

The trace header records a replay_status:

  • REPLAYABLE when model_metadata.temperature == 0.0
  • NON_REPLAYABLE otherwise

A trace may still be auditable even when it is non-replayable.

Deterministic vs observational trace fields

Some fields exist for operations and debugging, but cannot be stable across runs (e.g. timestamps). The trace schema classifies fields as:

  • deterministic: must be stable for replay validation
  • observational: allowed to drift (e.g. start_time, end_time)

Consumers should avoid asserting on observational fields.

Practical guardrails

If you want maximal determinism:

  • set model_metadata.temperature: 0.0
  • avoid time-dependent prompts and external calls
  • treat inputs as immutable snapshots (the CLI already writes a file snapshot for API inputs)
  • do not mutate pipeline phase composition at runtime

For the normative version of these rules, see docs/spec/invariants/determinism.md.