Concepts

This project exists to make LLM-backed document processing auditable. The core design choice is that a run must leave behind artifacts that can be inspected, diffed, and validated.

Pipeline as a contract

A run is an execution of the canonical pipeline (AuditableDocPipeline) over a context:

  • a goal (task_goal)
  • an input (text or a file_path)
  • a stable identifier (context_id)
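As a rough illustration, the three parts of a context could be modelled as a small immutable record. This is only a sketch: the class name RunContext and the rule that exactly one of text or file_path is supplied are assumptions made for the example, not the project's actual API. Only the field names task_goal, text, file_path, and context_id come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunContext:  # hypothetical name, used for illustration only
    task_goal: str
    context_id: str
    text: Optional[str] = None
    file_path: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: a run takes exactly one input source, never both or neither.
        if (self.text is None) == (self.file_path is None):
            raise ValueError("provide exactly one of text or file_path")


ctx = RunContext(
    task_goal="Summarize the contract",
    context_id="doc-001",
    text="Sample document text.",
)
```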

The pipeline is canonical: its phase order and allowed transitions are fixed and versioned. Given the same inputs, the same config, and the same model settings, consumers can rely on getting the same trace classification. They should not rely on getting the same model output.

Phases

The canonical lifecycle is:

INIT → PLAN → EXECUTE → JUDGE → VERIFY → FINALIZE → DONE

Not every phase necessarily produces user-visible content today, but the lifecycle exists so traces have a stable semantic scaffold.
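The lifecycle above can be sketched as an enum plus a transition table. The example assumes that each phase may only advance to the phase directly after it. The real contract may allow other transitions (for example on failure), so treat Phase, ALLOWED, and is_valid_prefix as hypothetical names rather than the project's own.

```python
from enum import Enum
from typing import Sequence


class Phase(Enum):
    INIT = "INIT"
    PLAN = "PLAN"
    EXECUTE = "EXECUTE"
    JUDGE = "JUDGE"
    VERIFY = "VERIFY"
    FINALIZE = "FINALIZE"
    DONE = "DONE"


_ORDER = list(Phase)  # Enum iteration follows definition order

# Assumption: strictly linear lifecycle, each phase maps to its successor.
ALLOWED = {current: nxt for current, nxt in zip(_ORDER, _ORDER[1:])}


def is_valid_prefix(phases: Sequence[Phase]) -> bool:
    """Return True if the observed phases start at INIT and follow the canonical order."""
    if not phases or phases[0] is not Phase.INIT:
        return False
    return all(ALLOWED.get(a) is b for a, b in zip(phases, phases[1:]))
```

A trace that stops early, such as INIT → PLAN, still passes this check. A trace that skips or reorders phases does not.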

Artifacts and why they matter

A normal CLI run writes two primary artifacts:

  • result/final_result.json — a compact verdict summary (what you would read first)
  • trace/run_trace.json — the trace (what you audit and validate)

The trace is designed to support:

  • post-hoc debugging (“why did we decide PASS?”),
  • regression detection (compare fingerprints across versions),
  • replay validation (“does this trace satisfy the contract?”).
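This page does not define how fingerprints are computed. One possible approach, shown here purely as an assumption, is to hash a canonical JSON form of the trace after dropping fields that change on every run. The function name fingerprint and the keys in VOLATILE_KEYS are hypothetical.

```python
import hashlib
import json

# Hypothetical top-level keys that vary between otherwise identical runs.
VOLATILE_KEYS = {"timestamp", "duration_ms"}


def fingerprint(trace: dict) -> str:
    """Hash a trace in a way that ignores key order and volatile top-level fields."""
    stable = {k: v for k, v in trace.items() if k not in VOLATILE_KEYS}
    # sort_keys plus fixed separators gives a byte-stable serialization.
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two traces with the same fingerprint agree on every non-volatile field, so a changed fingerprint between versions is a cheap signal that something in the trace needs a closer look.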

Determinism vs replayability

Determinism is treated as a classification, not a promise that LLM sampling will match across time.

The system marks traces:

  • REPLAYABLE when model_metadata.temperature == 0.0
  • NON_REPLAYABLE otherwise
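The rule above is small enough to express directly. The sketch below assumes the trace has been loaded as a dictionary with a model_metadata object. The function name classify_replayability is hypothetical, and treating a missing or non-numeric temperature as NON_REPLAYABLE is an assumption that follows from the "otherwise" branch.

```python
def classify_replayability(trace: dict) -> str:
    """Classify a loaded trace as REPLAYABLE or NON_REPLAYABLE by sampling temperature."""
    metadata = trace.get("model_metadata") or {}
    temperature = metadata.get("temperature")
    # Reject booleans explicitly: in Python, False == 0.0 evaluates to True.
    is_number = isinstance(temperature, (int, float)) and not isinstance(temperature, bool)
    if is_number and temperature == 0.0:
        return "REPLAYABLE"
    # Assumption: a missing or non-numeric temperature falls under "otherwise".
    return "NON_REPLAYABLE"
```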

Replayability controls what the tooling is allowed to claim, not what the user is allowed to do.

Failure semantics (high-level)

Failures are represented by an immutable FailureArtifact:

  • a machine-actionable class (failure_class)
  • an operational/epistemic category
  • recoverability (whether the orchestrator may attempt recovery)
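In Python, an immutable record of this shape is commonly written as a frozen dataclass. The sketch below is illustrative only: failure_class comes from the list above, while the field names category and recoverable and the sample value "SCHEMA_VIOLATION" are placeholders, not entries from the formal taxonomy.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class FailureArtifact:
    failure_class: str  # machine-actionable class
    category: str       # assumed field name: "operational" or "epistemic"
    recoverable: bool   # assumed field name: may the orchestrator attempt recovery?


# Placeholder values for illustration only.
failure = FailureArtifact(
    failure_class="SCHEMA_VIOLATION",
    category="operational",
    recoverable=False,
)

try:
    failure.recoverable = True  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    mutated = False
else:
    mutated = True
```

Freezing the record means a failure captured in a trace cannot be edited later by downstream code, which keeps the audit trail trustworthy.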

See the formal taxonomy: docs/spec/failure_model.md.

Where to go next

  • For hands-on usage: docs/user/usage.md
  • For the formal contract: docs/spec/read_this_first.md