Skip to content

Upgrades and Rollback

Atlas upgrades should preserve two invariants:

  • contract-owned surfaces remain understood and validated
  • serving state stays recoverable if a rollout goes wrong

Upgrade Flow

flowchart TD
    Validate[Validate runtime and contracts] --> Rollout[Roll out new runtime]
    Rollout --> Observe[Observe health and load]
    Observe --> Keep[Keep rollout]
    Observe --> Rollback[Rollback if needed]

This upgrade flow keeps rollout discipline visible. Atlas upgrades should be validated, observed, and explicitly kept or rolled back rather than treated as one-way jumps.

Rollback Flow

flowchart LR
    Problem[Operational problem] --> Scope[Determine runtime vs store scope]
    Scope --> RuntimeRollback[Rollback runtime]
    Scope --> StoreRollback[Rollback serving state if required]

This rollback flow explains one of the most important operator distinctions in Atlas: not every incident needs store-state rollback, and not every rollback should start there.

Operator Guidance

  • separate runtime rollback from store-state rollback in your thinking
  • verify health, readiness, and key query paths after rollout
  • keep rollback paths explicit before you need them
  • use compatibility and contract evidence as rollout input, not only hope and manual spot checks

What to Watch During Upgrade

  • readiness instability
  • unusual rejection or error patterns
  • metrics or traces indicating saturation changes
  • catalog or dataset discoverability regressions

Rollout Question That Saves Time

Ask first whether the change affected runtime behavior, serving-store state, or both. That answer usually determines the safest rollback path.

Purpose

This page explains the Atlas material for upgrades and rollback and points readers to the canonical checked-in workflow or boundary for this topic.

Stability

This page is part of the canonical Atlas docs spine. Keep it aligned with the current repository behavior and adjacent contract pages.