Ingest Workflows¶
Ingest is the workflow that turns source inputs into validated Atlas build output.
Ingest Pipeline¶
flowchart LR
Inputs[GFF3 + FASTA + FAI] --> Validate[Validate input set]
Validate --> Normalize[Normalize and classify]
Normalize --> Build[Emit derived artifacts]
Build --> Verify[Validate dataset root]
The diagram shows that Atlas ingest is more than file copying: it validates the input set, normalizes and classifies it, and emits derived artifacts and quality signals that later workflows depend on.
Important Ingest Inputs¶
- GFF3 annotation input
- FASTA reference input
- FAI index input
- release, species, and assembly identity
- strictness and policy-related options
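Before running ingest, it can help to confirm that the three input files are actually present. The following is a minimal sketch, not part of Atlas: `check_ingest_inputs` is a hypothetical helper name, and the check that the index is named after the FASTA reflects the `samtools faidx` naming convention rather than a documented Atlas requirement.

```shell
# Hypothetical pre-flight helper (not an Atlas command).
# $1 = GFF3 annotation, $2 = FASTA reference, $3 = FAI index
check_ingest_inputs() {
  for f in "$1" "$2" "$3"; do
    # -s is true only when the file exists and is non-empty
    if [ ! -s "$f" ]; then
      echo "missing or empty input: $f" >&2
      return 1
    fi
  done
  # samtools faidx writes the index for genome.fa as genome.fa.fai;
  # a different name is only reported as a warning here (assumption).
  if [ "$3" != "$2.fai" ]; then
    echo "warning: $3 is not named $2.fai" >&2
  fi
  return 0
}
```

With the fixtures used later on this page, the call would be `check_ingest_inputs crates/bijux-atlas/tests/fixtures/tiny/genes.gff3 crates/bijux-atlas/tests/fixtures/tiny/genome.fa crates/bijux-atlas/tests/fixtures/tiny/genome.fa.fai`.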
Strictness Matters¶
flowchart TD
Strictness[Strictness mode] --> Strict[strict]
Strictness --> Compat[compat]
Strictness --> Lenient[lenient]
Strictness --> ReportOnly[report-only]
The strictness mode changes how Atlas responds to problematic input conditions. Use stricter modes unless you have a clear reason not to.
Choose the mode deliberately: a looser mode can be useful for exploration, but it changes what a “successful” ingest means.
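One way to keep looser modes intentional is to gate them in whatever wrapper script invokes ingest. The sketch below is an illustration only: `check_strictness_choice` and the `ALLOW_LOOSE_INGEST` variable are hypothetical names, the grouping of `strict` and `compat` as the modes allowed by default is an assumption, and the function does not call Atlas or pass any flag to it.

```shell
# Hypothetical wrapper policy; ALLOW_LOOSE_INGEST is an illustrative
# variable, not an Atlas option.
check_strictness_choice() {
  case "$1" in
    strict|compat)
      return 0 ;;                       # allowed by default (assumption)
    lenient|report-only)
      # looser modes need an explicit opt-in from the caller
      if [ "${ALLOW_LOOSE_INGEST:-0}" = "1" ]; then
        return 0
      fi
      echo "mode '$1' needs an explicit opt-in (ALLOW_LOOSE_INGEST=1)" >&2
      return 1 ;;
    *)
      echo "unknown strictness mode: $1" >&2
      return 2 ;;
  esac
}
```

Requiring an opt-in makes a lenient or report-only run visible in the script or CI configuration that requested it.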
Why Ingest Output Is a Build Root¶
The output of ingest is not automatically the serving store. It is validated build state containing derived artifacts and quality signals.
That distinction is what allows Atlas to:
- apply publication gates
- keep serving state explicit
- prevent accidental runtime drift from raw ingest output
Example Ingest Command¶
cargo run -p bijux-atlas --bin bijux-atlas -- ingest \
--gff3 crates/bijux-atlas/tests/fixtures/tiny/genes.gff3 \
--fasta crates/bijux-atlas/tests/fixtures/tiny/genome.fa \
--fai crates/bijux-atlas/tests/fixtures/tiny/genome.fa.fai \
--output-root artifacts/getting-started/tiny-build \
--release 110 \
--species homo_sapiens \
--assembly GRCh38
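The same command can be wrapped in a small function so the fixture directory and output root are easy to change and the full invocation can be inspected before it runs. This is a sketch under stated assumptions: `atlas_ingest_tiny`, `FIXTURES`, `OUT`, and `DRY_RUN` are illustrative names, not Atlas settings. The flags themselves are copied from the command above.

```shell
# Sketch: assemble the ingest command from the flags shown on this page.
# FIXTURES, OUT, and DRY_RUN are hypothetical variables for this wrapper.
atlas_ingest_tiny() {
  fixtures="${FIXTURES:-crates/bijux-atlas/tests/fixtures/tiny}"
  out_root="${OUT:-artifacts/getting-started/tiny-build}"
  # Store the command in the function's positional parameters
  set -- cargo run -p bijux-atlas --bin bijux-atlas -- ingest \
    --gff3 "$fixtures/genes.gff3" \
    --fasta "$fixtures/genome.fa" \
    --fai "$fixtures/genome.fa.fai" \
    --output-root "$out_root" \
    --release 110 \
    --species homo_sapiens \
    --assembly GRCh38
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"    # print the command instead of running it
  else
    "$@"         # run the command exactly as assembled
  fi
}
```

Running `DRY_RUN=1 atlas_ingest_tiny` prints the assembled command without invoking cargo.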
After Ingest¶
Follow ingest with these steps:
- validate the build root
- verify the build root if needed
- publish into a serving store
- promote into the catalog
What Ingest Alone Does Not Prove¶
- that the runtime can discover the dataset
- that the serving store has been populated correctly
- that catalog state now points to the new dataset
Practical Advice¶
- use committed fixtures for experimentation before using real inputs
- keep output roots under artifacts/
- treat report-only and lenient modes as intentional exceptions, not the default
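The advice about output roots can be enforced with a small guard in a wrapper script. The following is a minimal sketch: `is_under_artifacts` is a hypothetical helper, not an Atlas command, and it assumes output roots are given as paths relative to the repository root.

```shell
# Hypothetical guard: accept only relative output roots under artifacts/.
is_under_artifacts() {
  case "$1" in
    *..*) return 1 ;;           # conservatively reject any path containing ".."
    artifacts/?*) return 0 ;;   # artifacts/<something>
    *) return 1 ;;              # anything else, including absolute paths
  esac
}
```

For example, `is_under_artifacts artifacts/getting-started/tiny-build` succeeds, while an absolute path such as `/tmp/out` is rejected.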
Purpose¶
This page explains Atlas ingest workflows and points readers to the canonical checked-in workflow or boundary for this topic.
Stability¶
This page is part of the canonical Atlas docs spine. Keep it aligned with the current repository behavior and adjacent contract pages.