Usage¶
Usage¶
CLI¶
Process documents, build indexes, retrieve, and ask via the CLI:
# Ingest and chunk CSV docs (outputs msgpack chunks)
bijux-rag process --input data/arxiv_cs_abstracts_10k.csv --output artifacts/chunks.msgpack --chunk-size 512
# Build index from chunks (BM25 or vector)
bijux-rag index-build --input artifacts/chunks.msgpack --output artifacts/index.msgpack --backend bm25
# Retrieve top-k matches
bijux-rag retrieve --index artifacts/index.msgpack --query "functional programming in RAG" --top-k 10
# Ask with grounded response (citations from retrieved)
bijux-rag ask --index artifacts/index.msgpack --query "explain RAG effects" --top-k 5 --format json
# Run eval suite (pinned corpus/queries)
bijux-rag eval --suite tests/eval --index artifacts/index.msgpack
--backend bm25|numpy-cosine(deterministic profiles).--embedder default|customfor vector indexes.--filter key=valuefor metadata filtering (AND).- See
bijux-rag --helpfor full options.
Library¶
Build composable RAG pipelines programmatically:
from bijux_rag.core.rag_types import RawDoc
from bijux_rag.domain.effects.async_ import async_gen_from_list, async_gen_map
from bijux_rag.policies.chunking import fixed_size_chunk
from bijux_rag.pipelines.embedding import embed_docs
from bijux_rag.rag.app import RagApp
from bijux_rag.result.types import unwrap
docs = [RawDoc(doc_id="1", title="RAG Intro", abstract="Retrieval-Augmented Generation combines search and LLMs.")]
chunks = list(async_gen_map(async_gen_from_list(docs), fixed_size_chunk)) # Streaming chunking
embedded = list(embed_docs(chunks)) # Embed pipeline
app = RagApp() # Configurable app
index = app.build_index(embedded, backend="bm25").unwrap()
retrieved = app.retrieve(index, query="what is RAG?", top_k=5).unwrap()
answer = app.ask(index, query="explain RAG", top_k=5, rerank=True).unwrap()["answer"]
print(answer)
Leverage effects for resilience: wrap in retry_idempotent or async_with_resilience.
API (FastAPI)¶
Launch the HTTP server:
uvicorn bijux_rag.boundaries.web.fastapi_app:app --host 0.0.0.0 --port 8000 --reload
Endpoints (v1 prefix): - GET /health → {"status": "ok"} - POST /index/build (JSON docs array, backend, chunk params) → index ID. - POST /retrieve (index_id, query, filters?, top_k) → ranked results. - POST /ask (index_id, query, top_k, rerank?) → grounded answer with citations. - POST /chunks (docs, chunk_size, overlap) → chunked output.
Full schema in docs/reference/http_api.md (OpenAPI compliant).