ND Quality & Confidence¶
Why this exists¶
ND (non-deterministic) quality metrics tell you whether the results are stable enough to use. Without them, you might ship low-quality results that look fine in development but fail in production.
This section is for anyone who needs to make operational decisions about ND quality—engineers, analysts, and on-call responders.
Story / scenario¶
A recommendation model drifts after a data update. Metrics reveal a spike in instability, and the team rolls back before users notice.
What usually goes wrong¶
- Teams treat ANN results as “good enough” without measuring.
- Confidence labels are not used to gate downstream behavior.
How Bijux-Vex handles it¶
ND results include quality metrics and a confidence label so downstream systems can act safely.
What trade-offs remain¶
- Measuring quality costs time.
- Witness checks can be expensive at scale.
Where to go deeper¶
- docs/user/nd_examples.md
- docs/user/nd_decision_tree.md
Quality metrics¶
- Rank instability: how much the top-k ordering changes across repeated runs of the same query.
- Distance margin: how clearly the top-k neighbors are separated by score.
- Similarity entropy: how uniform the similarity scores are; high entropy means weak signal.
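The sketch below shows one plausible way to compute these metrics. The function names and formulas are illustrative assumptions, not Bijux-Vex's actual implementation; `scores` is assumed to be a descending, non-negative top-k similarity array, and `rankings` the top-k ID lists from repeated runs of the same query.

```python
import numpy as np

def rank_instability(rankings: list[list[int]]) -> float:
    """Fraction of top-k slots whose ID changes across repeated runs."""
    base = rankings[0]
    diffs = [
        sum(a != b for a, b in zip(base, other)) / len(base)
        for other in rankings[1:]
    ]
    return float(np.mean(diffs)) if diffs else 0.0

def distance_margin(scores: np.ndarray) -> float:
    """Relative gap between the best and second-best similarity score."""
    return float((scores[0] - scores[1]) / max(scores[0], 1e-12))

def similarity_entropy(scores: np.ndarray) -> float:
    """Normalized Shannon entropy of the scores (1.0 = fully uniform)."""
    p = scores / scores.sum()
    return float(-(p * np.log(p + 1e-12)).sum() / np.log(len(p)))
```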
Witness modes¶
- off: no witness check; metrics are reported as not measured.
- sample: exact verification on a small subset of candidates.
- full: exact verification on the full candidate set.
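As a rough illustration of what a sample witness check does, the sketch below re-scores a random subset of ANN candidates exactly and reports an agreement rate. `exact_distance`, the 10% default rate, and the tolerance are all assumptions for illustration, not Bijux-Vex API.

```python
import random

def witness_sample(query, candidates, approx_scores, exact_distance, rate=0.1):
    """Exactly re-score a random subset and return the agreement rate."""
    n = max(1, int(len(candidates) * rate))  # assumes candidates is non-empty
    picked = random.sample(range(len(candidates)), n)
    agreed = sum(
        abs(exact_distance(query, candidates[i]) - approx_scores[i]) < 1e-3
        for i in picked
    )
    return agreed / n
```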
Confidence labels¶
- high_confidence: stable ranking and clear margins.
- medium_confidence: acceptable variance.
- low_confidence: unstable or low-signal results.
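One plausible way to derive the label from the three metrics is sketched below; the thresholds are invented for illustration and will not match Bijux-Vex's internal cutoffs.

```python
def confidence_label(rank_instability: float,
                     distance_margin: float,
                     similarity_entropy: float) -> str:
    """Map quality metrics to a confidence label (illustrative thresholds)."""
    if rank_instability < 0.05 and distance_margin > 0.3:
        return "high_confidence"
    if rank_instability < 0.2 and similarity_entropy < 0.8:
        return "medium_confidence"
    return "low_confidence"
```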
What to do with low confidence¶
- Reduce k or increase recall targets.
- Enable witness mode.
- Switch to deterministic exact search for verification (sketched below).
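The sketch below shows the last two steps of that ladder as an escalation loop. `client`, `search_nd`, `search_exact`, and the `witness` keyword are hypothetical names standing in for your actual API.

```python
def search_with_escalation(client, query, k=10):
    """Escalate from fast ND search to exact search as confidence drops."""
    result = client.search_nd(query, k=k)
    if result.confidence != "low_confidence":
        return result
    # Step 1: retry with a witness check so quality is actually measured.
    result = client.search_nd(query, k=k, witness="sample")
    if result.confidence != "low_confidence":
        return result
    # Step 2: fall back to deterministic exact search.
    return client.search_exact(query, k=k)
```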
Example ND report (simplified)¶

```json
{
  "quality": {
    "rank_instability": 0.12,
    "distance_margin": 0.45,
    "similarity_entropy": 0.62
  },
  "confidence": "medium_confidence",
  "witness": "sample"
}
```
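Gating on the report in Python could look like the sketch below. The field names match the sample above, but how you obtain the raw report depends on your client, so the string literal here just stands in for a real response.

```python
import json

# Stand-in for the ND report returned alongside query results.
raw = '''{"quality": {"rank_instability": 0.12, "distance_margin": 0.45,
          "similarity_entropy": 0.62},
          "confidence": "medium_confidence", "witness": "sample"}'''

report = json.loads(raw)
if report["confidence"] != "high_confidence":
    q = report["quality"]
    print(f"verify before shipping: instability={q['rank_instability']}, "
          f"margin={q['distance_margin']}, witness mode={report['witness']}")
```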
Common misunderstandings¶
- “Confidence is subjective.” It is derived from explicit metrics.
- “Low confidence means failure.” It means you should verify or adjust.
- “Metrics are optional.” They are required for ND safety.