Knowledge Currency
Updating a fact is harder than writing one.
Long-lived agents depend on KB facts staying accurate as the world changes. B5 tests whether Iranti supports updating facts that already exist. The finding: whether an update lands depends on the weighted score gap between the new and existing values, which reflects source reliability as well as raw confidence.
Results at a glance
What this measures
A KB that only supports writing new facts is useful for accumulation but fragile for long-lived agents. The world changes. Decisions get revised. Facts go stale. An agent system that cannot update its own knowledge will eventually act on wrong information with high confidence.
B5 probes Iranti's update semantics directly: can a higher-confidence or more current value replace an existing fact? The answer depends on whether the conflict resolves deterministically (gap ≥10 points in weighted score) or routes to LLM arbitration. When the gap is large enough, the write succeeds regardless of source history. When it is not, LLM arbitration fires — and under a real provider, the DB transaction times out before the result can persist.
T2 and T3 produce correct rejections: a lower-confidence update should not displace a high-confidence fact, and a duplicate value with lower score should be deduped. T1b and T5 confirm that deterministic resolution works for both same-source and cross-source writes when the score gap is sufficient. T1 and T4 expose the transaction timeout defect on the LLM-arbitrated path.
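The routing rule described above can be sketched in a few lines. This is a minimal illustration, not Iranti's implementation: the names `route_update` and `Route` and the exact constant `10.0` are assumptions, and only positive gaps are modeled because the text does not say how negative gaps (T2) or duplicate values (T3) are routed internally.

```python
from enum import Enum

# Assumed constant: the text gives the threshold as roughly 10 weighted-score points.
DETERMINISTIC_GAP = 10.0


class Route(Enum):
    DETERMINISTIC = "deterministic"
    LLM_ARBITRATION = "llm_arbitration"


def route_update(gap: float, threshold: float = DETERMINISTIC_GAP) -> Route:
    """Pick the resolution path for an incoming update with a positive score gap."""
    if gap <= 0:
        raise ValueError("only positive gaps are modeled in this sketch")
    return Route.DETERMINISTIC if gap >= threshold else Route.LLM_ARBITRATION


# Positive weighted-score gaps reported for the B5 cases.
reported_gaps = {"T1": 2.9, "T1b": 10.4, "T4": 4.25, "T5": 24.6}
routes = {case: route_update(gap) for case, gap in reported_gaps.items()}
```

With the reported gaps, T1b and T5 take the deterministic path while T1 and T4 fall through to LLM arbitration, which matches the accepted and errored outcomes respectively.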
The six test cases
Each case attempts to update an existing KB fact with a new value. Teal = accepted or correctly rejected. Amber = error (transaction timeout). Gap = weighted score delta between update and existing value.
| Case | Update attempted | Resolution | Gap | Outcome |
|---|---|---|---|---|
| T1 | New source, higher raw confidence (92 vs 85) | LLM arbitration; DB transaction timed out (~10 s API, 5 s window) | 2.9 pts | ERROR |
| T1b | Same source, higher confidence (97 vs 85) | Deterministic resolution; gap exceeded threshold | 10.4 pts | ACCEPTED |
| T2 | Lower-confidence update to high-confidence fact | Correct behavior; lower confidence lost | negative | REJECTED |
| T3 | Same value, lower confidence (duplicate detection) | Duplicate value with lower score; deduplicated | same value | REJECTED |
| T4 | New source, small confidence increase (80 → 85) | LLM arbitration; DB transaction timed out (~16 s API, 5 s window) | 4.25 pts | ERROR |
| T5 | New source, forced high confidence (70 → 99) | Deterministic resolution; large gap bypasses LLM arbitration entirely | 24.6 pts | ACCEPTED |

Amber rows: LLM arbitration timed out; the incumbent was preserved by rollback.
Confidence gap vs. outcome
Figure: weighted score gap per test case. Bar length is the weighted score gap between the incoming update and the existing fact. Teal = accepted or correctly rejected. Amber = LLM arbitration triggered a timeout. The dashed vertical line marks the ~10-point threshold above which Iranti resolves conflicts deterministically, bypassing LLM arbitration entirely. T1b and T5 both clear this threshold and both were accepted. T1 and T4 fall below it: arbitration ran, the DB transaction timed out under a real provider, and both errored.
Why source reliability matters
Iranti tracks source reliability as an accumulated signal: the more facts a source has written that were accepted and stable, the higher its reliability score. This is a sound design for preventing noisy or adversarial sources from overwriting high-quality facts.
The problem emerges when a new, correct source attempts to update a fact originally written by an established source. The new source has no accumulated reliability — even if its confidence on this specific fact is higher. When the score gap is below the deterministic threshold, LLM arbitration is triggered. Under a real LLM provider, the API latency (8–16 s observed) exceeds the 5,000 ms DB transaction window, causing a transaction timeout before the result can persist. The incumbent is preserved by rollback.
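The failure mode can be reproduced with a toy model. The sketch below is a simulation under stated assumptions, not Iranti code: `arbitrated_write`, `TransactionTimeout`, the `deploy_target` key, and the dictionary standing in for the KB are all hypothetical, and latency is passed in as a number rather than measured, so nothing actually sleeps. It shows only the timing problem: when arbitration takes longer than the 5,000 ms window, the transaction aborts and the incumbent value stays.

```python
TRANSACTION_WINDOW_MS = 5_000  # transaction window stated in the text


class TransactionTimeout(Exception):
    """Raised when simulated arbitration outlasts the transaction window."""


def arbitrated_write(kb: dict, key: str, new_value: str,
                     arbiter_latency_ms: int,
                     window_ms: int = TRANSACTION_WINDOW_MS) -> None:
    """Simulate an update whose LLM arbitration must finish inside a DB transaction."""
    if arbiter_latency_ms > window_ms:
        # The transaction aborts before anything persists; the incumbent is untouched.
        raise TransactionTimeout(
            f"arbitration took {arbiter_latency_ms} ms, window is {window_ms} ms"
        )
    # Within the window the write commits. The arbiter's verdict is not
    # modeled; this sketch assumes it favors the update.
    kb[key] = new_value


kb = {"deploy_target": "us-east"}
try:
    # ~10 s of API latency, as observed for T1.
    arbitrated_write(kb, "deploy_target", "eu-west", arbiter_latency_ms=10_000)
    outcome = "ACCEPTED"
except TransactionTimeout:
    outcome = "ERROR"
```

In this model any latency above the window produces the same result, so both the ~10 s and ~16 s calls observed in B5 end in an error with the original value still in place.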
T5 shows the workaround: with a confidence gap large enough to score 24.6 points above the existing fact, deterministic resolution fires without any LLM call, and the update succeeds even across different sources. The reliable update path is a large gap, not a same-source write.
- Every accepted write increases the writing source's reliability score.
- A new source starts with no history, so it has no reliability advantage even if its value is better.
- Deterministic resolution (gap > ~10 pts) bypasses this bias and lets a sufficiently superior value win regardless of source history.
- Below the threshold, LLM arbitration runs, and under a real provider the DB transaction times out before the result persists. The incumbent is preserved by rollback. The write silently fails from the caller's perspective.
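The accumulation behind the first two points can be pictured as a per-source counter. This is a rough sketch only: `ReliabilityTracker` and the source names are hypothetical, a plain count of accepted writes stands in for whatever score Iranti actually keeps, and the "stable" part of the signal is not modeled. How reliability feeds into the weighted score is not specified here, so the sketch stops at the counter.

```python
from collections import defaultdict


class ReliabilityTracker:
    """Counts accepted writes per source as a stand-in for a reliability score."""

    def __init__(self) -> None:
        self._accepted = defaultdict(int)

    def record_accepted_write(self, source: str) -> None:
        """Register one accepted write for the given source."""
        self._accepted[source] += 1

    def reliability(self, source: str) -> int:
        # .get avoids creating an entry for a source that has never written.
        return self._accepted.get(source, 0)


tracker = ReliabilityTracker()
for _ in range(3):
    tracker.record_accepted_write("established_agent")

established = tracker.reliability("established_agent")
newcomer = tracker.reliability("new_agent")
```

The established source carries its history into every later conflict, while the newcomer starts from zero regardless of how good its first value is.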
The stale fact problem
This is not a bug in the traditional sense — the conflict detection worked as designed. But the design has a gap: it treats an update rejection as “fact unchanged” rather than “fact challenged.” For long-lived agents, that distinction matters. A fact that was challenged and survived arbitration is epistemically different from a fact that was never challenged.
Until a flagging or versioning mechanism exists, teams using Iranti for knowledge that changes over time should periodically rewrite facts from the original source with an elevated confidence score to force deterministic resolution, or use the same source identifier for all updates to avoid the established-source bias.
Honest limitations
Key findings
Full trial execution records, conflict resolution logs, and methodology notes are available in the benchmarking repository.