Benchmark B8

Multi-Agent Coordination
Two isolated agents. One shared KB. No messages.

Two agents coordinate without exchanging a single message. Agent Alpha writes structured decisions to a shared Iranti KB; Agent Beta retrieves them cold — no shared context, no direct link. This is the blackboard model for multi-agent systems.

Executed 2026-03-21 · 6 decisions · 2 agents · n=6, single session

Results at a glance

6/6: Decisions retrieved by Beta with exact fidelity
0%: Baseline for isolated agents without a shared store (definitional)
3: Storage properties automatically preserved (source, validFrom, fidelity)
Finding: Iranti served as a typed, asynchronous coordination channel between two isolated agents. Source attribution and timestamps were preserved by the storage layer without any application-level tagging.

What this measures

Multi-agent systems almost always coordinate through direct messaging: one agent calls another, passes a payload, waits for a response. This works at small scale. At larger scale it creates tight coupling — every agent needs to know every other agent's address, API shape, and availability.

The blackboard model breaks that coupling. Agents write to a shared knowledge store as a side-effect of their own work. Other agents read from that store when they need information. No agent has to know about any other agent — only about the KB schema.

B8 tests whether Iranti's KB can serve that coordination role reliably. Agent Alpha writes six structured decisions about a project called Blackbox. Agent Beta starts cold — no shared context — and retrieves all six. The KB is the only bridge between them.

The 0% baseline is definitional: two agents with no shared store cannot share information by construction. This is not an empirical measurement of a competing system.

The blackboard architecture

Alpha writes six decisions via iranti_write. Beta retrieves them cold via iranti_query. No message passes between agents — the KB is the only shared surface.
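
The write-then-cold-read flow can be sketched with an in-memory stand-in for the shared store. Everything here is an assumption made for illustration: InMemoryKB, Fact, run_alpha, and run_beta are hypothetical names, and the write and query methods do not reproduce the real signatures of iranti_write or iranti_query, which this page does not show. Only three of the six decisions are included to keep the sketch short.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional, Union

Value = Union[str, int]


@dataclass(frozen=True)
class Fact:
    key: str
    value: Value
    source: str           # label supplied by the writer
    valid_from: datetime  # stamped by the store, not by the caller


class InMemoryKB:
    """Hypothetical stand-in for a shared KB; not Iranti's real API."""

    def __init__(self) -> None:
        self._facts: Dict[str, Fact] = {}

    def write(self, key: str, value: Value, source: str) -> None:
        # The store records the timestamp so writers need no extra tagging code.
        self._facts[key] = Fact(key, value, source, datetime.now(timezone.utc))

    def query(self, key: str) -> Optional[Fact]:
        return self._facts.get(key)


# Three of the decisions from the benchmark, keyed under project/blackbox/.
DECISIONS: Dict[str, Value] = {
    "project/blackbox/primary_language": "Go",
    "project/blackbox/deployment_target": "Kubernetes on GCP",
    "project/blackbox/team_size": 4,
}


def run_alpha(kb: InMemoryKB) -> None:
    # Alpha writes as a side effect of its own work and never addresses Beta.
    for key, value in DECISIONS.items():
        kb.write(key, value, source="agent_alpha")


def run_beta(kb: InMemoryKB, keys: List[str]) -> Dict[str, Fact]:
    # Beta knows only the key schema, not the values Alpha chose.
    found: Dict[str, Fact] = {}
    for key in keys:
        fact = kb.query(key)
        if fact is not None:
            found[key] = fact
    return found


kb = InMemoryKB()
run_alpha(kb)
retrieved = run_beta(kb, list(DECISIONS))
```

The two agent functions share nothing except the kb object and the key names, which is the property the blackboard model relies on.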

[Diagram: Agent Alpha (iranti_write ×6) → write → Iranti KB (shared store) → read (cold) → Agent Beta (iranti_query ×6). No direct message passing between agents.]

The evidence — 6-fact fidelity grid

Six project decisions written by Agent Alpha and retrieved by Agent Beta. Every value matches exactly. Amber = what Alpha wrote. Teal = what Beta retrieved.

Key | Alpha wrote | Beta retrieved | Match
project/blackbox/architecture_decision | "microservices with event sourcing" | "microservices with event sourcing" | yes
project/blackbox/primary_language | "Go" | "Go" | yes
project/blackbox/deployment_target | "Kubernetes on GCP" | "Kubernetes on GCP" | yes
project/blackbox/team_size | 4 | 4 | yes
project/blackbox/estimated_completion | "Q3 2026" | "Q3 2026" | yes
project/blackbox/risk_level | "medium — third-party API dependency" | "medium — third-party API dependency" | yes
Total | 6 written | 6 retrieved | 6/6

All keys are under the project/blackbox/ namespace. Beta performed a cold query with no in-context knowledge of what Alpha had written.
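
A fidelity check of this kind can be expressed as a strict comparison between what was written and what came back. The sketch below is an assumption about how such a check might look, not the benchmark's actual harness; fidelity_report is a hypothetical helper. It treats a type change (for example the integer 4 coming back as the string "4") as a mismatch, since the store is described as typed.

```python
from __future__ import annotations


def fidelity_report(written: dict, retrieved: dict) -> tuple[int, int, list[str]]:
    """Return (matched, total, mismatched_keys) using exact, type-strict equality."""
    mismatches = [
        key
        for key, value in written.items()
        if key not in retrieved
        or type(retrieved[key]) is not type(value)
        or retrieved[key] != value
    ]
    return len(written) - len(mismatches), len(written), mismatches


written = {
    "project/blackbox/primary_language": "Go",
    "project/blackbox/deployment_target": "Kubernetes on GCP",
    "project/blackbox/team_size": 4,
}

# Case 1: the store returns every value unchanged.
exact = fidelity_report(written, dict(written))

# Case 2: a hypothetical store that stringifies numbers would fail the check.
drifted = fidelity_report(written, {**written, "project/blackbox/team_size": "4"})
```

With the inputs above, exact reports three of three matches and drifted flags only the team_size key.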

Properties confirmed

Three storage-layer guarantees were observed on every retrieved fact, automatically — no application code required.

Exact value fidelity
All 6 values match character-for-character
Source attribution auto-preserved
source=agent_alpha stored and returned without manual tagging
Timestamp available (validFrom)
Beta can determine decision recency from validFrom on every retrieved fact

Honest limitations

Limitation: Simulated, not truly isolated. Agent Alpha and Agent Beta were simulated within a single session. True multi-session, multi-process agent isolation, where Alpha runs in one process and Beta runs in a separate process with no shared memory, was not tested in this benchmark. That is the most meaningful real-world isolation condition and it remains unverified here.
Limitation: Small test set (n=6). Six decisions is enough to confirm the coordination channel works. It is not enough to characterize reliability at volume, under concurrent writes, or across large decision graphs.
Limitation: Source label, not session identity. source=agent_alpha is a label applied at write time, not a verified session identity. Both Alpha and Beta show as the same benchmark_program_main in iranti_who_knows. The attribution is useful but not cryptographically verified.
Note: Definitional baseline. The 0%/100% differential is not an A/B measurement; it is definitional. Two isolated agents with no shared store cannot share information. This benchmark proves Iranti enables the channel; it does not compare Iranti against a competing coordination mechanism.

Untested: conflict behavior

This benchmark did not test what happens when both agents write to the same key. That is the most significant gap for real multi-agent use.

In a production blackboard system, multiple agents may legitimately produce conflicting assessments of the same fact — different risk levels, different architecture preferences, different timeline estimates. How Iranti handles concurrent writes to the same key (last-write-wins, versioning, contested-fact flagging, or arbitration) determines whether it can serve as a reliable coordination substrate when agents genuinely disagree.

Gap — not tested in B8
  • Alpha and Beta both write to project/blackbox/risk_level — which value wins?
  • Is the conflict surfaced as a contested fact, silently overwritten, or both versions retained?
  • Can a third agent read both versions and reason about the disagreement?

This gap will be addressed in a future benchmark. Until then, B8's result should be read as: Iranti can serve as a one-writer / many-readers coordination channel. Concurrent-write semantics are unknown from this data.
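
To make the open question concrete, the sketch below contrasts two of the candidate policies using toy stores. It says nothing about what Iranti actually does, which B8 did not measure. LastWriteWinsStore, VersionedStore, and is_contested are hypothetical names, and Beta's "high" assessment is an invented value used only to create a disagreement.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Version = Tuple[str, str]  # (value, source)


class LastWriteWinsStore:
    """Toy policy: a later write silently replaces the earlier one."""

    def __init__(self) -> None:
        self._data: Dict[str, Version] = {}

    def write(self, key: str, value: str, source: str) -> None:
        self._data[key] = (value, source)

    def read(self, key: str) -> List[Version]:
        return [self._data[key]] if key in self._data else []


class VersionedStore:
    """Toy policy: every write is retained, in write order."""

    def __init__(self) -> None:
        self._data: Dict[str, List[Version]] = defaultdict(list)

    def write(self, key: str, value: str, source: str) -> None:
        self._data[key].append((value, source))

    def read(self, key: str) -> List[Version]:
        return list(self._data.get(key, []))


def is_contested(versions: List[Version]) -> bool:
    # A key is contested when the retained versions disagree on the value.
    return len({value for value, _ in versions}) > 1


KEY = "project/blackbox/risk_level"
lww, versioned = LastWriteWinsStore(), VersionedStore()
for store in (lww, versioned):
    store.write(KEY, "medium", "agent_alpha")
    store.write(KEY, "high", "agent_beta")  # invented conflicting assessment

lww_view = lww.read(KEY)
versioned_view = versioned.read(KEY)
```

Under the first policy a third agent sees only Beta's value and cannot tell that a disagreement occurred; under the second it sees both and can reason about them. Which of these, if either, matches Iranti is exactly what a follow-up benchmark would need to establish.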

Key findings

Finding: Asynchronous coordination channel works. Alpha wrote; Beta retrieved with 100% fidelity. No synchrony or direct communication was required between agents.
Finding: Source attribution is automatic. source=agent_alpha was preserved end-to-end by the storage layer. Application code did not need to thread attribution manually through retrieval.
Finding: Decision recency is queryable. validFrom timestamps were available on all six retrieved facts. A downstream agent can determine how old a decision is and weight it accordingly.
Finding: Typed coordination, not free-form messages. Because facts are stored under namespaced keys with typed values, Beta retrieved structured data, not a block of unstructured text to parse. The KB acts as a schema boundary between agents.
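
As one way a downstream agent might act on recency, the sketch below computes a decision's age from its validFrom value and flags it as stale past a threshold. It assumes validFrom arrives as an ISO 8601 string with an explicit UTC offset, which this page does not confirm; age_of and is_stale are hypothetical helpers, and the timestamps are invented.

```python
from datetime import datetime, timedelta, timezone


def age_of(valid_from_iso: str, now: datetime) -> timedelta:
    # Assumes an ISO 8601 string with an explicit offset, e.g. "+00:00".
    return now - datetime.fromisoformat(valid_from_iso)


def is_stale(valid_from_iso: str, now: datetime, max_age: timedelta) -> bool:
    return age_of(valid_from_iso, now) > max_age


# Invented timestamps for illustration only.
valid_from = "2026-01-01T00:00:00+00:00"
now = datetime(2026, 1, 31, tzinfo=timezone.utc)

age = age_of(valid_from, now)
stale_after_two_weeks = is_stale(valid_from, now, timedelta(days=14))
stale_after_two_months = is_stale(valid_from, now, timedelta(days=60))
```

Passing now explicitly keeps the check deterministic, which makes it easier to test than a helper that reads the clock internally.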

v0.2.16 Update: True AgentId Attribution Confirmed

The original B8 test used source=agent_alpha — a text label attached at write time, not a verified identity. In v0.2.16 we re-ran using the correct mechanism: the agent parameter that tells Iranti which agent identity is issuing the write. With that in place, iranti_who_knows returns b8_agent_alpha — not the session-level identity of whoever ran the benchmark.

Finding: True agentId attribution works. When you use the agent parameter correctly, iranti_who_knows returns the actual agent identity (b8_agent_alpha) rather than the session default. Attribution tracks which logical agent wrote each fact, not which process submitted it.
Note: The KB is globally shared, with no per-agent barriers. Agent Alpha can see what Agent Beta wrote, and vice versa. This is by design; the coordination pattern depends on it. If you need to keep agents' working notes private from each other, enforce that through naming conventions, not through any built-in agentId isolation mechanism, because none exists.
Limitation: Entity name normalization. Entity names with forward slashes get converted to underscores when stored. This is undocumented behavior. If you generate entity names programmatically, it's possible to create two names that look different but map to the same stored entry. Worth knowing if you're using path-style entity IDs.
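
The collision risk can be checked before writing. The sketch below assumes the only transformation is replacing each forward slash with an underscore, as observed; because the behavior is undocumented, the real rule may do more. normalize_entity and find_collisions are hypothetical helpers, and the entity names are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def normalize_entity(name: str) -> str:
    # Assumed rule, based only on the observed behavior: "/" becomes "_".
    return name.replace("/", "_")


def find_collisions(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group distinct input names that would map to the same stored name."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        stored = normalize_entity(name)
        if name not in groups[stored]:
            groups[stored].append(name)
    return {stored: originals for stored, originals in groups.items() if len(originals) > 1}


# Invented names: the first two differ only in where the slash sits.
candidates = ["project/blackbox_v2", "project_blackbox/v2", "project/atlas"]
collisions = find_collisions(candidates)
```

Here the first two candidates both normalize to project_blackbox_v2, so a generator that emits path-style IDs could run a check like this and reject or rename the second one.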

Raw data

Full trial execution records, agent logs, decision payloads, and methodology notes in the benchmarking repository.