C4 — Cross-Session Persistence

Kill the process. Does the memory survive?

Phase 1 writes 20 facts in the parent process. Phase 2 spawns a completely fresh Python subprocess with no shared state: no globals, no cached instances, no open connections. The subprocess must answer all 40 recall questions using only what was durably persisted.

Iranti: 100% (40/40) · PostgreSQL
Shodh: 100% (40/40) · Docker volume
Mem0: 75% (30/40) · Chroma (disk)
Graphiti: 57% (23/40) · Neo4j (Docker)
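
As a quick check on the headline numbers, the percentages can be recomputed from the hit counts. This is a minimal sketch that assumes whole-number percentages are truncated; that is an inference from 23/40 (57.5%) being reported as 57%, not something the benchmark states. The `percent` helper is hypothetical.

```python
def percent(hits, total):
    """Whole-number percentage, truncated toward zero (assumed convention)."""
    return hits * 100 // total


# Hit counts as reported in the score cards above.
scores = {
    "Iranti": (40, 40),
    "Shodh": (40, 40),
    "Mem0": (30, 40),
    "Graphiti": (23, 40),
}
percents = {name: percent(hits, total) for name, (hits, total) in scores.items()}
print(percents)
```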

Subprocess isolation

C1 and C2 run write and recall in the same Python process. Singleton instances, cached SDK clients, and open database connections persist in memory between write and recall. This is realistic for same-session use, but it does not test true durability.
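
To illustrate why same-process recall can overstate durability, here is a minimal sketch. The `CachedStore` class and its methods are hypothetical and are not taken from any of the benchmarked SDKs. One write is deliberately never persisted, yet in-process recall still succeeds because the in-memory cache answers first.

```python
import json
import os
import tempfile


class CachedStore:
    """Hypothetical memory client: an in-memory cache in front of a JSON file."""

    def __init__(self, path):
        self.path = path
        self._cache = {}  # lives only as long as this object (and process)

    def write(self, fact_id, text, persist=True):
        self._cache[fact_id] = text
        if persist:
            data = self._load()
            data[fact_id] = text
            with open(self.path, "w", encoding="utf-8") as fh:
                json.dump(data, fh)

    def recall(self, fact_id):
        # The cache is consulted first, so a missing disk write goes unnoticed.
        if fact_id in self._cache:
            return self._cache[fact_id]
        return self._load().get(fact_id)

    def _load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path, encoding="utf-8") as fh:
            return json.load(fh)


path = os.path.join(tempfile.mkdtemp(), "facts.json")
store = CachedStore(path)
store.write("f1", "durable fact")
store.write("f2", "cache-only fact", persist=False)  # simulated lost write

same_session = [store.recall("f1"), store.recall("f2")]

# A new instance stands in for a restarted process: its cache starts empty.
fresh = CachedStore(path)
after_restart = [fresh.recall("f1"), fresh.recall("f2")]
print(same_session, after_restart)
```

In this toy setup the same-session pass recalls both facts, while the fresh instance recalls only the one that reached disk. That gap is what a cross-process test is designed to expose.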

C4 uses subprocess.run([sys.executable, '-c', script]) to spawn a completely independent Python interpreter. All config (API keys, URLs, fact IDs) is baked into the recall script string via string formatting — the subprocess has no access to the parent's memory.

This pattern is the closest available simulation of a restart scenario: the original session ends, a new session begins, and recall must succeed based solely on what was written to durable storage during the first session.

Subprocess pattern

import subprocess
import sys

# Phase 1: the parent process writes each fact to all four systems
iranti_write(fact_id, text)
shodh_write(fact_id, text)
mem0_write(fact_id, text)
await graphiti_write(fact_id, text)  # async call; must run inside a coroutine

# Phase 2: fresh subprocess, no shared state
# !r inserts a quoted, escaped Python literal into the script text
RECALL_SCRIPT = f"""
iranti_url = {iranti_url!r}
# ... all other config baked in
result = iranti_recall(fact_id, query)
"""
completed = subprocess.run(
    [sys.executable, '-c', RECALL_SCRIPT],
    capture_output=True, text=True,
)
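
The pattern above is abbreviated, so here is a self-contained sketch of the same two-phase flow that can be run as-is. It swaps in SQLite from the standard library as a stand-in durable store; the table name, fact text, and variable names are illustrative assumptions, not the benchmark's actual harness.

```python
import os
import sqlite3
import subprocess
import sys
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "facts.db")
fact_id = "f1"

# Phase 1: the parent process writes and commits to durable storage.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE facts (fact_id TEXT PRIMARY KEY, text TEXT)")
conn.execute(
    "INSERT INTO facts VALUES (?, ?)", (fact_id, "The launch code name is Heron")
)
conn.commit()
conn.close()

# Phase 2: config is baked into the script text. !r emits quoted, escaped
# literals, so the child needs nothing from the parent's memory.
RECALL_SCRIPT = f"""
import sqlite3
conn = sqlite3.connect({db_path!r})
row = conn.execute(
    "SELECT text FROM facts WHERE fact_id = ?", ({fact_id!r},)
).fetchone()
conn.close()
print(row[0] if row else "")
"""

result = subprocess.run(
    [sys.executable, "-c", RECALL_SCRIPT],
    capture_output=True, text=True, check=True,
)
recalled = result.stdout.strip()
print(recalled)
```

Passing results back over stdout keeps the child fully independent: the parent only parses text and never shares objects with the recall process.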

Storage layers

| System | Storage | Same-session | Cross-session | Delta |
|---|---|---|---|---|
| Iranti | PostgreSQL (managed instance, persistent volume) | 100% | 100% | 0pp |
| Shodh | Docker volume (file-backed, bind-mounted) | 100% | 100% | 0pp |
| Mem0 | Chroma (disk) (local persistent directory) | 80% | 75% | -5pp |
| Graphiti | Neo4j (Docker) (graph DB, Docker container) | 57% | 57% | 0pp |

Same-session vs cross-session

C1 (same-session recall)
Iranti: 100%
Shodh: 100%
Mem0: 80%
Graphiti: 57%

C4 (cross-session recall, fresh subprocess)
Iranti: 100%
Shodh: 100%
Mem0: 75% (-5pp)
Graphiti: 57%
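
The delta is simply the cross-session score minus the same-session score, in percentage points. A short sketch using the scores reported above (the dictionary names are illustrative):

```python
# Scores as reported for C1 (same-session) and C4 (cross-session), in percent.
same_session = {"Iranti": 100, "Shodh": 100, "Mem0": 80, "Graphiti": 57}
cross_session = {"Iranti": 100, "Shodh": 100, "Mem0": 75, "Graphiti": 57}

# Negative values mean recall got worse after the process boundary.
deltas = {name: cross_session[name] - same_session[name] for name in same_session}

for name, delta in deltas.items():
    print(f"{name:<9} {same_session[name]:>3}% -> {cross_session[name]:>3}%  ({delta:+d}pp)")
```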

What the delta tells us

Iranti + Shodh: zero delta

Both systems maintain 100% recall in the fresh subprocess. Iranti uses PostgreSQL — writes are transactional and immediately consistent. Shodh uses a Docker volume with file-backed storage — writes are flushed before the parent process exits. Neither system relies on in-memory state for recall accuracy.

Mem0: −5pp cross-session

Mem0 drops from 80% (same-session) to 75% (cross-session). Chroma is a disk-backed vector store and is inherently persistent. The 5-point drop likely reflects minor indexing inconsistency across cold starts — the same-session run had a warm collection index while the subprocess opened it cold. Not a fundamental persistence failure, but a meaningful consistency gap under repeated process boundaries.
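
One way to test the cold-start hypothesis would be to repeat the cross-session recall several times, each in a fresh interpreter, and look at the spread of hit counts. The sketch below shows only the harness shape. `RECALL_SCRIPT` is a stand-in that prints a fixed number, whereas a real script would query the memory system and print its hits out of 40. The helper name is hypothetical.

```python
import statistics
import subprocess
import sys

# Stand-in recall script (assumption): a real one would run the 40 queries.
RECALL_SCRIPT = "print(30)"


def cold_recall_hits(script):
    """Run the recall script in a fresh interpreter and parse the hit count."""
    result = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


# Each call is an independent cold start with no warm index or cached client.
runs = [cold_recall_hits(RECALL_SCRIPT) for _ in range(3)]
spread = max(runs) - min(runs)
mean_hits = statistics.mean(runs)
print(runs, spread, mean_hits)
```

A spread comparable to the 2-question gap (30/40 versus 80% of 40) would point to run-to-run variance; a stable 30 across cold starts would suggest a systematic cold-read effect instead.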

Graphiti: zero delta

Graphiti's cross-session score (57%) exactly matches its same-session score. Neo4j is a fully durable graph database — there is no persistence regression. The 57% ceiling is entirely explained by entity extraction quality: facts that were not retrievable in-process are also not retrievable cross-process, because the content was lost during ingestion, not during persistence.

What C4 actually tests

C4 separates persistence from retrieval quality. All four systems persist data durably; none rely purely on in-memory state. The cross-session scores closely track the same-session scores, so the gaps between systems are not evidence of persistence failure. They reflect retrieval quality constraints that also appear in-process. Only Mem0 shows any delta at all (5pp), and that is small next to its 20pp same-session shortfall. C4 confirms that for Mem0 and Graphiti the limiting factor is retrieval, not storage.
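
The separation described above can be made concrete by comparing per-question outcomes from the two runs. This is a sketch under the assumption that C1 and C4 can be scored on the same question IDs; the `classify` helper, the labels, and the sample results are all illustrative, not benchmark data.

```python
from collections import Counter


def classify(same_session_hit, cross_session_hit):
    """Label one question by where, if anywhere, recall failed."""
    if same_session_hit and cross_session_hit:
        return "ok"
    if same_session_hit and not cross_session_hit:
        # Recalled in-process but not after restart: a cross-session regression.
        return "cross_session_regression"
    if not same_session_hit and not cross_session_hit:
        # Never retrievable: lost at ingestion or retrieval, not at persistence.
        return "retrieval_miss"
    return "cross_session_only"  # missed in-process, hit after restart


# Made-up sample: question id -> (same-session hit, cross-session hit).
results = {
    "q1": (True, True),
    "q2": (True, False),
    "q3": (False, False),
    "q4": (False, False),
}
summary = Counter(classify(same, cross) for same, cross in results.values())
print(dict(summary))
```

Under this labeling, a system like Graphiti with identical scores in both runs would be expected to show mostly `ok` and `retrieval_miss`, while any `cross_session_regression` entries are the ones worth investigating as cross-process effects.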

Key findings

01. All four systems persist across process restart; none rely on in-memory state alone.

02. Iranti (PostgreSQL) and Shodh (Docker volume) are fully consistent: 100% recall in the fresh process.

03. Mem0 drops from 80% (same-session, C1) to 75% cross-session. The 5pp gap most likely reflects Chroma read-side variance when the collection is opened cold in the fresh process, not lost writes.

04. Graphiti's cross-session score (57%) mirrors its same-session score: consistent, but bounded by entity extraction quality.