Kill the process. Does the memory survive?
Phase 1 writes 20 facts in the parent process. Phase 2 spawns a completely fresh Python subprocess with no shared state: no globals, no cached instances, no open connections. The subprocess must answer all 40 recall questions using only what was durably persisted.
Subprocess isolation
C1 and C2 run write and recall in the same Python process. Singleton instances, cached SDK clients, and open database connections persist in memory between write and recall. This is realistic for same-session use, but it does not test true durability.
C4 uses subprocess.run([sys.executable, '-c', script]) to spawn a completely independent Python interpreter. All config (API keys, URLs, fact IDs) is baked into the recall script string via string formatting — the subprocess has no access to the parent's memory.
This pattern is the closest available simulation of a restart scenario: the original session ends, a new session begins, and recall must succeed based solely on what was written to durable storage during the first session.
Subprocess pattern
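A minimal sketch of the pattern described above, not the benchmark's actual harness. It assumes a SQLite file as a hypothetical stand-in for a memory system's durable store; the fact IDs, table name, and `RECALL_TEMPLATE` are illustrative. The parent writes and closes its connection, then formats all config into a script string and runs it in a fresh interpreter.

```python
import json
import os
import sqlite3
import subprocess
import sys
import tempfile

# Hypothetical stand-in for a memory system's durable store: a SQLite file.
db_path = os.path.join(tempfile.mkdtemp(), "facts.db")

# Phase 1: write facts in the parent process, then commit and close so
# nothing is left in an open connection.
facts = {f"fact/{i}": f"value-{i}" for i in range(20)}
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE facts (id TEXT PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO facts VALUES (?, ?)", facts.items())
conn.commit()
conn.close()

# Phase 2: the recall script is a plain string. Config is baked in with
# !r so the path and IDs become valid Python literals in the child.
RECALL_TEMPLATE = """
import json, sqlite3
conn = sqlite3.connect({db_path!r})
ids = {fact_ids!r}
rows = dict(conn.execute("SELECT id, value FROM facts"))
print(json.dumps({{i: rows.get(i) for i in ids}}))
"""
script = RECALL_TEMPLATE.format(db_path=db_path, fact_ids=sorted(facts))

# A fresh interpreter: it shares no globals or connections with the parent.
result = subprocess.run(
    [sys.executable, "-c", script],
    capture_output=True, text=True, check=True,
)
recalled = json.loads(result.stdout)
hits = sum(recalled[k] == v for k, v in facts.items())
print(f"cross-session recall: {hits}/{len(facts)}")
```

Passing results back as JSON on stdout keeps the child fully decoupled from the parent: the only shared channel is the durable store itself.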
Storage layers
| System | Storage | Details | Same-session | Cross-session | Delta |
|---|---|---|---|---|---|
| Iranti | PostgreSQL | Managed instance, persistent volume | 100% | 100% | 0 pp |
| Shodh | Docker volume | File-backed, bind-mounted | 100% | 100% | 0 pp |
| Mem0 | Chroma (disk) | Local persistent directory | 80% | 75% | -5 pp |
| Graphiti | Neo4j (Docker) | Graph DB, Docker container | 57% | 57% | 0 pp |
Same-session vs cross-session
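The delta column is a difference in percentage points (cross-session minus same-session), not a relative change. A short sketch of that arithmetic, using the scores from the table; the `delta_pp` helper is illustrative:

```python
# Scores copied from the table, in percent: (same-session, cross-session).
scores = {
    "Iranti": (100, 100),
    "Shodh": (100, 100),
    "Mem0": (80, 75),
    "Graphiti": (57, 57),
}

def delta_pp(same: int, cross: int) -> int:
    """Cross-session minus same-session, in percentage points."""
    return cross - same

deltas = {name: delta_pp(*pair) for name, pair in scores.items()}
for name, d in deltas.items():
    label = "no change" if d == 0 else f"{d:+d} pp"
    print(f"{name:<9} {label}")
```

For Mem0, a 5-point drop from 80% corresponds to a relative drop of 5/80 = 6.25%.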
What the delta tells us
Iranti and Shodh both maintain 100% recall in the fresh subprocess. Iranti uses PostgreSQL, so writes are transactional and immediately consistent. Shodh uses a Docker volume with file-backed storage, and writes are flushed before the parent process exits. Neither system relies on in-memory state for recall accuracy.
Mem0 drops from 80% (same-session) to 75% (cross-session). Chroma is a disk-backed vector store and is inherently persistent. The 5-point drop likely reflects minor indexing inconsistency across cold starts: the same-session run had a warm collection index, while the subprocess opened it cold. This is not a fundamental persistence failure, but it is a measurable consistency gap across process boundaries.
Graphiti's cross-session score (57%) exactly matches its same-session score. Neo4j is a fully durable graph database — there is no persistence regression. The 57% ceiling is entirely explained by entity extraction quality: facts that were not retrievable in-process are also not retrievable cross-process, because the content was lost during ingestion, not during persistence.
C4 separates persistence from retrieval quality. All four systems persist data durably — none rely purely on in-memory state. The cross-session score differences are not evidence of persistence failure; they are evidence of retrieval quality constraints that also manifest in-process. C4 confirms: for Mem0 and Graphiti, the limiting factor is retrieval, not storage.
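One way to make this separation concrete is to compare per-question outcomes from both runs. The sketch below is illustrative only: the write-up reports aggregate percentages, so the per-question data, the `attribute` helper, and the category names are all assumptions.

```python
from collections import Counter

def attribute(same_ok: bool, cross_ok: bool) -> str:
    """Classify one question by its outcome in each run."""
    if same_ok and cross_ok:
        return "recalled"
    if same_ok and not cross_ok:
        return "lost across restart"   # persistence or cold-start issue
    if not same_ok and not cross_ok:
        return "never retrievable"     # ingestion or retrieval issue
    return "cross-session only"        # run-to-run variance

def summarize(same: dict, cross: dict) -> Counter:
    """Count questions in each category; both dicts share the same keys."""
    return Counter(attribute(same[q], cross[q]) for q in same)

# Illustrative outcomes, not benchmark output.
same = {"q1": True, "q2": True, "q3": False, "q4": True}
cross = {"q1": True, "q2": False, "q3": False, "q4": True}
summary = summarize(same, cross)
print(dict(summary))
```

Under this scheme, a system whose failures all fall in "never retrievable" is limited by retrieval or ingestion, while failures in "lost across restart" point at the process boundary.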
Key findings
All four systems persist across process restart — none rely on in-memory state alone.
Iranti (PostgreSQL) and Shodh (Docker volume) are fully consistent: 100% recall in the fresh process.
Mem0 drops from 80% (same-session, C1) to 75% cross-session. The likely cause is Chroma read variance (a cold index, repeated parallel queries) rather than lost data.
Graphiti's cross-session score (57%) mirrors its same-session score — consistent but bounded by entity extraction quality.