TRW
LEDGER // MEMORY
LOCAL_FIRST · 13500+ TESTS · BSL 1.1 → APACHE 2030

Memory that compounds across every session

TRW Memory stores what your agent learns, scores it against real outcomes, and surfaces it proactively. The gotcha your agent fixed in session 12 shows up again — already solved — when session 50 starts.

KNOWLEDGE GRAPH

[Interactive demo: an 11-node graph tagged security and deployment, connected by edge types similarity, evidence_for, same_root_cause, co_anchored, and depends_on; sample recall query: "JWT auth security patterns"]
LEDGER // STORAGE

Memory that survives compaction

Learnings are stored in local SQLite with YAML backup — one file per entry, git-friendly, no server required. Data lives at .trw/ inside your repo and travels with it.

Context compaction is the number-one AI memory pain point: the agent forgets everything when the window fills. TRW Memory sidesteps this entirely. trw_session_start() fires before any work begins and re-injects the top-N most relevant learnings, regardless of how many times the context has compacted.

Per-entry YAML (human-readable backup)
id: M-a3f2c1b9
summary: JWT refresh tokens need rotation on every use
tags: [security, auth]
importance: 0.93
q_value: 0.87
created_at: 2026-04-01T09:12:43Z
outcome_history:
  - "tests_passed:delta=+0.08:session=249"

492 learnings in this repo · 249 delivered sessions

TRACE // RETRIEVAL

Not just vectors. A scoring engine.

Retrieval combines keyword matching with dense vector similarity, and the fused ranking is adjusted by two forces. Learnings recalled often stay visible; learnings left unused quietly fade. Learnings that preceded a passing build get higher utility; learnings that preceded a failure get penalised.

The recency term is an Ebbinghaus forgetting curve, and the outcome-weighted utility is a Q-learning update. Both run at query time against the stored scores — neither mutates the underlying entries.

UTILITY FORMULA

utility = effective_q × retention
          + access_boost

retention = recurrence_strength
          × exp(−(ln2 / 14d) × days_unused)

combined = 0.6 × relevance
         + 0.4 × utility

Decay is applied at query time, not mutated in storage. Entries fade unless reinforced by recall.
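
The formula above translates directly to code. A minimal sketch, using only the quantities defined in the formula (the 14-day half-life and 0.6/0.4 weights come from the text; input values are illustrative):

```python
# Query-time scoring sketch: nothing here writes back to storage.
import math

HALF_LIFE_DAYS = 14.0  # Ebbinghaus-style decay half-life from the formula

def retention(recurrence_strength: float, days_unused: float) -> float:
    # Exponential forgetting curve: strength halves every 14 unused days.
    return recurrence_strength * math.exp(-(math.log(2) / HALF_LIFE_DAYS) * days_unused)

def utility(effective_q: float, recurrence_strength: float,
            days_unused: float, access_boost: float) -> float:
    # Outcome-weighted utility, boosted by recent access.
    return effective_q * retention(recurrence_strength, days_unused) + access_boost

def combined(relevance: float, util: float) -> float:
    # Final rank score: 60% retrieval relevance, 40% utility.
    return 0.6 * relevance + 0.4 * util

# An entry unused for exactly one half-life retains half its strength.
print(round(retention(1.0, 14.0), 3))  # 0.5
```

An entry that keeps getting recalled resets `days_unused` and so holds its retention near `recurrence_strength`; one that sits idle drifts toward zero without its stored fields ever changing.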

HYBRID RETRIEVAL PIPELINE

1. BM25 (keyword) — BM25Okapi sparse retrieval: exact match, token overlap
2. Dense (vector) — cosine similarity via sqlite-vec: conceptual similarity
3. RRF fusion (k=60) — Reciprocal Rank Fusion: entries in both lists score higher
4. Utility re-rank — combined relevance + utility score applied last

Graceful degradation: if embeddings are unavailable, BM25 runs alone; if BM25 is unavailable, dense search runs alone. Only when neither is available does recall return an empty list.

LATTICE // GRAPH

A typed knowledge graph, not a flat index

Each learning becomes a node. Relationships between learnings are typed edges. At recall time, BFS traversal from the matched root surfaces related entries even if they share no keywords with the query — exactly what the hero demo is showing.

similarity — cosine similarity above a 0.75 threshold
tag_cooccurrence — two or more shared tags, Jaccard-weighted
consolidation — episodic-to-semantic merge lineage
anchored_to — entry bound to a specific source file
related_to — general-purpose associative edge
same_root_cause — two entries trace to one failure mode
depends_on — entry A is only valid when B holds
produced — agent action that generated the entry
motivated_by — entry created in response to another
co_anchored — both entries anchor to the same file
supersedes — newer entry replaces or refines an older one
evidence_for — entry is empirical support for another
conflicts_with — contradictory entries; the lower-importance one is suppressed at recall

BFS traversal depth is capped at 3. Cross-edge propagation rates vary by type: evidence_for propagates impact at 30%, co_anchored at 20%, same_root_cause at 15%.
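
A minimal sketch of that traversal, assuming impact propagates multiplicatively along edges. The three rates quoted above are from the text; the graph shape and the fallback rate for other edge types are illustrative assumptions:

```python
# Depth-capped BFS over typed edges with per-type propagation rates.
from collections import deque

PROPAGATION = {
    "evidence_for": 0.30,
    "co_anchored": 0.20,
    "same_root_cause": 0.15,
    # rates for the remaining edge types are not quoted; 0.1 is a
    # placeholder default, not the real value
}
DEFAULT_RATE = 0.1

# adjacency list: node -> [(neighbour, edge_type)] — toy graph for illustration
GRAPH = {
    "M-1": [("M-2", "evidence_for"), ("M-3", "co_anchored")],
    "M-2": [("M-4", "same_root_cause")],
    "M-3": [],
    "M-4": [],
}

def traverse(root: str, max_depth: int = 3) -> dict[str, float]:
    # Impact decays multiplicatively along each hop; depth is capped at 3.
    impact = {root: 1.0}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for neighbour, edge_type in GRAPH.get(node, []):
            if neighbour not in impact:
                impact[neighbour] = impact[node] * PROPAGATION.get(edge_type, DEFAULT_RATE)
                queue.append((neighbour, depth + 1))
    return impact

print(traverse("M-1"))  # M-2 gets 0.3, M-3 gets 0.2, M-4 gets 0.3 * 0.15
```

This is how an entry with no keyword overlap with the query can still surface: it only needs a typed path, within three hops, to an entry that matched.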

TRACE // SESSION_START

Proactive recall, not reactive retrieval

Most memory systems wait to be asked. TRW Memory runs before you ask.

trw_session_start() fires as the absolute first action of every session. It queries the memory store against the current project context, applies the hybrid retrieval + utility scoring pipeline, and injects the top-N most relevant learnings into the agent's context window before any work begins.

The agent starts session 50 knowing what it learned in sessions 1–49. It does not need to rediscover solved problems.

Deposits happen through trw_learn(). Ad-hoc lookups happen through trw_recall().
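
The session-start flow can be sketched as follows. The tool name `trw_session_start` is real; the in-memory store, the scoring, and the return-the-summaries shape below are stand-ins for illustration, not the actual implementation:

```python
# Illustrative session-start flow: rank stored learnings and surface the
# top N for injection into the agent's context before any work begins.
STORE = [
    # first two summaries are from the trace on this page; scores are the
    # combined relevance + utility values shown there
    {"summary": "JWT rotation required on every use", "combined": 0.91},
    {"summary": "Env vars must not have prod defaults", "combined": 0.94},
    # third entry is a made-up example
    {"summary": "Use UTC everywhere in CI timestamps", "combined": 0.40},
]

def trw_session_start(top_n: int = 8) -> list[str]:
    # Returns the summaries it would inject, highest combined score first.
    ranked = sorted(STORE, key=lambda e: e["combined"], reverse=True)
    return [e["summary"] for e in ranked[:top_n]]

print(trw_session_start(top_n=2))
```

Because this runs before the first user turn, the agent never has to ask for its own history.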

ILLUSTRATIVE TRACE — session start (example, not measured telemetry)

01  trw_session_start() called
02  Load learnings index
03  BM25 keyword scan — 31 candidates
04  Dense vector search — 28 candidates
05  RRF fusion — 44 unique entries ranked
06  Utility re-rank applied
07  Inject top 8 learnings into context
    "JWT rotation required on every use" (0.91)
    "Env vars must not have prod defaults" (0.94)
    +6 more …

Order is real; timings are omitted deliberately. Run-to-run latency varies by index size and whether embeddings are enabled.

LEDGER // SPEC

Technical specification

Storage — SQLite (primary) + YAML backup · WAL mode · sqlite-vec vectors
Retrieval — BM25Okapi + dense cosine · RRF k=60 · graceful degradation
Scoring — Q-value EMA · Ebbinghaus decay · Bayesian MACLA calibration
Graph — 13 typed edge types · BFS depth 3 · Jaccard tag co-occurrence
Lifecycle — hot/warm/cold tiers · semantic dedup (0.85 cosine) · sweep-based purge
Security — AES-256-GCM · PII detection · z-score poisoning defense · RBAC
Dedup — cosine similarity 0.85 threshold · merge-or-skip decision at write
Integrations — LangChain · LlamaIndex · CrewAI · OpenAI-compatible adapter
Test coverage — 13500+ tests across all packages · mypy strict clean
License — BSL 1.1 · converts to Apache 2.0 on 2030-03-21

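
The write-time dedup decision is a plain cosine-similarity gate. A minimal sketch, with toy 3-d vectors standing in for real embeddings (the 0.85 threshold is from the spec; everything else is illustrative):

```python
# Write-time dedup gate: a new entry whose embedding is within 0.85 cosine
# similarity of an existing one triggers merge-or-skip instead of insert.
import math

DEDUP_THRESHOLD = 0.85

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def write_decision(new_vec: list[float], existing_vecs: list[list[float]]) -> str:
    # Compare against every stored embedding; one near-duplicate is enough.
    best = max((cosine(new_vec, v) for v in existing_vecs), default=0.0)
    return "merge_or_skip" if best >= DEDUP_THRESHOLD else "insert"

print(write_decision([1.0, 0.0, 0.1], [[1.0, 0.0, 0.0]]))  # near-duplicate
```

Gating at write keeps the graph from accumulating near-identical nodes that would otherwise split recall scores between them.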
LEDGER // CONSTRAINTS

What memory does not do

Accurate scoping saves debugging time later.

No pre-populated graph

The knowledge graph builds entirely from entries you deposit. A fresh install has no edges and no semantic clusters — those emerge over sessions.

Q-values need outcome feedback to matter

The scoring system computes meaningful signal only after you deliver sessions with test and build outcomes. Until then, scores reflect impact estimates, not observed utility.

Embeddings are optional

Dense vector search requires the [embeddings] and [vectors] extras. Without them, retrieval falls back to BM25 keyword matching only — still useful, not identical.

Memory does not self-correct wrong entries

If an agent stores an incorrect learning, it persists until it is explicitly forgotten or superseded. Memory is as reliable as the agents depositing into it.

Not a replacement for checkpoints

Memory stores learnings — patterns, gotchas, decisions. Checkpoints save resumable execution state. They are complementary, not interchangeable.

LEDGER // FAQ

Common questions

What is the difference between TRW Memory and Claude Code's built-in memory?

Claude Code's built-in memory is a 200-line personal index, search is filename-scan only, and scope is limited to one primary session. TRW Memory uses hybrid BM25 + vector retrieval, is visible to all agents and subagents, scores entries by measured outcomes, and scales across hundreds of sessions without hitting a cap.

Does TRW Memory send code or project data to a server?

No. By default all data stays in SQLite and YAML files inside your repo at `.trw/`. Remote sync is an optional feature you must explicitly configure. The default path has no outbound network calls.

What happens to recalled learnings when context compacts?

Context compaction is exactly what Memory is designed to survive. `trw_session_start()` fires at the top of every session — including sessions that resume after compaction — and re-injects the top-N most relevant learnings before any work begins. The knowledge doesn't live in the context window; it lives in SQLite.

How long before Memory starts being useful in a new project?

Learnings from the very first delivered session are available at session two's start. Most teams notice meaningful recall quality after five to ten sessions in a project. Cross-validated signal (entries confirmed by multiple projects) builds more slowly — typically 20+ delivered sessions across at least two project namespaces.

TERMINAL // INSTALL

Give your agents persistent memory

Install takes under 2 minutes. Memory starts accumulating from session one. The full behaviour, tool signatures, and storage layout are in the memory spec.

TERMINAL — ~/repo

# install

$ curl -fsSL https://trwframework.com/install.sh | bash