TRW
LEDGER // MEMORY
LOCAL_FIRST · 13500+ TESTS · BSL 1.1 → APACHE 2030

Memory that compounds across every session

TRW Memory stores what your agent learns, scores it against real outcomes, and surfaces it proactively. The gotcha your agent fixed in session 12 shows up again — already solved — when session 50 starts.

KNOWLEDGE GRAPH

[Interactive demo: an 11-node graph tagged security and deployment, connected by edge types similarity, evidence_for, same_root_cause, co_anchored, and depends_on; sample recall query: "JWT auth security patterns"]
LEDGER // STORAGE

Memory that survives compaction

Learnings are stored in local SQLite with YAML backup — one file per entry, git-friendly, no server required. Data lives at .trw/ inside your repo and travels with it.

Context compaction is the number-one AI memory pain point: the agent forgets everything when the window fills. TRW Memory sidesteps this entirely. trw_session_start() fires before any work begins and re-injects the top-N most relevant learnings, regardless of how many times the context has compacted.

Per-entry YAML (human-readable backup)
id: M-a3f2c1b9
summary: JWT refresh tokens need rotation on every use
tags: [security, auth]
importance: 0.93
q_value: 0.87
created_at: 2026-04-01T09:12:43Z
outcome_history:
  - "tests_passed:delta=+0.08:session=249"

492 learnings in this repo · 249 delivered sessions

TRACE // RETRIEVAL

Not just vectors. A scoring engine.

Retrieval combines keyword matching with dense vector similarity, and the fused ranking is adjusted by two forces. Learnings recalled often stay visible; learnings left unused quietly fade. Learnings that preceded a passing build get higher utility; learnings that preceded a failure get penalised.

The recency term is an Ebbinghaus forgetting curve, and the outcome-weighted utility is a Q-learning update. Both run at query time against the stored scores — neither mutates the underlying entries.

UTILITY FORMULA

utility = effective_q × retention
          + access_boost

retention = recurrence_strength
          × exp(−(ln2 / 14d) × days_unused)

combined = 0.6 × relevance
         + 0.4 × utility

Decay is applied at query time, not mutated in storage. Entries fade unless reinforced by recall.
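
The formula above translates directly to code. A minimal sketch, using only the quantities defined in the formula (the 14-day half-life and 0.6/0.4 weights come from the text; input values are illustrative):

```python
# Query-time scoring sketch: nothing here writes back to storage.
import math

HALF_LIFE_DAYS = 14.0  # Ebbinghaus-style decay half-life from the formula

def retention(recurrence_strength: float, days_unused: float) -> float:
    # Exponential forgetting curve: strength halves every 14 unused days.
    return recurrence_strength * math.exp(-(math.log(2) / HALF_LIFE_DAYS) * days_unused)

def utility(effective_q: float, recurrence_strength: float,
            days_unused: float, access_boost: float) -> float:
    # Outcome-weighted utility, boosted by recent access.
    return effective_q * retention(recurrence_strength, days_unused) + access_boost

def combined(relevance: float, util: float) -> float:
    # Final rank score: 60% retrieval relevance, 40% utility.
    return 0.6 * relevance + 0.4 * util

# An entry unused for exactly one half-life retains half its strength.
print(round(retention(1.0, 14.0), 3))  # 0.5
```

An entry that keeps getting recalled resets `days_unused` and so holds its retention near `recurrence_strength`; one that sits idle drifts toward zero without its stored fields ever changing.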

HYBRID RETRIEVAL PIPELINE

1. BM25 (keyword) — BM25Okapi sparse retrieval: exact match, token overlap
2. Dense (vector) — cosine similarity via sqlite-vec: conceptual similarity
3. RRF fusion (k=60) — Reciprocal Rank Fusion: entries in both lists score higher
4. Utility re-rank — combined relevance + utility score applied last

Graceful degradation: if embeddings are unavailable, BM25 runs alone; if BM25 is unavailable, dense search runs alone. Only when neither is available does recall return an empty list.

LATTICE // GRAPH

A typed knowledge graph, not a flat index

Each learning becomes a node. Relationships between learnings are typed edges. At recall time, BFS traversal from the matched root surfaces related entries even if they share no keywords with the query — exactly what the hero demo is showing.

similarity — cosine similarity above a 0.75 threshold
tag_cooccurrence — two or more shared tags, Jaccard-weighted
consolidation — episodic-to-semantic merge lineage
anchored_to — entry bound to a specific source file
related_to — general-purpose associative edge
same_root_cause — two entries trace to one failure mode
depends_on — entry A is only valid when B holds
produced — agent action that generated the entry
motivated_by — entry created in response to another
co_anchored — both entries anchor to the same file
supersedes — newer entry replaces or refines an older one
evidence_for — entry is empirical support for another
conflicts_with — contradictory entries; the lower-importance one is suppressed at recall

BFS traversal depth is capped at 3. Cross-edge propagation rates vary by type: evidence_for propagates impact at 30%, co_anchored at 20%, same_root_cause at 15%.
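
A minimal sketch of that traversal, assuming impact propagates multiplicatively along edges. The three rates quoted above are from the text; the graph shape and the fallback rate for other edge types are illustrative assumptions:

```python
# Depth-capped BFS over typed edges with per-type propagation rates.
from collections import deque

PROPAGATION = {
    "evidence_for": 0.30,
    "co_anchored": 0.20,
    "same_root_cause": 0.15,
    # rates for the remaining edge types are not quoted; 0.1 is a
    # placeholder default, not the real value
}
DEFAULT_RATE = 0.1

# adjacency list: node -> [(neighbour, edge_type)] — toy graph for illustration
GRAPH = {
    "M-1": [("M-2", "evidence_for"), ("M-3", "co_anchored")],
    "M-2": [("M-4", "same_root_cause")],
    "M-3": [],
    "M-4": [],
}

def traverse(root: str, max_depth: int = 3) -> dict[str, float]:
    # Impact decays multiplicatively along each hop; depth is capped at 3.
    impact = {root: 1.0}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue
        for neighbour, edge_type in GRAPH.get(node, []):
            if neighbour not in impact:
                impact[neighbour] = impact[node] * PROPAGATION.get(edge_type, DEFAULT_RATE)
                queue.append((neighbour, depth + 1))
    return impact

print(traverse("M-1"))  # M-2 gets 0.3, M-3 gets 0.2, M-4 gets 0.3 * 0.15
```

This is how an entry with no keyword overlap with the query can still surface: it only needs a typed path, within three hops, to an entry that matched.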

TRACE // SESSION_START

Proactive recall, not reactive retrieval

Most memory systems wait to be asked. TRW Memory runs before you ask.

trw_session_start() fires as the absolute first action of every session. It queries the memory store against the current project context, applies the hybrid retrieval + utility scoring pipeline, and injects the top-N most relevant learnings into the agent's context window before any work begins.

The agent starts session 50 knowing what it learned in sessions 1–49. It does not need to rediscover solved problems.

Deposits happen through trw_learn(). Ad-hoc lookups happen through trw_recall().
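
The session-start flow can be sketched as follows. The tool name `trw_session_start` is real; the in-memory store, the scoring, and the return-the-summaries shape below are stand-ins for illustration, not the actual implementation:

```python
# Illustrative session-start flow: rank stored learnings and surface the
# top N for injection into the agent's context before any work begins.
STORE = [
    # first two summaries are from the trace on this page; scores are the
    # combined relevance + utility values shown there
    {"summary": "JWT rotation required on every use", "combined": 0.91},
    {"summary": "Env vars must not have prod defaults", "combined": 0.94},
    # third entry is a made-up example
    {"summary": "Use UTC everywhere in CI timestamps", "combined": 0.40},
]

def trw_session_start(top_n: int = 8) -> list[str]:
    # Returns the summaries it would inject, highest combined score first.
    ranked = sorted(STORE, key=lambda e: e["combined"], reverse=True)
    return [e["summary"] for e in ranked[:top_n]]

print(trw_session_start(top_n=2))
```

Because this runs before the first user turn, the agent never has to ask for its own history.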

ILLUSTRATIVE TRACE — session start (example, not measured telemetry)

01  trw_session_start() called
02  Load learnings index
03  BM25 keyword scan — 31 candidates
04  Dense vector search — 28 candidates
05  RRF fusion — 44 unique entries ranked
06  Utility re-rank applied
07  Inject top 8 learnings into context
    "JWT rotation required on every use" (0.91)
    "Env vars must not have prod defaults" (0.94)
    +6 more …

Order is real; timings are omitted deliberately. Run-to-run latency varies by index size and whether embeddings are enabled.

LEDGER // SPEC

Technical specification

Storage — SQLite (primary) + YAML backup · WAL mode · sqlite-vec vectors
Retrieval — BM25Okapi + dense cosine · RRF k=60 · graceful degradation
Scoring — Q-value EMA · Ebbinghaus decay · Bayesian MACLA calibration
Graph — 13 typed edge types · BFS depth 3 · Jaccard tag co-occurrence
Lifecycle — hot/warm/cold tiers · semantic dedup (0.85 cosine) · sweep-based purge
Security — AES-256-GCM · PII detection · z-score poisoning defense · RBAC
Dedup — cosine similarity 0.85 threshold · merge-or-skip decision at write
Integrations — LangChain · LlamaIndex · CrewAI · OpenAI-compatible adapter
Test coverage — 13500+ tests across all packages · mypy strict clean
License — BSL 1.1 · converts to Apache 2.0 on 2030-03-21

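
The write-time dedup decision is a plain cosine-similarity gate. A minimal sketch, with toy 3-d vectors standing in for real embeddings (the 0.85 threshold is from the spec; everything else is illustrative):

```python
# Write-time dedup gate: a new entry whose embedding is within 0.85 cosine
# similarity of an existing one triggers merge-or-skip instead of insert.
import math

DEDUP_THRESHOLD = 0.85

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def write_decision(new_vec: list[float], existing_vecs: list[list[float]]) -> str:
    # Compare against every stored embedding; one near-duplicate is enough.
    best = max((cosine(new_vec, v) for v in existing_vecs), default=0.0)
    return "merge_or_skip" if best >= DEDUP_THRESHOLD else "insert"

print(write_decision([1.0, 0.0, 0.1], [[1.0, 0.0, 0.0]]))  # near-duplicate
```

Gating at write keeps the graph from accumulating near-identical nodes that would otherwise split recall scores between them.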
LEDGER // CONSTRAINTS

What memory does not do

Accurate scoping saves debugging time later.

No pre-populated graph

The knowledge graph builds entirely from entries you deposit. A fresh install has no edges and no semantic clusters — those emerge over sessions.

Q-values need outcome feedback to matter

The scoring system computes meaningful signal only after you deliver sessions with test and build outcomes. Until then, scores reflect impact estimates, not observed utility.

Embeddings are optional

Dense vector search requires the [embeddings] and [vectors] extras. Without them, retrieval falls back to BM25 keyword matching only — still useful, not identical.

Memory does not self-correct wrong entries

If an agent stores an incorrect learning, it persists until it is explicitly forgotten or superseded. Memory is as reliable as the agents depositing into it.

Not a replacement for checkpoints

Memory stores learnings — patterns, gotchas, decisions. Checkpoints save resumable execution state. They are complementary, not interchangeable.

LEDGER // FAQ

Common questions

What is the difference between TRW Memory and Claude Code's built-in memory?

Claude Code's built-in memory is a 200-line personal index, search is filename-scan only, and scope is limited to one primary session. TRW Memory uses hybrid BM25 + vector retrieval, is visible to all agents and subagents, scores entries by measured outcomes, and scales across hundreds of sessions without hitting a cap.

Does TRW Memory send code or project data to a server?

No. By default all data stays in SQLite and YAML files inside your repo at `.trw/`. Remote sync is an optional feature you must explicitly configure. The default path has no outbound network calls.

What happens to recalled learnings when context compacts?

Context compaction is exactly what Memory is designed to survive. `trw_session_start()` fires at the top of every session — including sessions that resume after compaction — and re-injects the top-N most relevant learnings before any work begins. The knowledge doesn't live in the context window; it lives in SQLite.

How long before Memory starts being useful in a new project?

Learnings from the very first delivered session are available at session two's start. Most teams notice meaningful recall quality after five to ten sessions in a project. Cross-validated signal (entries confirmed by multiple projects) builds more slowly — typically 20+ delivered sessions across at least two project namespaces.

TERMINAL // INSTALL

Give your agents persistent memory

Install takes under 2 minutes. Memory starts accumulating from session one. The full behaviour, tool signatures, and storage layout are in the memory spec.

TERMINAL — ~/repo

# install

$ curl -fsSL https://trwframework.com/install.sh | bash