Skip to main content
TRW
LATTICE // WORKFLOWS

Six phases. Enforced gates. Session recovery that actually works.

RESEARCH → PLAN → IMPLEMENT → VALIDATE → REVIEW → DELIVER. Every phase logs events, every gate blocks the next, and every checkpoint survives context compaction.

run · phase-gate timeline

RUNNING

RESEARCH

PLAN

IMPLEMENT

VALIDATE

REVIEW

DELIVER

LEDGER // PROBLEM

Every AI agent session is an invisible black box

Without structure, “I'll build this” is untraceable. When context compacts, state evaporates. When the session crashes, the agent restarts from zero, repeating solved problems and re-discovering known gotchas. Every other framework has this problem. TRW solved it.

restart from zero

Sessions without recovery

Every prior decision lost

24

TRW-checkpointed sessions

Recovered from compaction

75+

Learnings persisted

Across all TRW runs

LATTICE // PHASES

The 6-phase lifecycle

Each phase has explicit exit criteria. A phase gate blocks the next phase until evidence passes. The orchestrator enforces this — no phase advances on intent alone.

RESEARCH

Recall prior work, inspect codebase, register findings. Exit: plan.md draft with ≥3 evidence paths and formation selected.

PLAN

Design approach, identify dependencies, choose formation. Exit: acceptance criteria written, shards planned, wave_manifest.yaml created.

IMPLEMENT

Execute with periodic checkpoints, shard self-review. Exit: shards/waves complete or checkpointed, tests written.

VALIDATE

Run build_check, verify coverage, integration check. Exit: coverage ≥ target, all gates pass, zero P0 issues.

REVIEW

Audit diff for quality (DRY/KISS/SOLID), record learnings. Exit: critic reviewed, adversarial audit passed, no P0 findings.

DELIVER

Sync artifacts, checkpoint, close run. Exit: PR created or archived, final.md written, CLAUDE.md synced.

TRACE // COMPACTION

The context-compaction problem

AI CLIs compact context when tokens fill. Pre-TRW: the agent loses state, restarts work, repeats solved problems. TRW: a pre-compact.sh hook fires BEFORE compaction, saving { run_path, phase, events_logged, last_checkpoint }. On restart, session-start.sh reads state and resumes from the exact resume point.

WITHOUT TRW
  • Context fills → agent compacts → state evicted
  • On restart: no memory of active phase
  • Agent re-discovers solved problems
  • Work already done may be repeated or lost
WITH TRW
  • pre-compact hook fires before window fills
  • Writes pre_compact_state.json atomically
  • session-start reads state → resumes at exact phase
  • Events log is append-only — no data lost (shared across subagent-start too)
LEDGER // CHECKPOINTS

Atomic checkpoints

trw_checkpoint() writes atomic snapshots (temp file → rename) at milestones. JSONL events append-only to events.jsonl. A crash at any point leaves a recoverable state. Delegated runs use agents like trw-researcher and trw-implementer that inherit the same checkpoint contract.

.trw/context/pre_compact_state.json

{
  "timestamp": "2026-04-16T03:14:09Z",
  "trigger": "mcp_tool",
  "run_path": ".trw/runs/my-feature/20260416T031400Z-abc1234",
  "phase": "IMPLEMENT",
  "events_logged": 47,
  "last_checkpoint": "pre-compaction safety checkpoint",
  "pending_ceremony": ["trw_review", "trw_deliver"],
  "last_5_events": [
    {"event": "tool_invocation", "data": {"tool_name": "trw_checkpoint"}},
    {"event": "tool_invocation", "data": {"tool_name": "trw_learn"}},
    ...
  ]
}
LEDGER // SCORING

Ceremony scoring

Each run scores 0–100. The avg_ceremony_score metric is surfaced on /metrics. Compliance is measured, not aspired to. trw_build_check and trw_claude_md_sync feed into the final score.

SESSION_STARTCalled as first action
30 pts
REFLECTION_COMPLETELearnings recorded
30 pts
CHECKPOINT≥1 trw_checkpoint() call
20 pts
LEARNING≥1 trw_learn() call
10 pts
BUILD_CHECKtrw_build_check ran and passed
10 pts
PERFECT_RUNAll five criteria met
100 pts
LEDGER // FAQ

Common questions

Can I skip phases?

For MINIMAL-tier runs only. Standard and full runs enforce the complete cycle. Most runs benefit from the full sequence because the exit criteria at each gate catch issues cheaply.

What exactly gets checkpointed?

Run state including the active phase, events logged so far, PRD scope, file ownership map, and the last five events. This is written atomically (temp file → rename) so a crash at any moment leaves a recoverable state.

Does this work offline?

Yes. Run directories are fully local — .trw/runs/{task}/{run_id}/. No cloud connectivity is required. The pre_compact_state.json hook writes to disk before compaction fires, regardless of network.

How long does a full phase cycle take?

Highly task-dependent. A small fix can cycle through in a single session; a full feature with multi-agent formations may span several sessions. The point is not speed — it is that each phase has evidence and the next session resumes exactly where the previous one stopped.

What happens if VALIDATE fails?

The phase gate blocks advancement. The orchestrator routes back to IMPLEMENT with the failing evidence recorded in events.jsonl. The run does not progress to REVIEW until build_check, coverage, and integration checks all pass.

How is DELIVER different from a checkpoint?

Checkpoints are internal save-points that let a run resume mid-phase. DELIVER is the terminal phase: it syncs artifacts (PR, final.md, CLAUDE.md), promotes high-impact learnings, and closes the run. A run can have many checkpoints but only one DELIVER.
TERMINAL // RUN_STRUCTURED

Turn every agent session into a traceable engineering run

24 sprints. 75+ learnings. Every one survived context compaction because they ran on TRW.