Six phases. Enforced gates. Session recovery that actually works.
RESEARCH → PLAN → IMPLEMENT → VALIDATE → REVIEW → DELIVER. Every phase logs events, every gate blocks the next, and every checkpoint survives context compaction.
run · phase-gate timeline
RESEARCH
PLAN
IMPLEMENT
VALIDATE
REVIEW
DELIVER
RESEARCH
PLAN
IMPLEMENT
VALIDATE
REVIEW
DELIVER
Every AI agent session is an invisible black box
Without structure, “I'll build this” is untraceable. When context compacts, state evaporates. When the session crashes, the agent restarts from zero, repeating solved problems and re-discovering known gotchas. Every other framework has this problem. TRW solved it.
On crash
Session restarts from zero
Prior work
Every decision in the compacted window is lost
Gotchas
Re-discovered the hard way each time
On crash
pre_compact_state.jsonResume at the exact checkpoint
Prior work
append-onlyevents.jsonl is append-only — nothing silently dropped
Gotchas
trw_session_startTop-N learnings injected before work begins
The 6-phase lifecycle
Each phase has explicit exit criteria. A phase gate blocks the next phase until evidence passes. The orchestrator enforces this — no phase advances on intent alone.
RESEARCH
PLAN
IMPLEMENT
VALIDATE
REVIEW
DELIVER
Recall prior work, inspect codebase, register findings. Exit: plan.md draft with ≥3 evidence paths and formation selected.
Exit
plan.md draft, ≥3 evidence paths
Design approach, identify dependencies, choose formation. Exit: acceptance criteria written, shards planned, wave_manifest.yaml created.
Exit
Acceptance criteria, wave_manifest.yaml
Execute with periodic checkpoints, shard self-review. Exit: shards/waves complete or checkpointed, tests written.
Exit
Shards complete, tests written
Run build_check, verify coverage, integration check. Exit: coverage ≥ target, all gates pass, zero P0 issues.
Exit
Coverage ≥ target, gates pass
Audit diff for quality (DRY/KISS/SOLID), record learnings. Exit: critic reviewed, adversarial audit passed, no P0 findings.
Exit
Audit passed, no P0 findings
Sync artifacts, checkpoint, close run. Exit: PR created or archived, final.md written, CLAUDE.md synced.
Exit
PR created, final.md, CLAUDE.md synced
RESEARCH
01/06Recall prior work, inspect codebase, register findings. Exit: plan.md draft with ≥3 evidence paths and formation selected.
Exit
plan.md draft, ≥3 evidence paths
PLAN
02/06Design approach, identify dependencies, choose formation. Exit: acceptance criteria written, shards planned, wave_manifest.yaml created.
Exit
Acceptance criteria, wave_manifest.yaml
IMPLEMENT
03/06Execute with periodic checkpoints, shard self-review. Exit: shards/waves complete or checkpointed, tests written.
Exit
Shards complete, tests written
VALIDATE
04/06Run build_check, verify coverage, integration check. Exit: coverage ≥ target, all gates pass, zero P0 issues.
Exit
Coverage ≥ target, gates pass
REVIEW
05/06Audit diff for quality (DRY/KISS/SOLID), record learnings. Exit: critic reviewed, adversarial audit passed, no P0 findings.
Exit
Audit passed, no P0 findings
DELIVER
06/06Sync artifacts, checkpoint, close run. Exit: PR created or archived, final.md written, CLAUDE.md synced.
Exit
PR created, final.md, CLAUDE.md synced
The context-compaction problem
AI CLIs compact context when tokens fill. Pre-TRW: the agent loses state, restarts work, repeats solved problems. TRW: a pre-compact.sh hook fires BEFORE compaction, saving { run_path, phase, events_logged, last_checkpoint }. On restart, session-start.sh reads state and resumes from the exact resume point.
- Context fills → agent compacts → state evicted
- On restart: no memory of the active phase
- Agent re-discovers already-solved problems
- Work done may be repeated or silently lost
- pre-compact hook fires before the window fills
- Writes pre_compact_state.json atomically (temp → rename)
- session-start reads state → resumes at the exact phase
- Events log is append-only — no data lost (shared across subagent-start too)
Atomic checkpoints
trw_checkpoint() writes atomic snapshots (temp file → rename) at milestones. JSONL events append-only to events.jsonl. A crash at any point leaves a recoverable state. Delegated runs use agents like trw-researcher and trw-implementer that inherit the same checkpoint contract.
.trw/context/pre_compact_state.json
{
"timestamp": "2026-04-16T03:14:09Z",
"trigger": "mcp_tool",
"run_path": ".trw/runs/my-feature/20260416T031400Z-abc1234",
"phase": "IMPLEMENT",
"events_logged": 47,
"last_checkpoint": "pre-compaction safety checkpoint",
"pending_ceremony": ["trw_review", "trw_deliver"],
"last_5_events": [
{"event": "tool_invocation", "data": {"tool_name": "trw_checkpoint"}},
{"event": "tool_invocation", "data": {"tool_name": "trw_learn"}},
...
]
}Ceremony scoring
Each run scores 0–100. The avg_ceremony_score metric is surfaced on /metrics. Compliance is measured, not aspired to. trw_build_check and trw_instructions_sync feed into the final score.
Pairs with
Verification
Phase 4 (VALIDATE) is the verification gate. The workflow does not advance until build evidence passes.
Memory
The DELIVER phase runs trw_instructions_sync, refreshing the client instruction file so every future session starts with the latest protocol.
Requirements
The PLAN phase produces the PRD. Every subsequent phase verifies implementation against it.
Common questions
Can I skip phases?
What exactly gets checkpointed?
Does this work offline?
How long does a full phase cycle take?
What happens if VALIDATE fails?
How is DELIVER different from a checkpoint?