Six phases. Enforced gates. Session recovery that actually works.
RESEARCH → PLAN → IMPLEMENT → VALIDATE → REVIEW → DELIVER. Every phase logs events, every gate blocks the next, and every checkpoint survives context compaction.
run · phase-gate timeline
RESEARCH
PLAN
IMPLEMENT
VALIDATE
REVIEW
DELIVER
RESEARCH
PLAN
IMPLEMENT
VALIDATE
REVIEW
DELIVER
Every AI agent session is an invisible black box
Without structure, “I'll build this” is untraceable. When context compacts, state evaporates. When the session crashes, the agent restarts from zero, repeating solved problems and re-discovering known gotchas. Every other framework has this problem. TRW solved it.
restart from zero
Sessions without recovery
Every prior decision lost
24
TRW-checkpointed sessions
Recovered from compaction
75+
Learnings persisted
Across all TRW runs
The 6-phase lifecycle
Each phase has explicit exit criteria. A phase gate blocks the next phase until evidence passes. The orchestrator enforces this — no phase advances on intent alone.
RESEARCH
Recall prior work, inspect codebase, register findings. Exit: plan.md draft with ≥3 evidence paths and formation selected.
PLAN
Design approach, identify dependencies, choose formation. Exit: acceptance criteria written, shards planned, wave_manifest.yaml created.
IMPLEMENT
Execute with periodic checkpoints, shard self-review. Exit: shards/waves complete or checkpointed, tests written.
VALIDATE
Run build_check, verify coverage, integration check. Exit: coverage ≥ target, all gates pass, zero P0 issues.
REVIEW
Audit diff for quality (DRY/KISS/SOLID), record learnings. Exit: critic reviewed, adversarial audit passed, no P0 findings.
DELIVER
Sync artifacts, checkpoint, close run. Exit: PR created or archived, final.md written, CLAUDE.md synced.
The context-compaction problem
AI CLIs compact context when tokens fill. Pre-TRW: the agent loses state, restarts work, repeats solved problems. TRW: a pre-compact.sh hook fires BEFORE compaction, saving { run_path, phase, events_logged, last_checkpoint }. On restart, session-start.sh reads state and resumes from the exact resume point.
- Context fills → agent compacts → state evicted
- On restart: no memory of active phase
- Agent re-discovers solved problems
- Work already done may be repeated or lost
- pre-compact hook fires before window fills
- Writes pre_compact_state.json atomically
- session-start reads state → resumes at exact phase
- Events log is append-only — no data lost (shared across subagent-start too)
Atomic checkpoints
trw_checkpoint() writes atomic snapshots (temp file → rename) at milestones. JSONL events append-only to events.jsonl. A crash at any point leaves a recoverable state. Delegated runs use agents like trw-researcher and trw-implementer that inherit the same checkpoint contract.
.trw/context/pre_compact_state.json
{
"timestamp": "2026-04-16T03:14:09Z",
"trigger": "mcp_tool",
"run_path": ".trw/runs/my-feature/20260416T031400Z-abc1234",
"phase": "IMPLEMENT",
"events_logged": 47,
"last_checkpoint": "pre-compaction safety checkpoint",
"pending_ceremony": ["trw_review", "trw_deliver"],
"last_5_events": [
{"event": "tool_invocation", "data": {"tool_name": "trw_checkpoint"}},
{"event": "tool_invocation", "data": {"tool_name": "trw_learn"}},
...
]
}Ceremony scoring
Each run scores 0–100. The avg_ceremony_score metric is surfaced on /metrics. Compliance is measured, not aspired to. trw_build_check and trw_claude_md_sync feed into the final score.
Pairs with
Verification
Phase 4 (VALIDATE) is the verification gate. The workflow does not advance until build evidence passes.
Memory
The DELIVER phase runs trw_claude_md_sync, promoting high-impact learnings into every future session.
Requirements
The PLAN phase produces the PRD. Every subsequent phase verifies implementation against it.
Common questions
Can I skip phases?
What exactly gets checkpointed?
Does this work offline?
How long does a full phase cycle take?
What happens if VALIDATE fails?
How is DELIVER different from a checkpoint?