Requirements engineering
AI agents write code fast. Without structured requirements, they write the wrong code fast. TRW uses AARE-F to turn vague feature requests into machine-verifiable specifications that agents can implement, trace, and validate.
AARE-F framework
AI-Augmented Requirements Engineering Framework, v2.0 — synthesized from 26 waves of systematic research. Ten components across four layers, grounded in five principles. The core insight: AI augments human judgment. It does not replace it.
| # | Principle | What it means |
|---|---|---|
| P1 | Traceability first | Every artifact traces to sources and downstream impacts |
| P2 | Human-in-the-loop | AI accelerates but humans decide — oversight is mandatory |
| P3 | Risk-based rigor | Effort scales with consequence, not all requirements need equal treatment |
| P4 | Semantic understanding | Embeddings replace keywords as the computational substrate |
| P5 | Continuous verification | Compliance is engineered in, not audited after |
Four-layer architecture
Each layer builds on the one below. Foundation provides the data substrate. Governance controls AI decision-making. Execution coordinates agents. Operations integrates with DevOps.
Foundation (2 components)
- Bidirectional links, coverage metrics, impact analysis in under 5 seconds
- Hybrid retrieval (70% semantic + 30% keyword), dedup at 0.85 cosine, novelty detection
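The retrieval math above can be sketched in a few lines. This is an illustrative sketch, not TRW's implementation; the function names (`hybrid_score`, `is_duplicate`) are hypothetical:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_score(semantic_sim, keyword_sim):
    """Blend per the 70/30 split: semantic carries most of the weight."""
    return 0.7 * semantic_sim + 0.3 * keyword_sim

def is_duplicate(candidate_vec, existing_vecs, threshold=0.85):
    """Flag a new requirement as a duplicate when any stored embedding
    falls within the 0.85 cosine threshold."""
    return any(cosine(candidate_vec, v) >= threshold for v in existing_vecs)
```

The 0.85 threshold trades recall for precision: near-paraphrases collapse into one requirement while genuinely novel requests survive.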
Governance (3 components)
- Confidence routing — automate above 95%, flag 85-95%, require review 70-85%, reject below 70%
- Risk-based treatment — critical requirements get formal spec and multi-party review; low-risk get lightweight notes
- Input sanitization, output filtering, tool allowlists, token budgets, OWASP LLM Top 10
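The confidence-routing thresholds reduce to a small dispatch function. A minimal sketch; `route_by_confidence` is a hypothetical name, not a TRW API:

```python
def route_by_confidence(confidence: float) -> str:
    """Map an AI decision's confidence score to a handling tier,
    using the thresholds above (95 / 85 / 70)."""
    if confidence >= 0.95:
        return "automate"   # apply without human review
    if confidence >= 0.85:
        return "flag"       # apply, but surface for async review
    if confidence >= 0.70:
        return "review"     # block until a human approves
    return "reject"         # too uncertain to act on
```

Note the asymmetry: only the top tier skips a human entirely, which is how P2 (human-in-the-loop) survives automation.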
Execution (3 components)
- Supervisor pattern — voting (+13% accuracy), consensus, majority protocols
- Tiered detection — entropy, cross-validation, human review; 67% hallucination reduction
- Rule-based NLP for intra-domain, MCDM for inter-domain, semantic comparison cross-framework
Operations (2 components)
- Git-versioned YAML/Markdown, PR-based review, schema validation in CI
- MELT model — metrics, events, logs, traces with alert thresholds
PRD system
Every feature starts as a PRD. Each has 12 mandatory sections, EARS-compliant requirements with confidence scores, and Given/When/Then acceptance criteria. IDs follow the pattern PRD-AREA-NNN: PRD-CORE-086, PRD-QUAL-016, PRD-FIX-035.
Stage by stage
| Stage | What happens | Tool |
|---|---|---|
| Draft | Created from a feature description with 12 mandatory sections | trw_prd_create |
| Groomed | Iterated to 85%+ quality with traceability matrix and EARS requirements | /prd-groom |
| Reviewed | Independent quality review with READY / NEEDS WORK verdict | /prd-review |
| Sprint-ready | Execution plan generated — micro-tasks, wave dependencies, file ownership | /prd-ready |
| In-progress | Assigned to a sprint with agents implementing against each FR | /trw-sprint-init |
| Done | All FRs verified, build passes, delivery ceremony complete | trw_deliver |
Quality gates
PRDs pass automated validation before entering a sprint. Four dimensions are scored. Fall below any threshold and the PRD is blocked until fixed.
| Dimension | Threshold | How it's measured |
|---|---|---|
| Ambiguity | < 5% | Vague terms detected — "TBD", "maybe", "could", "should consider" |
| Completeness | > 85% | All 12 mandatory sections populated with substantive content |
| Traceability | > 90% | Each FR linked to source files and test files via backtick references |
| Content density | > 0.25 | Ratio of substantive lines to total lines — no filler, no boilerplate |
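Two of these dimensions lend themselves to a compact sketch. The heuristics below (a line-based vague-term scan and a length-based density cutoff) are illustrative assumptions, not the actual validator:

```python
import re

VAGUE_TERMS = ["TBD", "maybe", "could", "should consider"]

def ambiguity_pct(text: str) -> float:
    """Percentage of non-blank lines containing a vague term (gate: < 5%)."""
    lines = [l for l in text.splitlines() if l.strip()]
    if not lines:
        return 0.0
    vague = sum(
        1 for l in lines
        if any(re.search(rf"\b{re.escape(t)}\b", l, re.IGNORECASE)
               for t in VAGUE_TERMS)
    )
    return 100.0 * vague / len(lines)

def content_density(text: str) -> float:
    """Ratio of substantive lines to total lines (gate: > 0.25).
    'Substantive' is approximated here as a non-heading line of
    at least 20 characters."""
    lines = text.splitlines()
    if not lines:
        return 0.0
    substantive = sum(
        1 for l in lines
        if len(l.strip()) >= 20 and not l.lstrip().startswith("#")
    )
    return substantive / len(lines)
```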
# One command creates, grooms, reviews, and plans
/prd-ready "Add rate limiting to the API"
# → PRD-CORE-088 created (score: 62/100)
# → Groom pass 1: 62 → 78 (filled sections, added EARS patterns)
# → Groom pass 2: 78 → 86 (traceability matrix, density)
# → Review: READY (7 P2 suggestions, 0 blockers)
# → Execution plan: 3 waves, 24 tasks, file ownership assigned
Traceability
Every requirement links forward to code and backward to rationale. The traceability checker agent verifies these links at VALIDATE and DELIVER — unlinked FRs block delivery.
PRD-CORE-086 (requirement)
└── FR01: Assertion model
├── trw-memory/models/memory.py:45 (source)
├── trw-memory/lifecycle/verify.py (source)
└── tests/test_assertions.py:12 (test)
Target: >= 90% of FRs linked to both source and tests
Impact analysis: < 5 seconds per change
How it works
FRs reference source files with backtick-wrapped paths: `src/auth.py:42`. The traceability checker parses these, verifies the files exist, and scores coverage. Missing links block delivery.
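The parse-and-verify step might look like this. A sketch under stated assumptions: the regex, the `test`-substring heuristic, and `check_fr_links` itself are all hypothetical, not the checker's real interface:

```python
import re
from pathlib import Path

# Matches backtick-wrapped paths like `src/auth.py:42` (line number optional).
PATH_REF = re.compile(r"`([\w./-]+\.\w+)(?::(\d+))?`")

def check_fr_links(fr_text: str, repo_root: str = ".") -> dict:
    """Parse backtick path references out of an FR, verify each file
    exists, and report whether both a source and a test link are present."""
    refs = [m.group(1) for m in PATH_REF.finditer(fr_text)]
    existing = [p for p in refs if (Path(repo_root) / p).is_file()]
    has_test = any("test" in p for p in existing)
    has_source = any("test" not in p for p in existing)
    return {
        "refs": refs,
        "missing": [p for p in refs if p not in existing],
        "linked": has_source and has_test,  # delivery blocks when False
    }
```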
Sprint execution
Sprints decompose PRDs into waves — groups of tasks with explicit dependency ordering. Each wave gets file ownership to prevent merge conflicts when agents work in parallel.
| Step | What happens | Tool |
|---|---|---|
| 1. Initialize | Select PRDs, generate wave plan, assign file ownership | /trw-sprint-init |
| 2. Plan | Decompose FRs into micro-tasks with dependency graphs | /trw-exec-plan |
| 3. Implement | Agents work waves sequentially, checkpoint after each | trw_checkpoint |
| 4. Validate | Build gate — tests pass, type-check clean, coverage met | trw_build_check |
| 5. Review | Adversarial spec-vs-code audit by independent agent | /trw-audit |
| 6. Deliver | Persist learnings, close run, sync CLAUDE.md | trw_deliver |
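The file-ownership guarantee can be checked mechanically before a wave starts. A sketch with a hypothetical `ownership_conflicts` helper; TRW's actual planner may work differently:

```python
def ownership_conflicts(wave_tasks: dict) -> list:
    """Given task -> owned-file sets for one wave, report every pair of
    tasks claiming the same file. Parallel agents only run conflict-free
    waves, so a non-empty result means the wave plan must be split."""
    conflicts = []
    tasks = sorted(wave_tasks)
    for i, a in enumerate(tasks):
        for b in tasks[i + 1:]:
            shared = wave_tasks[a] & wave_tasks[b]
            if shared:
                conflicts.append((a, b, shared))
    return conflicts
```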
Active sprint
Sprint 75: Executable Assertions — machine-verifiable grep/glob assertions on learnings. 4 waves, 49 tasks, PRD-CORE-086. Knowledge self-validates against the codebase.
Executable assertions
Learnings and PRD FRs carry grep/glob assertions verified against the codebase automatically. If the code changes and an assertion fails, the learning is flagged as stale. Knowledge stays honest as the codebase evolves.
trw_learn(
summary="SQLite WAL mode required for concurrent reads",
detail="Without WAL, concurrent read queries block on writes...",
assertions=[{
"type": "grep",
"pattern": "journal_mode.*wal",
"glob": "**/*.py",
"must_match": true
}]
)
# → Learning recorded with 1 assertion
# → Assertion verified: PASS (matched in storage/sqlite.py:34)
Why this matters
Traditional learnings are passive text — they claim things but never prove them. Executable assertions close the loop. Every claim is verified on every recall.
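A verifier for grep-type assertions could be as small as this. Illustrative only; `verify_grep_assertion` is a hypothetical name and the real checker's semantics may differ:

```python
import re
from pathlib import Path

def verify_grep_assertion(assertion: dict, repo_root: str = ".") -> bool:
    """Run one grep-type assertion: search files matching the glob for
    the pattern, then compare the outcome against must_match. A failed
    check would flag the attached learning as stale."""
    pattern = re.compile(assertion["pattern"])
    matched = any(
        pattern.search(line)
        for path in Path(repo_root).glob(assertion["glob"])
        if path.is_file()
        for line in path.read_text(errors="ignore").splitlines()
    )
    return matched == assertion["must_match"]
```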
Tools and skills
| Tool | Type | What it does |
|---|---|---|
| trw_prd_create | MCP tool | Generate an AARE-F-compliant PRD from a feature description |
| trw_prd_validate | MCP tool | Score a PRD across 4 quality dimensions with pass/fail gate |
| /prd-new | Skill | Full lifecycle in one command: create, groom, review, execution plan |
| /prd-groom | Skill | Iterate a PRD to sprint-ready quality (score >= 85%) |
| /prd-review | Skill | Independent review returning READY or NEEDS WORK with per-dimension scores |
| /trw-audit | Skill | Adversarial spec-vs-code verification — finds gaps the implementation missed |