TRW Documentation

Requirements engineering

AI agents write code fast. Without structured requirements, they write the wrong code fast. TRW uses AARE-F to turn vague feature requests into machine-verifiable specifications that agents can implement, trace, and validate.

225 total PRDs · 206 completed · 75 sprints · 5 phases

AARE-F framework

AI-Augmented Requirements Engineering Framework, v2.0 — synthesized from 26 waves of systematic research. Ten components across four layers, grounded in five principles. The core insight: AI augments human judgment. It does not replace it.

#    Principle                 What it means
P1   Traceability first        Every artifact traces to sources and downstream impacts
P2   Human-in-the-loop         AI accelerates but humans decide — oversight is mandatory
P3   Risk-based rigor          Effort scales with consequence; not all requirements need equal treatment
P4   Semantic understanding    Embeddings replace keywords as the computational substrate
P5   Continuous verification   Compliance is engineered in, not audited after

Four-layer architecture

Each layer builds on the one below. Foundation provides the data substrate. Governance controls AI decision-making. Execution coordinates agents. Operations integrates with DevOps.

Foundation    C1 Traceability · C4 Semantic
Governance    C2 LLM Gov · C3 Risk · C8 Guards
Execution     C5 Agents · C6 Uncertainty · C10 Conflicts
Operations    C7 Req-as-Code · C9 Observability
Foundation (2 components)

C1: Traceability infrastructure

Bidirectional links, coverage metrics, impact analysis in under 5 seconds

C4: Semantic infrastructure

Hybrid retrieval (70% semantic + 30% keyword), dedup at 0.85 cosine, novelty detection
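The retrieval blend and dedup rule above can be sketched as plain score arithmetic. This is an illustrative sketch, not the TRW implementation: the function names are invented, and only the 70/30 weights and the 0.85 cosine threshold come from the component description.

```python
def hybrid_score(semantic: float, keyword: float) -> float:
    """Blend normalized retriever scores: 70% semantic, 30% keyword."""
    return 0.70 * semantic + 0.30 * keyword

def is_duplicate(cosine_sim: float, threshold: float = 0.85) -> bool:
    """Two requirements are merged when cosine similarity reaches 0.85."""
    return cosine_sim >= threshold

# Rank candidates by blended score: (id, semantic_score, keyword_score).
candidates = [("REQ-A", 0.9, 0.4), ("REQ-B", 0.5, 0.95)]
ranked = sorted(candidates, key=lambda c: hybrid_score(c[1], c[2]), reverse=True)
```

A strong semantic match outranks a strong keyword match here, which is the point of weighting the blend 70/30.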

Governance (3 components)

C2: LLM governance

Confidence routing — automate above 95%, flag 85-95%, require review 70-85%, reject below 70%
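The four routing bands reduce to a simple threshold ladder. A minimal sketch, assuming confidence is a 0-1 float; the function name and return labels are illustrative, the band boundaries are from the text above:

```python
def route(confidence: float) -> str:
    """Route an AI-generated requirement by model confidence."""
    if confidence > 0.95:
        return "automate"   # accept without human involvement
    if confidence >= 0.85:
        return "flag"       # accept, but mark for spot-checking
    if confidence >= 0.70:
        return "review"     # mandatory human review before use
    return "reject"         # below the floor: discard or regenerate
```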

C3: Risk scaling

Critical gets formal spec and multi-party review. Low gets lightweight notes.

C8: Guardrails and safety

Input sanitization, output filtering, tool allowlists, token budgets, OWASP LLM Top 10

Execution (3 components)

C5: Multi-agent orchestration

Supervisor pattern — voting (+13% accuracy), consensus, majority protocols
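The simplest of the protocols above is majority voting: the supervisor tallies worker-agent outputs and keeps the most common answer. A sketch under that assumption (names illustrative):

```python
from collections import Counter

def majority_vote(answers: list[str]) -> str:
    """Supervisor tallies worker-agent outputs; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]

# Three agents classify the same requirement; the majority label wins.
votes = ["functional", "functional", "non-functional"]
```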

C6: Uncertainty management

Tiered detection — entropy, cross-validation, human review. 67% hallucination reduction.
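The entropy tier of the detector can be sketched as Shannon entropy over the model's output distribution, with escalation thresholds. The thresholds here are illustrative placeholders, not TRW's actual cutoffs:

```python
import math

def entropy(probs: list[float]) -> float:
    """Shannon entropy of an output distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def tier(probs: list[float], low: float = 0.5, high: float = 1.5) -> str:
    """Tiered detection: confident output passes, mid-entropy output is
    cross-validated, high-entropy output escalates to human review."""
    h = entropy(probs)
    if h < low:
        return "pass"
    if h < high:
        return "cross-validate"
    return "human-review"
```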

C10: Conflict resolution

Rule-based NLP for intra-domain, MCDM for inter-domain, semantic comparison cross-framework

Operations (2 components)

C7: Requirements-as-code

Git-versioned YAML/Markdown, PR-based review, schema validation in CI
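Schema validation in CI can be as small as a required-key check over the parsed requirement file. A minimal sketch: the key set and error messages are hypothetical, only the PRD ID format comes from this page:

```python
# Illustrative schema; TRW's real YAML schema will differ.
REQUIRED_KEYS = {"id", "title", "ears", "acceptance", "trace"}

def validate_requirement(req: dict) -> list[str]:
    """Return schema violations; an empty list means the file passes CI."""
    errors = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - req.keys())]
    if "id" in req and not req["id"].startswith("PRD-"):
        errors.append("id must use the PRD-<AREA>-<NNN> format")
    return errors
```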

C9: Continuous observability

MELT model — metrics, events, logs, traces with alert thresholds

PRD system

Every feature starts as a PRD. Each has 12 mandatory sections, EARS-compliant requirements with confidence scores, and Given/When/Then acceptance criteria. Format: PRD-CORE-086, PRD-QUAL-016, PRD-FIX-035.
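For illustration, an EARS-patterned requirement with its Given/When/Then criterion might look like this (a hypothetical fragment, not taken from an actual PRD):

```text
FR01 (EARS, event-driven): When a client exceeds its request quota,
the API shall return HTTP 429 within 100 ms.

Acceptance:
  Given a client at its quota limit
  When that client sends one more request
  Then the response status is 429
```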

Draft (D) → Groomed (G) → Reviewed (R) → Sprint-ready (S) → In-progress (IP) → Done

Stage by stage

Draft

Created from a feature description with 12 mandatory sections

trw_prd_create

Groomed

Iterated to 85%+ quality with traceability matrix and EARS requirements

/prd-groom

Reviewed

Independent quality review with READY / NEEDS WORK verdict

/prd-review

Sprint-ready

Execution plan generated — micro-tasks, wave dependencies, file ownership

/prd-ready

In-progress

Assigned to a sprint with agents implementing against each FR

/trw-sprint-init

Done

All FRs verified, build passes, delivery ceremony complete

trw_deliver

Quality gates

PRDs pass automated validation before entering a sprint. Four dimensions are scored. Fall below any threshold and the PRD is blocked until fixed.

Dimension         Threshold   How it's measured
Ambiguity         < 5%        Vague terms detected — "TBD", "maybe", "could", "should consider"
Completeness      > 85%       All 12 mandatory sections populated with substantive content
Traceability      > 90%       Each FR linked to source files and test files via backtick references
Content density   > 0.25      Ratio of substantive lines to total lines — no filler, no boilerplate
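Two of these gates are easy to approximate in a few lines. A sketch only: the vague-term list comes from the table, but the line-length cutoff for "substantive" and the function names are illustrative, not TRW's actual scoring:

```python
VAGUE = ("tbd", "maybe", "could", "should consider")  # terms from the gate table

def ambiguity(lines: list[str]) -> float:
    """Fraction of lines containing a vague term; the gate requires < 0.05."""
    flagged = sum(1 for line in lines if any(t in line.lower() for t in VAGUE))
    return flagged / len(lines) if lines else 0.0

def density(lines: list[str]) -> float:
    """Substantive lines over total lines; the gate requires > 0.25."""
    substantive = sum(1 for line in lines if len(line.strip()) > 20)  # illustrative cutoff
    return substantive / len(lines) if lines else 0.0
```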
grooming workflow
# One command creates, grooms, reviews, and plans
/prd-ready "Add rate limiting to the API"
# → PRD-CORE-088 created (score: 62/100)
# → Groom pass 1: 62 → 78 (filled sections, added EARS patterns)
# → Groom pass 2: 78 → 86 (traceability matrix, density)
# → Review: READY (7 P2 suggestions, 0 blockers)
# → Execution plan: 3 waves, 24 tasks, file ownership assigned

Traceability

Every requirement links forward to code and backward to rationale. The traceability checker agent verifies these links at VALIDATE and DELIVER — unlinked FRs block delivery.

traceability chain
PRD-CORE-086                    (requirement)
  └── FR01: Assertion model
       ├── trw-memory/models/memory.py:45   (source)
       ├── trw-memory/lifecycle/verify.py   (source)
       └── tests/test_assertions.py:12      (test)

Target: >= 90% of FRs linked to both source and tests
Impact analysis: < 5 seconds per change

How it works

FRs reference source files with backtick-wrapped paths: `src/auth.py:42`. The traceability checker parses these, verifies the files exist, and scores coverage. Missing links block delivery.
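The parse-and-verify step described above can be sketched with a regex over backtick references plus a filesystem check. The regex and function names are illustrative, not the traceability checker's actual code:

```python
import re
from pathlib import Path

# Matches backtick-wrapped paths like `src/auth.py:42`; the line number is optional.
PATH_REF = re.compile(r"`([\w./-]+\.\w+)(?::(\d+))?`")

def extract_refs(fr_text: str) -> list[str]:
    """Pull file paths out of an FR's backtick references."""
    return [m.group(1) for m in PATH_REF.finditer(fr_text)]

def coverage(frs: dict[str, str], root: Path) -> float:
    """Share of FRs with at least one reference to a file that exists."""
    linked = sum(1 for text in frs.values()
                 if any((root / p).exists() for p in extract_refs(text)))
    return linked / len(frs) if frs else 0.0
```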

Sprint execution

Sprints decompose PRDs into waves — groups of tasks with explicit dependency ordering. Each wave gets file ownership to prevent merge conflicts when agents work in parallel.

Step            What happens                                              Tool
1. Initialize   Select PRDs, generate wave plan, assign file ownership    /trw-sprint-init
2. Plan         Decompose FRs into micro-tasks with dependency graphs     /trw-exec-plan
3. Implement    Agents work waves sequentially, checkpoint after each     trw_checkpoint
4. Validate     Build gate — tests pass, type-check clean, coverage met   trw_build_check
5. Review       Adversarial spec-vs-code audit by independent agent       /trw-audit
6. Deliver      Persist learnings, close run, sync CLAUDE.md              trw_deliver

Active sprint

Sprint 75: Executable Assertions — machine-verifiable grep/glob assertions on learnings. 4 waves, 49 tasks, PRD-CORE-086. Knowledge self-validates against the codebase.

Executable assertions

Learnings and PRD FRs carry grep/glob assertions verified against the codebase automatically. If the code changes and an assertion fails, the learning is flagged as stale. Knowledge stays honest as the codebase evolves.

assertion example
trw_learn(
  summary="SQLite WAL mode required for concurrent reads",
  detail="Without WAL, concurrent read queries block on writes...",
  assertions=[{
    "type": "grep",
    "pattern": "journal_mode.*wal",
    "glob": "**/*.py",
    "must_match": true
  }]
)
# → Learning recorded with 1 assertion
# → Assertion verified: PASS (matched in storage/sqlite.py:34)
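A grep/glob assertion like the one above can be checked with a regex search over globbed files. A minimal sketch, not TRW's verifier: the function name is invented, and the case-insensitive match is an assumption (the example pattern is lowercase while the code may say WAL):

```python
import re
from pathlib import Path

def verify_assertion(root: Path, assertion: dict) -> bool:
    """PASS when the pattern's presence in globbed files matches must_match."""
    pattern = re.compile(assertion["pattern"], re.IGNORECASE)  # case assumption
    found = any(
        pattern.search(path.read_text(errors="ignore"))
        for path in root.glob(assertion["glob"]) if path.is_file()
    )
    return found == assertion["must_match"]
```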

Why this matters

Traditional learnings are passive text — they claim things but never prove them. Executable assertions close the loop. Every claim is verified on every recall.

Tools and skills

trw_prd_create

Generate an AARE-F-compliant PRD from a feature description

MCP tool
trw_prd_validate

Score a PRD across 4 quality dimensions with pass/fail gate

MCP tool
/prd-new

Full lifecycle in one command: create, groom, review, execution plan

Skill
/prd-groom

Iterate a PRD to sprint-ready quality (score >= 85%)

Skill
/prd-review

Independent review returning READY or NEEDS WORK with per-dimension scores

Skill
/trw-audit

Adversarial spec-vs-code verification — finds gaps implementation missed

Skill
