Specflow Workflow
What This Is
These agents make Specflow work with Claude Code as the orchestrator. They ensure your GitHub issues have ARCH, FEAT, and JOURNEY contracts that can be executed:
- At build time — Jest pattern tests catch architectural violations (
npm test -- contracts) - Post-build — Playwright tests verify user journeys work end-to-end
This three-layer approach reduces architectural drift and ensures work meets Definition of Done.
Two execution modes:
- Subagent mode (default) — Claude Code’s Task tool spawns one-shot agents that do work and return. Works everywhere.
- Agent Teams mode (Claude Code 4.6+) — Persistent peer-to-peer teammates coordinate via TeammateTool API with three-tier journey gates. Set
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=trueto enable.
ARCH contracts → Build fails if: forbidden patterns in code
FEAT contracts → Build fails if: required patterns missing
JOURNEY contracts → Release blocked if: user flows don't work
How You Use It
You give Claude Code a high-level goal. It figures out which agents to call.
High-level prompt:
"Create tasks to make sure all stories in TODO status are specflow-compliant,
that all contracts are created, and every UI story has a journey contract
with Playwright tests."
Claude Code will:
- Run
board-auditorto check compliance - Run
specflow-uplifteron non-compliant issues - Run
contract-generatorto create YAML contracts - Run
contract-test-generatorto create Jest tests - Run
journey-enforcerto check journey coverage - Run
playwright-from-specflowandjourney-testerto create Playwright tests
You don’t need to name agents explicitly. Just describe what you want. The agents are documented so Claude Code knows when to use each one.
But you can be specific if you prefer:
"Run board-auditor on issues #42-#48"
"Run journey-tester for J-AUTH-LOGIN"
Both approaches work. High-level prompts for convenience; specific commands for control.
Example Session
Morning: Prep the Sprint
YOU: "Run board-auditor on all open issues labeled 'feature'"
CLAUDE: [Spawns board-auditor agent via Task tool]
AGENT RETURNS:
Compliance Matrix:
#42 User Profile Ghk=Y SQL=Y RLS=N TSi=Y Jrn=N → Partial
#43 Settings Page Ghk=Y SQL=Y RLS=Y TSi=Y Jrn=N → Partial
#44 Password Reset Ghk=N SQL=Y RLS=Y TSi=N Jrn=N → Partial
3 issues need work before sprint.
YOU: "Run specflow-uplifter on #42, #43, #44 — add missing sections"
CLAUDE: [Spawns 3 specflow-uplifter agents in parallel]
AGENTS RETURN:
#42: Added RLS policy, added J-USR-PROFILE journey reference
#43: Added J-USR-SETTINGS journey reference
#44: Added Gherkin scenarios, TypeScript interface, journey reference
Mid-Morning: Generate Contracts and Tests
YOU: "Generate all contracts and tests for the user management features"
CLAUDE: [Figures out which agents to call, spawns them]
1. contract-generator → Creates YAML contracts from issue specs:
- docs/contracts/feature_architecture.yml (ARCH-001, ARCH-002)
- docs/contracts/feature_user_management.yml (USR-001, USR-002, USR-003)
- docs/contracts/journey_user_profile.yml (J-USR-PROFILE)
2. contract-test-generator → Creates Jest tests from YAML:
- src/__tests__/contracts/architecture.test.ts
- src/__tests__/contracts/user_management.test.ts
3. playwright-from-specflow → Creates Playwright tests from Gherkin:
- tests/e2e/user-profile.spec.ts
- tests/e2e/settings.spec.ts
4. journey-tester → Creates journey tests from journey contracts:
- tests/e2e/journeys/user-profile.journey.spec.ts
AGENT RETURNS:
Created:
- 3 YAML contracts (ARCH, FEAT, JOURNEY)
- 2 Jest test files (run at: npm test -- contracts)
- 3 Playwright test files (run at: npx playwright test)
Or be specific if you prefer:
YOU: "Run contract-generator on #42-#44, then contract-test-generator"
Afternoon: Build
YOU: "Map dependencies and show me the sprint waves"
CLAUDE: [Spawns dependency-mapper agent]
AGENT RETURNS:
Wave 0 (no dependencies): #44 (password reset - standalone)
Wave 1 (depends on #44): #42 (user profile), #43 (settings page)
YOU: "Execute wave 0, then wave 1. Last migration was 015."
CLAUDE: [Spawns sprint-executor, which dispatches:]
Wave 0:
- migration-builder for #44 → creates 016_password_reset.sql
Wave 1 (after 0 completes):
- migration-builder for #42 → creates 017_user_profile.sql
- migration-builder for #43 → creates 018_settings.sql
- frontend-builder for #42 → creates useProfile hook
- frontend-builder for #43 → creates useSettings hook
BUILD: npm test -- contracts
✓ ARCH-001: No direct DB calls in components
✓ ARCH-002: Auth context required
✓ USR-001: Profile data validation
✓ USR-002: Settings access control
✓ USR-003: Password requirements enforced
End of Day: Validate & Close
YOU: "Validate and check release readiness"
CLAUDE: [Spawns contract-validator, then journey-enforcer]
RETURNS:
ARCH/FEAT Contracts: All satisfied
Critical Journeys:
- J-USR-PROFILE: ✓ passing
- J-AUTH-LOGIN: ✓ passing
- J-CHECKOUT-FLOW: ⚠️ no test
Status: NOT READY (1 critical journey untested)
YOU: "Generate the checkout journey test"
CLAUDE: [Spawns journey-tester agent]
YOU: "Now close #42, #43, #44"
CLAUDE: [Spawns ticket-closer agent]
Who Generates What (And When)
| What | Agent | Input | Output | When to Run |
|---|---|---|---|---|
| YAML contracts | contract-generator |
Issue specs (Gherkin, SQL, journeys) | docs/contracts/*.yml |
After issues are specflow-compliant |
| Jest pattern tests | contract-test-generator |
YAML contracts | src/__tests__/contracts/*.test.ts |
After YAML contracts exist |
| Playwright feature tests | playwright-from-specflow |
Gherkin scenarios in issues | tests/e2e/*.spec.ts |
After implementation or alongside |
| Playwright journey tests | journey-tester |
Journey contracts in issues/epics | tests/e2e/journeys/*.journey.spec.ts |
After implementation or alongside |
The flow:
Issues with specs → YAML contracts → Jest tests → BUILD
↘
Gherkin/Journeys → Playwright tests → POST-BUILD
Key insight: Jest tests come from YAML contracts. Playwright tests come from issue content (Gherkin + journeys). Both can be generated before or after implementation — they test what SHOULD happen, not what currently exists.
The Three Layers
| Layer | What It Enforces | When It Runs | Failure Mode |
|---|---|---|---|
| ARCH | Structural invariants | npm test |
Build fails |
| FEAT | Feature requirements | npm test |
Build fails |
| JOURNEY | User flows work | Playwright | Release blocked |
Example ARCH contract:
- id: ARCH-001
forbidden_patterns:
- pattern: /supabase\.(from|rpc)/
message: "Components must use hooks, not direct DB calls"
scope: ["src/components/**/*.tsx"]
Example FEAT contract:
- id: USR-002
required_patterns:
- pattern: /validatePassword/
message: "Password validation required"
scope: ["src/features/auth/**/*.ts"]
Example JOURNEY contract:
journey_meta:
id: J-USR-PROFILE
dod_criticality: critical
steps:
- name: "User updates profile"
required_elements:
- selector: "[data-testid='profile-form']"
The Pattern
Every interaction:
YOU: [Direction] "Do X on issues Y"
↓
CLAUDE: [Spawns appropriate agent(s) via Task tool]
↓
AGENT: [Does work, returns result]
↓
YOU: [Review, next direction]
You’re directing traffic, not writing code.
Agent Teams Mode (Claude Code 4.6+)
Requires:
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=true
The default workflow above uses subagent mode — Claude Code spawns one-shot Task agents that do work and return. This is the standard approach and works everywhere.
Agent Teams mode adds a second option: persistent peer-to-peer teammates via the TeammateTool API. Instead of spawning disposable subagents, you spawn long-lived teammates that can communicate with each other.
When to Use Which
| Mode | Best For | How It Works |
|---|---|---|
| Subagent (default) | Most workflows | Task tool spawns agents → they work → they return results |
| Agent Teams | Complex multi-issue waves | Persistent teammates coordinate via sendMessage |
Team Agents
Agent Teams mode introduces 5 additional agent prompts:
| Agent | Role | File |
|---|---|---|
PROTOCOL |
Inter-agent communication rules | agents/PROTOCOL.md |
journey-gate |
Three-tier journey enforcement | agents/journey-gate.md |
issue-lifecycle |
Full issue lifecycle teammate | agents/issue-lifecycle.md |
db-coordinator |
Shared database resource manager | agents/db-coordinator.md |
quality-gate |
Test execution and quality checks | agents/quality-gate.md |
These agents exist alongside the standard subagents. In subagent mode they work as regular Task agents. In teams mode they become persistent teammates.
Three-Tier Journey Gates
Agent Teams introduces a three-tier quality gate system enforced by journey-gate:
| Tier | Scope | When | Blocks |
|---|---|---|---|
| Tier 1 | Single issue | Before closing an issue | Issue closure |
| Tier 2 | All wave issues | Before starting next wave | Next wave |
| Tier 3 | Full regression | Before merging to main | Merge |
Tier 3 uses .specflow/baseline.json for regression detection — comparing current test results against a known-good baseline.
Infrastructure Files
Agent Teams mode uses two additional files:
.specflow/baseline.json— Regression detection baseline (test pass/fail snapshot).claude/.defer-journal— Deferred test journal (replaces deprecated.defer-tests)
CI Pipeline
For teams mode, a 5-gate CI pipeline is available at .github/workflows/specflow-ci.yml:
Gate 1: Contracts → Gate 2: Build → Gate 3: Tier 2 → Gate 4: Tier 3 → Gate 5: Deploy
See agents/PROTOCOL.md for inter-agent communication patterns and agents/journey-gate.md for gate enforcement details.
Backward Compatibility
Agent Teams mode is fully backward compatible:
- Without the env var, everything works as subagent mode (default)
- All team agents work as regular Task agents in subagent mode
- No changes needed to existing workflows
- Existing
waves-controllersupports both modes
When Things Go Wrong
ARCH violation (build fails):
CONTRACT VIOLATION: ARCH-001 - direct DB call in component
src/components/UserCard.tsx:42 - "supabase.from('users')"
→ “Fix the violation — use a hook instead of direct call”
FEAT violation (build fails):
CONTRACT VIOLATION: USR-003 - password validation missing
src/features/auth/hooks/usePasswordReset.ts:42
→ “Add password validation to usePasswordReset”
JOURNEY violation (release blocked):
Step 3: Profile updated successfully → FAILED
Element not found: success-message
→ “The success message isn’t showing. Check the mutation response.”
Issue missing specs:
#56 Export Feature - Non-compliant (missing: Gherkin, SQL)
→ “Run specflow-writer on #56”
Checklist
Before sprint:
- All issues have Gherkin, SQL, RLS, TypeScript, journey refs
- YAML contracts generated (ARCH, FEAT, JOURNEY)
- Jest tests generated for ARCH/FEAT contracts
- Dependencies mapped into sprint waves
Before release:
npm test -- contractspasses (ARCH + FEAT)- All critical journey tests exist and pass
Agent Teams (if using teams mode):
CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=trueset.specflow/baseline.jsonexists with known-good state- Tier 3 regression gate passes before merge
The Magic
You never write specs by hand. Agents generate them. You never write contracts by hand. Agents generate them. You never write tests by hand. Agents generate them. You never track dependencies by hand. SQL REFERENCES tells us.
You point. Agents work. Build catches mistakes. Release blocked if journeys fail.