Stop Trusting. Start Verifying.
Specflow enforces architectural contracts like a compiler enforces types.
Specflow is a contract-driven development system for teams building with LLMs. It ships with default security, accessibility, and production-readiness gates (adapted from forge by Ikenna N. Okpala), 23+ specialized agents that execute your GitHub backlog in parallel waves, and self-healing fix loops that auto-repair violations. When code breaks a contract, the build fails. When the build fails, agents fix it.
The Compiler Analogy
| Most Workflows | Specflow |
|---|---|
| Intent → Prompt | Intent → Contract |
| → Hope | → Generate |
| → Review → Fix | → Test → Stop or Ship |
| Trust the middle | Verify the boundary |
TypeScript rejects type errors. Specflow rejects architecture errors.
Before LLMs, humans were slow at writing code. We built tools to speed them up: IDEs, linters, autocomplete.
After LLMs, humans are slow at reviewing code. We need tools to enforce boundaries automatically.
Specflow is that tool.
Why Now?
Before LLMs (Pre-2023)
- ✅ Execution was deterministic
- ✅ Humans were the bottleneck (slow generation)
- ✅ Review scaled poorly but worked
- ✅ Violations were rare bugs
Old tools were sufficient: Design by Contract, TDD, linters, code review
After LLMs (2023+)
- ⚠️ Generation is infinite
- ⚠️ Execution is probabilistic
- ⚠️ Humans cannot keep up (slow review)
- ⚠️ Drift is invisible until too late
- ⚠️ Violations are normal behavior
Old tools are insufficient. Specflow fills the gap.
How It Works
Specflow uses contracts to define architectural rules:
# Feature Contract Example
contract_type: feature
feature_name: leave_management
invariants:
  - id: LEAVE-001
    rule: "Leave approval MUST debit from leave_entitlements ledger"
    severity: critical
    enforcement: e2e_test
When code violates a contract, the build fails — just like a type error.
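To make the "build fails" step concrete, here is a minimal Python sketch of a gate that blocks the build when a critical invariant's test has not passed. It is an illustration under assumptions, not Specflow's actual implementation: the `Invariant` class, the `blocking_violations` function, and the shape of `test_results` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invariant:
    """One contract invariant (fields mirror the YAML example; class is hypothetical)."""
    id: str
    rule: str
    severity: str
    enforcement: str


def blocking_violations(invariants, test_results):
    """Return IDs of critical invariants whose test did not pass.

    test_results maps invariant ID -> True (passed) or False (failed).
    A missing entry is treated as a failure: no evidence, no pass.
    """
    blocking = []
    for inv in invariants:
        passed = test_results.get(inv.id, False)
        if inv.severity == "critical" and not passed:
            blocking.append(inv.id)
    return blocking


LEAVE_001 = Invariant(
    id="LEAVE-001",
    rule="Leave approval MUST debit from leave_entitlements ledger",
    severity="critical",
    enforcement="e2e_test",
)

# A non-empty result would translate into a non-zero exit code in CI.
violations = blocking_violations([LEAVE_001], {"LEAVE-001": False})
```

Treating a missing result as a failure is a design choice in this sketch: it keeps an untested invariant from silently passing the gate.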
The Agent-First Workflow
Specflow isn’t just a framework. It’s a methodology powered by 23+ specialized agents:
- Define contracts (what must hold true)
- Run `waves-controller` (orchestrates agent execution)
- Agents generate implementation + tests
- Tests enforce contracts automatically
- Ship or stop (no manual review needed)
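The idea of executing a backlog in parallel waves can be sketched as layered dependency ordering: every issue whose prerequisites are finished goes into the next wave. This is a hedged illustration only. How `waves-controller` actually derives waves is not described here, and the `plan_waves` function, the `depends_on` mapping, and the issue numbers are hypothetical.

```python
def plan_waves(depends_on):
    """Group issues into waves; each wave only depends on earlier waves.

    depends_on maps an issue number to the set of issue numbers it needs first.
    """
    remaining = {issue: set(deps) for issue, deps in depends_on.items()}
    done = set()
    waves = []
    while remaining:
        # Everything whose dependencies are already done can run in parallel.
        ready = sorted(issue for issue, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError(f"cycle or unknown dependency among: {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for issue in ready:
            del remaining[issue]
    return waves


# Hypothetical backlog: 103 needs 101; 104 needs 102 and 103.
backlog = {101: set(), 102: set(), 103: {101}, 104: {102, 103}}
waves = plan_waves(backlog)  # [[101, 102], [103], [104]]
```

Issues inside one wave have no dependencies on each other, so they are the ones that can be handed to agents at the same time.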
3-4x faster than manual workflows. Proven on production projects.
New: Agent Teams — Persistent peer-to-peer teammates with three-tier journey gates (requires Claude Code 4.6+).
What You Get
Core Framework
- Feature Contracts: Architectural rules that must hold (invariants)
- Journey Contracts: End-to-end workflows that define “done”
- Automated Testing: Playwright E2E tests enforce contracts
- Journey Verification Hooks: Automatic E2E execution at build boundaries
Quality Gates (Out of the Box)
- Security Gates: OWASP Top 10-oriented checks for hardcoded secrets, SQL injection, XSS, eval, and path traversal (SEC-001..005)
- Accessibility Gates: WCAG AA basics — alt text, aria-labels, form labels, focus order (A11Y-001..004)
- Production Readiness: No demo data, placeholder domains, or hardcoded IDs in production (PROD-001..003)
- Test Integrity: No mocking in E2E tests, no placeholder assertions, no swallowed errors (TEST-001..005)
Quality gates adapted from forge by Ikenna N. Okpala. See Acknowledgments.
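A gate of this kind can be pictured as a set of forbidden patterns scanned over source text. The sketch below is an assumption-laden illustration: the two regular expressions and the rule names `hardcoded-secret` and `eval-call` are invented for this example and are not the shipped SEC-* rules, whose exact patterns live in the contract templates.

```python
import re

# Hypothetical patterns for illustration; NOT the shipped SEC-* definitions.
FORBIDDEN = {
    "hardcoded-secret": re.compile(
        r"""(api[_-]?key|password|secret)\s*[:=]\s*['"][^'"]{8,}['"]""",
        re.IGNORECASE,
    ),
    "eval-call": re.compile(r"\beval\s*\("),
}


def scan(source):
    """Return (line_number, rule_name) for every forbidden pattern hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in FORBIDDEN.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


sample = 'const apiKey = "sk_live_1234567890";\nconst x = eval(input);\nconst ok = 1;'
hits = scan(sample)  # [(1, 'hardcoded-secret'), (2, 'eval-call')]
```

Pattern scanning like this is cheap and deterministic, which is what lets it run on every build; it also produces false positives, so real rules usually need scoping by file path.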
Self-Healing & Learning
- Self-Healing Fix Loops: Autonomous violation repair with confidence-tiered fix patterns (Platinum → Bronze)
- Post-Mortem Learning: Violations get recorded, fixes get stored, agents get warned before repeating mistakes
- CI Feedback Loop: Automatic CI status reporting after every git push
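A confidence-tiered fix loop can be sketched as: try the highest-confidence fix pattern first, re-run the check, and stop at the first candidate that passes. This is a minimal sketch under stated assumptions rather than Specflow's implementation. Only the Platinum and Bronze tier names come from the text above; the intermediate tiers, the `heal` function, and the attempt limit are hypothetical.

```python
# Platinum and Bronze are named in the text; the middle tiers are assumed.
TIER_ORDER = ["platinum", "gold", "silver", "bronze"]


def heal(code, check, fixes, max_attempts=3):
    """Try fix patterns from highest to lowest confidence until check passes.

    check(code) -> bool; fixes is a list of (tier, fix_function) pairs.
    Returns (repaired_code, tier_used), or (code, None) if nothing was violated.
    """
    if check(code):
        return code, None
    ordered = sorted(fixes, key=lambda pair: TIER_ORDER.index(pair[0]))
    for tier, fix in ordered[:max_attempts]:
        candidate = fix(code)
        if check(candidate):  # re-verify instead of trusting the fix
            return candidate, tier
    raise RuntimeError("violation not repaired; the build stays failed")


def no_eval(code):
    return "eval(" not in code


fixes = [
    ("bronze", lambda c: c.replace("eval(", "noop(")),
    ("platinum", lambda c: c.replace("eval(", "JSON.parse(")),
]
repaired, tier = heal("x = eval(s)", no_eval, fixes)
```

The key property is that every candidate goes back through the same check that failed, so a fix pattern is never trusted on its own say-so.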
Agent System
- 23+ Specialized Agents: Complete delivery pipeline from spec to ship
- Model Routing: Haiku/Sonnet/Opus routing per agent — 40-60% cost savings
- Agent Teams: Persistent peer-to-peer teammates with three-tier journey gates (Claude Code 4.6+)
- DPAO Orchestration: Discovery → Parallel → Analysis → Orchestration
Portable Adoption
- SKILL.md: Single-file portable skill — drop one file into any project for instant Specflow
- Full Agent Library: 23+ agents for graduated adoption
- Academic Foundation: 40+ years of CS research (DbC, Property Testing, MDE)
Journey Verification Hooks
Problem: You forget to run E2E tests. Production breaks.
Solution: Hooks make Claude run tests automatically.
| Without Hooks | With Hooks |
|---|---|
| You: “Run tests” | [HOOK fires automatically] |
| You forget → prod breaks | Can’t forget |
| “Tests passed” (vague) | WHERE/WHAT/HOW MANY (explicit) |
# Install hooks
bash install-hooks.sh /path/to/project
Hooks trigger at build boundaries:
- PRE-BUILD → Run baseline (LOCAL)
- POST-BUILD → Verify changes (LOCAL)
- POST-COMMIT → Verify production (PRODUCTION URL)
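The boundary-to-target mapping and the explicit WHERE/WHAT/HOW MANY reporting can be sketched in a few lines. The mapping is taken from the list above; the `hook_report` function, the report layout, and the journey names are hypothetical and only show the kind of information an explicit report carries.

```python
# Boundary -> (action, target), taken from the hook list in this section.
HOOKS = {
    "PRE-BUILD": ("Run baseline", "LOCAL"),
    "POST-BUILD": ("Verify changes", "LOCAL"),
    "POST-COMMIT": ("Verify production", "PRODUCTION URL"),
}


def hook_report(boundary, journeys, passed):
    """Format an explicit WHERE/WHAT/HOW MANY line (layout is hypothetical)."""
    action, where = HOOKS[boundary]
    return (
        f"{boundary}: {action} | WHERE: {where} | "
        f"WHAT: {', '.join(journeys)} | HOW MANY: {passed}/{len(journeys)} passed"
    )


# Hypothetical journey names, for illustration only.
line = hook_report("POST-COMMIT", ["login", "checkout"], 2)
```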
Quick Start
Option A: Single File (Fastest)
cp Specflow/SKILL.md your-project/
Then tell Claude Code: /specflow — the skill activates the core methodology with security, accessibility, and production readiness gates included.
Option B: Full Agent Library
# 1. Add Specflow to your project
git clone https://github.com/Hulupeep/Specflow.git
cp -r Specflow/agents/ your-project/scripts/agents/
# 2. Copy default contract templates (create the target directory if it is missing)
mkdir -p your-project/docs/contracts
cp Specflow/templates/contracts/*.yml your-project/docs/contracts/
# 3. Install hooks
bash Specflow/install-hooks.sh your-project/
# 4. Tell Claude Code
# "Execute waves"
Option C: One Prompt (Zero Setup)
Read Specflow/README.md and set up my project with Specflow agents
including updating my CLAUDE.md. Then execute my backlog in waves.
The Intellectual Foundation
“We already knew how to control untrusted execution. We just forgot — until LLMs forced us to remember.”
Specflow synthesizes:
- Design by Contract (Eiffel, 1986): preconditions, postconditions, invariants
- Property-Based Testing (QuickCheck, 2000): properties that must hold for every generated input, applied here to whole systems
- Static Analysis (Lint, 1970s): custom rules, here expressed in YAML
This isn’t new theory. It’s old theory applied to a new failure mode: probabilistic code generation.
The Compiler Doesn’t Trust Your Types. Why Trust Your Architecture?
Start enforcing boundaries today.
Acknowledgments – Specflow builds on the work of others.