About Specflow
The story behind Specflow and why it exists.
Why I Built This
I build products for a living—for clients and for myself. When you’re shipping real software to real users, you need things that actually work. Not “works in demo.” Not “works on my machine.” Works in production, under load, when users do unexpected things.
After years of watching LLMs confidently break production code, I started developing patterns to catch drift before it shipped. Specflow is the result of trying these patterns on numerous projects—some successful, some painful lessons.
This is still a work in progress. The core idea (contracts that enforce themselves) is solid and battle-tested. The tooling keeps evolving as I discover what works and what doesn’t across different stacks and team sizes.
The Philosophy
Most “AI guardrails” try to make LLMs behave better. That’s backwards.
You can’t fix probabilistic outputs with better prompts—you need a gate that stops bad outputs from shipping.
| Approach | Problem |
|---|---|
| “Write better prompts” | LLMs still drift |
| “Review everything” | Humans can’t keep up |
| “Trust but verify” | Verification happens too late |
Specflow’s approach: Don’t trust the LLM. Don’t trust the human either. Trust the tests.
If the contract test passes, ship it. If it fails, fix it. Simple as that.
What Makes This Different
1. Enforcement, Not Guidelines
Most spec systems create documents that LLMs ignore. Specflow creates tests that fail builds.
Spec document → LLM reads → LLM forgets → Code drifts → Production breaks
Contract test → Code violates → Build fails → Must fix → Production protected
2. Built for Real Projects
This isn’t academic theory. Every pattern in Specflow came from production pain:
- Journey contracts emerged after an LLM “optimized” a checkout flow into a broken mess
- Wave execution came from manually coordinating 40+ GitHub issues
- Verification hooks were born from a 24-hour production outage caused by “tests passed” (they didn’t run)
3. Still Evolving
I’m actively using Specflow on client projects and my own products. When something doesn’t work, I fix it. When I find a better pattern, I add it.
The core is stable. The edges are experimental. That’s by design.
Connect
I’d love to hear how you’re using Specflow—what works, what doesn’t, what you wish existed.
- FloutLabs: floutlabs.com — AI Studio
- LinkedIn: Colm Byrne
- GitHub: Hulupeep
- Issues/Ideas: Specflow Issues
- Discussions: Specflow Discussions
The best improvements come from production feedback. If you’re shipping real software with Specflow, your experience matters.
License
MIT - Use freely, commercially, anywhere.
No strings attached. If it helps you ship better software, that’s the goal.
Made for developers who want specs that actually matter.