The Ultimate Guide to Spec-Driven Development Frameworks

December 13, 2025 | 12 min read
Tags: AI, Engineering, Automation, Claude Code
In this age of agentic coding, I'm a firm believer in spec-driven development. How else are you gonna get the AI gnomes to do your bidding while you finish your third run-through of Breaking Bad?

What is spec-driven development?

People are saying specs are now the uncompiled input and our code the compiled output. Want to change your code? Change your spec and have AI make the necessary changes.

People complain about AI just not working for their codebase. I'd argue they just don't have the right specs and process. Our unique power as engineers is to bring our decades of experience to craft excellent specs that agents can execute without issue. Until agents get so good that they don't need it, this is the reality we need to face.

I've been building AgentCMD, and to make sure it can handle robust, many-phased specs, I've been using a three-command spec implementation system. I built and tested this approach after evaluating all of the existing options.

There's some great stuff out there. Let me break down what I found.


The Landscape: Five Spec Systems Compared

After analyzing four mature spec systems and building my own, here's what I learned.

The TL;DR:

  • Everyone converged on the same patterns: dedicated spec directories, Markdown-first, three-state workflows. Great minds and all that.
  • Timestamp IDs are the way for teams. I'm convinced.
  • Complexity estimation is still uncommon. Taskmaster and AgentCMD are the only ones with it built-in.
  • Agent OS has the coolest innovation: spec-lite.md for AI token optimization. Brilliant.
  • Cherry-pick the best ideas: spec-lite, pre-spec clarification, modular standards, delta tracking, batch operations.

1. Taskmaster AI - PRD-First Approach

Created by: Eyal Toledano (@eyaltoledano) and @RalphEcom

Eyal is the CEO of Hamster, focused on "eliminating AI context loops." He also invests through Microangel. The project exploded from 0 to 15,500 GitHub stars in just 9 weeks. Now sitting at 24,000+ stars. That's not growth, that's a rocketship. Clearly struck a nerve with the AI-native dev crowd.

Philosophy: Start with a detailed Product Requirements Document, derive all tasks from it.

Their Angle: "An AI-powered task-management system you can drop into Cursor, Lovable, Windsurf, Roo, and others." The key word is "drop" - it's designed to slot into your existing workflow, not replace it. They're betting on PRDs as the universal starting point and natural language as the interface.

Key Innovations:

  • MCP Integration: ~21,000 tokens, 36 tools for deep IDE integration
  • Research Model: AI can query fresh information with project context
  • Complexity Analysis: Built-in analyze-complexity command with 1-10 scoring
  • Natural Language First: Commands are conversational, not rigid
  • Batch Operations: Comma-separated IDs (1,3,5)
  • Dependency Tracking: --with-dependencies flag moves related tasks
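
To make the batch and dependency ideas concrete, here's a minimal Python sketch. It is not Taskmaster's implementation: parse_id_list, with_dependencies, and the depends_on map are hypothetical names, and I'm assuming plain integer IDs. It only shows the general technique of turning a string like "1,3,5" into a selection and then pulling in whatever those tasks depend on.

```python
def parse_id_list(raw: str) -> list[int]:
    """Parse a comma-separated ID string such as "1,3,5" into integers."""
    ids: list[int] = []
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas like "1,,3"
        task_id = int(part)  # raises ValueError on non-numeric input
        if task_id not in ids:
            ids.append(task_id)  # keep first-seen order, drop duplicates
    return ids


def with_dependencies(ids: list[int], depends_on: dict[int, list[int]]) -> list[int]:
    """Expand a selection so each task's transitive dependencies come along."""
    selected: list[int] = []
    pending = list(ids)
    while pending:
        task_id = pending.pop(0)
        if task_id in selected:
            continue  # already included, also guards against cycles
        selected.append(task_id)
        pending.extend(depends_on.get(task_id, []))
    return selected


# Hypothetical dependency map: task 3 needs task 2, task 2 needs task 4.
depends_on = {3: [2], 2: [4]}

batch = parse_id_list("1,3,5")
print(batch)                                 # [1, 3, 5]
print(with_dependencies(batch, depends_on))  # [1, 3, 5, 2, 4]
```

The point of the second function is the "orphaned work" problem: if you move task 3 without tasks 2 and 4, the moved task can no longer be completed where it lands.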

Strengths:

  • PRD ensures comprehensive upfront planning
  • MCP provides deepest IDE integration
  • Natural language lowers barrier to entry
  • Dependency awareness prevents orphaned work

2. OpenSpec - Delta-First Approach

Created by: Tabish (@TabishB) at Fission AI

The project has hit 12,000+ stars on GitHub. Tabish and the team famously use OpenSpec to build OpenSpec. Eating your own dogfood at its finest. Follow @0xTab for updates.

Philosophy: Explicit change proposals with clear before/after documentation.

Their Angle: "Align humans and AI coding assistants with spec-driven development so you agree on what to build before any code is written." The emphasis here is agreement and determinism. Lock in the intent before you touch a line of code. They're explicitly fighting the chaos of requirements buried in chat history. We've all been there. "Wait, did we decide on JWT or sessions? Let me scroll back through 400 messages..."

Key Innovations:

  • Explicit Delta Tracking: ADDED/MODIFIED/REMOVED sections show exactly what changed
  • Separation of Truth vs Proposal: specs/ = current, changes/specs/ = proposed
  • Tool-Agnostic Design: Works with Claude Code, Cursor, Amp, Jules, etc.
  • Archival Merges Deltas: Automated consolidation of changes back to source specs
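
Here's a rough Python sketch of what "archiving a delta" could look like. To be clear, this is my own illustration and not OpenSpec's code or file format: the Delta class, apply_delta, and the requirement names are all hypothetical. It models the current spec as a name-to-text mapping and a proposal as ADDED, MODIFIED, and REMOVED sections that get merged back into the source of truth.

```python
from dataclasses import dataclass, field


@dataclass
class Delta:
    """A proposed change, split into the three delta sections."""
    added: dict[str, str] = field(default_factory=dict)
    modified: dict[str, str] = field(default_factory=dict)
    removed: list[str] = field(default_factory=list)


def apply_delta(current: dict[str, str], delta: Delta) -> dict[str, str]:
    """Return a new spec with the delta merged in; the input is left untouched."""
    merged = dict(current)  # copy so the source of truth is never edited in place
    for name, text in delta.added.items():
        if name in merged:
            raise ValueError(f"ADDED requirement already exists: {name}")
        merged[name] = text
    for name, text in delta.modified.items():
        if name not in merged:
            raise ValueError(f"MODIFIED requirement not found: {name}")
        merged[name] = text
    for name in delta.removed:
        if name not in merged:
            raise ValueError(f"REMOVED requirement not found: {name}")
        del merged[name]
    return merged


current_spec = {
    "Login": "Users sign in with email and password.",
    "Sessions": "Sessions expire after 30 days.",
    "Remember me": "A checkbox keeps users signed in.",
}
proposal = Delta(
    added={"Two-factor": "Users can enable TOTP codes."},
    modified={"Sessions": "Sessions expire after 7 days."},
    removed=["Remember me"],
)

print(sorted(apply_delta(current_spec, proposal)))
# ['Login', 'Sessions', 'Two-factor']
```

The validation is the useful part: a MODIFIED entry that points at a requirement that doesn't exist is exactly the kind of drift you want to catch before the proposal becomes truth.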

Strengths:

  • Delta tracking provides clearest audit trail
  • Separation prevents accidental modification of truth
  • Human-readable throughout (no JSON overhead)
  • Tool-agnostic via AGENTS.md pattern

3. SpecKit - Constitution-First Approach

Created by: Den Delimarsky (@localden) and John Lam at GitHub

This one has serious pedigree. John Lam is a legend in the developer tools space. He created RubyCLR and led the IronRuby project at Microsoft before moving to GitHub to work on AI coding experiences. His research notes on steering LLM development informed the entire project. When GitHub ships something, you pay attention. Den Delimarsky has been deep in developer experience at GitHub for years.

Philosophy: Establish foundational principles, then generate executable specifications.

Their Angle: "If you are able to clearly articulate your requirements, you will get better outcomes." They're positioning hard against "vibe coding," the idea that you can just wing it with AI. The key insight? Treat specifications as "executable artifacts" that remain separate from technical implementation. Specs focus on the what and why, not the how. I dig this philosophy.

Key Innovations:

  • Constitutional Governance: Foundational doc ensures cross-feature consistency
  • Executable Specs: Specifications directly generate implementations
  • Quality Validation: /speckit.analyze checks consistency
  • Explicit Clarification Phase: /speckit.clarify identifies gaps before implementation
  • "Unit tests for English": Checklists validate spec completeness

Strengths:

  • Constitutional approach prevents drift across features
  • Quality validation catches issues early
  • Clarification phase reduces rework

4. Agent OS - Standards-First Approach

Created by: Brian Casel (@CasJam) at Builder Methods

Brian is a serial entrepreneur who's been building and selling products since 2008. He created Restaurant Engine, Audience Ops (acquired 2021), and ClarityFlow. In 2025, he went full AI-first and started Builder Methods to help professional developers work with AI. He runs a great newsletter and YouTube channel on the topic. Agent OS hit 1,000+ stars within 6 weeks of release and is now at 2,800+ stars.

Philosophy: Capture organizational standards and coding patterns as executable specifications that AI agents automatically follow.

Their Angle: "Transforms AI coding agents from confused interns into productive developers." The pitch is about making AI agents "build your way, not their way." They're betting that the real unlock isn't just specs, but capturing your team's coding standards so AI can "ship quality code on the first try - not the fifth."

Key Innovations:

  • Dual Installation Model: Base (~) + Project (.) separation
  • Standards-as-Code: Modular markdown files with injection system
  • Spec + Spec-Lite Pattern: Full documentation + AI-optimized condensed version
  • Pre-Specification Clarification: /shape-spec phase before /write-spec
  • Profile Inheritance: Layered standards (default, then general, then tech-specific)
  • Product-Level Context: product.md as foundation for all specs
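
Profile inheritance is easier to picture with a tiny example. The Python sketch below is an assumption-heavy illustration, not Agent OS's actual injection system: resolve_standards and every key and value in the layers are made up. It shows the general layering technique, where later (more specific) layers override earlier ones.

```python
def resolve_standards(*layers: dict[str, str]) -> dict[str, str]:
    """Merge standards layers in order; later layers override earlier ones."""
    resolved: dict[str, str] = {}
    for layer in layers:
        resolved.update(layer)
    return resolved


# Hypothetical layers, ordered from least to most specific.
default_layer = {"indentation": "2 spaces", "tests": "required for new code"}
general_layer = {"tests": "required, with a coverage report"}
tech_layer = {"indentation": "4 spaces", "orm": "no raw SQL in controllers"}

standards = resolve_standards(default_layer, general_layer, tech_layer)
print(standards["indentation"])  # 4 spaces
print(standards["tests"])        # required, with a coverage report
```

The design choice worth noting is the ordering: the base layer sets defaults for everything, and each project or stack only has to state where it differs.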

Strengths:

  • Standards system is most comprehensive
  • Spec-lite pattern optimizes for AI context windows
  • Pre-spec clarification prevents underspecification
  • Works across all major AI coding tools

5. AgentCMD - Three-Command Simplicity

Created by: JP Narowski at AgentCMD

That's me. I manage 30+ engineers at Spectora and got tired of babysitting my AI agents. After evaluating all these systems, I decided to build something that prioritized automation over ceremony. Read the full story in Why I Built AgentCMD.

Philosophy: Minimal ceremony, maximum automation. Generate, implement, review. That's it.

My Angle: The other systems are great for planning, but I kept finding myself calling implement 10+ times on complex specs. I needed something that could orchestrate itself. The JSON responses, timestamp IDs, and complexity estimation all exist to enable automation, so you can kick off a spec and walk away.

Key Innovations:

  • JSON Index: Performance optimization for spec lookups
  • Complexity Estimation: Context-based scoring (1-10 scale) with automation focus
  • Anti-Sycophancy Review: Catches AI "claiming done when it's not"
  • Recursive Implementation: Safe to call repeatedly until complete
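
The "call implement until it's really done" idea boils down to a loop with an independent check. Here's a minimal Python sketch of that pattern. It is not AgentCMD's code: run_until_done, ReviewResult, and the fake implement and review callables are hypothetical stand-ins for whatever actually drives your agent.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ReviewResult:
    passed: bool
    unfinished: list[str] = field(default_factory=list)


def run_until_done(
    implement: Callable[[], None],
    review: Callable[[], ReviewResult],
    max_rounds: int = 10,
) -> int:
    """Alternate implement and review until review passes; return rounds used."""
    for round_number in range(1, max_rounds + 1):
        implement()          # must be safe to call again on a partly built spec
        if review().passed:  # independent check, not the agent's own "done" claim
            return round_number
    raise RuntimeError(f"spec still incomplete after {max_rounds} rounds")


# Fake agent for the demo: each implement call finishes one remaining task.
tasks = {"schema": False, "api": False, "tests": False}


def fake_implement() -> None:
    for name, done in tasks.items():
        if not done:
            tasks[name] = True
            return


def fake_review() -> ReviewResult:
    unfinished = [name for name, done in tasks.items() if not done]
    return ReviewResult(passed=not unfinished, unfinished=unfinished)


print(run_until_done(fake_implement, fake_review))  # 3
```

Two things matter here: implement has to be idempotent so repeated calls are harmless, and the loop trusts only the review result. The max_rounds cap is there so a stuck agent fails loudly instead of looping forever.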

Strengths:

  • Simplest command structure (3 vs 5-6 in other systems)
  • Built for automation and orchestration
  • Timestamp IDs avoid the ID collisions that sequential numbering invites when several people create specs in parallel


What Everyone Agrees On

Here's the interesting part: all five systems independently landed on the same core patterns. No coordination, just convergence.

Dedicated directories. Everyone uses a dedicated, usually hidden, folder (.taskmaster/, openspec/, .agent/, etc.) to keep specs separate from source code. Keeps your IDE clean.

Markdown-first. Specs are written in plain ol' Markdown, not JSON or YAML. It's human-readable, version-controllable, and AI agents parse it naturally.

Three workflow states. Draft, Active, Done. Every system has some version of this. It matches how our brains work.
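
If you want to enforce the three states in tooling, a small state machine is enough. This Python sketch is my own simplification: the SpecState names mirror the states above, but the strictly linear transition rules are an assumption, and individual systems may allow other moves (for example, sending an active spec back to draft).

```python
from enum import Enum


class SpecState(Enum):
    DRAFT = "draft"
    ACTIVE = "active"
    DONE = "done"


# Assumed rules: specs only move forward, and DONE is terminal.
ALLOWED = {
    SpecState.DRAFT: {SpecState.ACTIVE},
    SpecState.ACTIVE: {SpecState.DONE},
    SpecState.DONE: set(),
}


def transition(current: SpecState, target: SpecState) -> SpecState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


state = transition(SpecState.DRAFT, SpecState.ACTIVE)
print(state.value)  # active
```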

Checkbox tracking. - [ ] and - [x] for task state. Simple, version-controlled, works everywhere.
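
Because the task state lives in plain text, progress is trivial to compute. Here's a minimal Python sketch that counts checked and unchecked boxes in a spec. The function name and the sample spec are made up for illustration, and none of the five systems necessarily does it this way.

```python
import re

# Matches list items like "- [ ] task" or "- [x] task" (also "*" bullets).
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def task_progress(markdown: str) -> tuple[int, int]:
    """Return (done, total) for the checkbox items in a Markdown document."""
    done = total = 0
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match is None:
            continue  # headings and prose are ignored
        total += 1
        if match.group(1).lower() == "x":
            done += 1
    return done, total


SPEC = """\
## Phase 1
- [x] Create the users table
- [x] Add the signup endpoint
- [ ] Write integration tests
Some prose that is not a task.
"""

print(task_progress(SPEC))  # (2, 3)
```

This is also why checkboxes work so well with agents: the same file is the human-readable plan and the machine-readable status.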

A foundational doc. Every system has one central file that grounds everything: Taskmaster has prd.txt, OpenSpec has AGENTS.md, SpecKit has constitution.md, Agent OS has product.md, and AgentCMD reads from CLAUDE.md.


Feature Comparison Matrix

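| Feature                  | Taskmaster | OpenSpec  | SpecKit      | Agent OS  | AgentCMD  |
|--------------------------|------------|-----------|--------------|-----------|-----------|
| Complexity Estimation    | Yes        | -         | -            | -         | Yes       |
| JSON Index               | -          | -         | -            | -         | Yes       |
| Spec-Lite (AI optimized) | -          | -         | -            | Yes       | -         |
| Delta Tracking           | -          | Yes       | -            | -         | -         |
| Pre-Spec Clarification   | -          | -         | Yes          | Yes       | -         |
| Constitutional/Standards | PRD        | AGENTS.md | Constitution | Standards | CLAUDE.md |
| Batch Operations         | Yes        | -         | -            | -         | -         |
| Anti-Sycophancy Review   | -          | -         | -            | -         | Yes       |
| Tool Agnostic            | -          | Yes       | -            | Yes       | -         |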

Recommended Approach by Team Size

Solo Developer

Recommendation: Taskmaster AI or AgentCMD

  • Simple sequential IDs are fine when you're the only one creating specs
  • Natural language (Taskmaster) or minimal commands (AgentCMD) reduce friction

Small Team (2-5)

Recommendation: AgentCMD or OpenSpec

  • Timestamp IDs prevent coordination overhead
  • Delta tracking (OpenSpec) helps with code review

Larger Team (5+)

Recommendation: Agent OS or AgentCMD

  • Standards system (Agent OS) ensures consistency
  • JSON index (AgentCMD) enables automation at scale

Enterprise/Distributed

Recommendation: AgentCMD

  • Timestamp IDs stay effectively unique across teams and branches without a central counter
  • Complexity estimation helps planning
  • JSON automation enables CI/CD integration
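
Here's why I keep pushing timestamp IDs for teams, as a small Python sketch. The ID format below (two-digit year through seconds, in UTC) is an assumption for illustration and not necessarily the format AgentCMD or any other tool uses, and both function names are hypothetical.

```python
from datetime import datetime, timezone


def next_sequential_id(existing: list[int]) -> int:
    """Classic counter: one more than the highest ID you can see locally."""
    return max(existing, default=0) + 1


def timestamp_id(now: datetime) -> str:
    """Build an ID from the creation time in UTC, for example 251213093005."""
    return now.astimezone(timezone.utc).strftime("%y%m%d%H%M%S")


# Two developers branch from the same commit, where specs 1-3 already exist.
seen_by_alice = [1, 2, 3]
seen_by_bob = [1, 2, 3]
print(next_sequential_id(seen_by_alice), next_sequential_id(seen_by_bob))
# 4 4  -> both create "spec 4" and collide at merge time

alice = timestamp_id(datetime(2025, 12, 13, 9, 30, 5, tzinfo=timezone.utc))
bob = timestamp_id(datetime(2025, 12, 13, 9, 42, 17, tzinfo=timezone.utc))
print(alice, bob)
# 251213093005 251213094217  -> no coordination needed
```

One caveat: two specs created in the same second would still clash under this scheme. If that's a realistic risk for you, a short random suffix on the ID closes the gap.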

Key Takeaways

  1. All roads lead to Markdown. Every system uses Markdown-first specs. The industry has spoken.
  2. Three states are universal. Draft, Active, Done. Everyone landed here independently.
  3. ID strategy matters more than you think. Sequential IDs are fine solo; switch to timestamp IDs once several people create specs in parallel.
  4. Complexity estimation is uncommon. Taskmaster and AgentCMD are the only ones with it built-in. Worth considering for planning.
  5. AI optimization is emerging. Agent OS's spec-lite pattern is brilliant. I wouldn't be surprised if everyone adopts something similar.
  6. Anti-sycophancy is crucial. Review commands catch AI overconfidence. Without them, your agents will gaslight you into thinking they're done when they're absolutely not.

The best system is the one you'll actually use. Start simple, add complexity only when needed. And hey, stop vibe coding your production apps. Your future self will thank you.


Getting Started

Want to try AgentCMD's three-command system? Run npx agentcmd init in your project. It takes about 30 seconds.

If you're interested in trying it out or have questions, feel free to reach out on Twitter. And if you're also tired of helicopter-parenting your agents, welcome to the club.

