AI-driven spec-based development workflow
```bash
npm install dev-playbooks
```

Transform AI coding from "says it's done" to "proves it's done."
DevBooks is an engineering protocol for AI that uses upstream Single Source of Truth (SSOT), executable gates, and evidence loops to upgrade AI programming from "conversational guessing" to "auditable engineering delivery."
---
The essence of software engineering is building reliable systems on unreliable components. Traditional engineering uses RAID against unreliable disks, TCP retransmission against unreliable networks, and Code Review against unreliable human programmers. AI engineering likewise needs gates and evidence loops to constrain unreliable LLM outputs.
DevBooks is not prompt optimization. It's engineering constraints.
---
At the core of DevBooks is the Single Source of Truth (SSOT)—all critical knowledge persisted and versioned, stable across conversations and changes.
```
Your requirement docs (if any)
  ↓ Extract constraints, build index
specs/ (terms, boundaries, decisions, scenarios)  ← "project memory," stable across changes
  ↓ Derive change packages
changes/
  ↓ Archive writeback
specs/ (update truth)
```

Problems SSOT Solves:
| Problem | Root Cause | How SSOT Solves It |
|---------|------------|-------------------|
| Re-teach every time | Conversations are temporary, knowledge isn't persisted | Terms, boundaries, constraints written in files, not dependent on conversation memory |
| Forgets what you said earlier | Context window is limited, early info gets pushed out | Truth artifacts persisted, critical constraints force-injected |
| Don't know what changed | No auditable change record | Every change has complete record—proposal, design, tasks, evidence |
---
DevBooks continuously tracks delivery status through Completion Contracts and Requirements Index.
```yaml
obligations:
  - id: O-001
    describes: "User can log in via email"
    severity: must
checks:
  - id: C-001
    type: test
    covers: [O-001]
    artifacts: ["evidence/gates/login-test.log"]
```

Not "roughly done," but "all 5 obligations have evidence."
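A contract in this shape can be checked mechanically. The sketch below is a hypothetical helper (not the DevBooks CLI itself): it treats the parsed contract as a plain dict and reports every `must` obligation that lacks a check whose evidence artifacts actually exist on disk.

```python
from pathlib import Path

# In-memory mirror of a parsed completion.contract.yaml.
contract = {
    "obligations": [
        {"id": "O-001", "describes": "User can log in via email", "severity": "must"},
    ],
    "checks": [
        {"id": "C-001", "type": "test", "covers": ["O-001"],
         "artifacts": ["evidence/gates/login-test.log"]},
    ],
}

def uncovered_obligations(contract, root="."):
    """Return ids of `must` obligations with no check backed by existing evidence."""
    covered = set()
    for check in contract["checks"]:
        # A check only counts if every evidence artifact it cites exists on disk.
        if all(Path(root, a).is_file() for a in check["artifacts"]):
            covered.update(check["covers"])
    return [o["id"] for o in contract["obligations"]
            if o["severity"] == "must" and o["id"] not in covered]

# Until the test log is actually written, O-001 is reported as unproven.
print(uncovered_obligations(contract))
```

The point is the direction of the check: evidence files prove checks, checks cover obligations, and anything left over blocks completion.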
```yaml
set_id: ARCH-P3
source_ref: "truth://specs/architecture/design.md"
requirements:
  - id: R-001
    severity: must
    statement: "All APIs must support versioning"
  - id: R-002
    severity: should
    statement: "Response time < 200ms"
```
When a change package claims "upstream task completed," the system can judge that claim—not verbal confirmation, but machine verification.
---
What happens when you hand large requirements directly to AI? Fixes A, breaks B. Fixes B, breaks C.
The Knife protocol uses complexity budgets and topological sorting to slice Epics into independently verifiable atomic change package queues.
```
Score = w₁·Files + w₂·Modules + w₃·RiskFlags + w₄·HotspotWeight
```
| Signal | Weight | Description |
|--------|--------|-------------|
| files_touched | 1.0 | 1 point per file |
| modules_touched | 5.0 | Cross-module risk is high |
| risk_flags | 10.0 | 10 points per risk flag |
| hotspot_weight | 2.0 | High churn areas weighted |
Over budget means slice again—no "forcing through."
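The score is a plain weighted sum over the signals in the table. A minimal sketch, with the weights from the table and an assumed (illustrative) budget threshold:

```python
# Weights taken from the table above; BUDGET is an illustrative threshold.
WEIGHTS = {"files_touched": 1.0, "modules_touched": 5.0,
           "risk_flags": 10.0, "hotspot_weight": 2.0}
BUDGET = 40.0  # assumed: over this, the slice must be cut again

def complexity_score(signals):
    """Weighted sum of the slicing signals for one candidate slice."""
    return sum(WEIGHTS[name] * value for name, value in signals.items())

slice_signals = {"files_touched": 8, "modules_touched": 2,
                 "risk_flags": 1, "hotspot_weight": 3}
score = complexity_score(slice_signals)  # 8 + 10 + 10 + 6 = 34
print(score, "over budget" if score > BUDGET else "within budget")
```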
1. MECE Coverage: Union of all slice acceptance criteria equals Epic's complete set, no overlap
2. Independently Green: Each slice has at least one deterministic verification anchor, no "intermediate state won't compile"
3. Topologically Sortable: Dependency graph must be acyclic, execution order must be topological
4. Budget Circuit Breaker: Over budget must recursively re-slice, or flow back for more info
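The MECE condition (invariant 1) reduces to a set check: the slices' acceptance-criterion sets must be pairwise disjoint and their union must equal the Epic's full set. A sketch with assumed data shapes (criterion ids are hypothetical):

```python
def check_mece(epic_criteria, slices):
    """slices: mapping of slice id -> set of acceptance-criterion ids.
    MECE holds when the sets are pairwise disjoint (no overlap) and
    their union is exactly the Epic's complete criteria set (no gap)."""
    union = set()
    for ids in slices.values():
        if union & ids:          # overlap: a criterion claimed by two slices
            return False
        union |= ids
    return union == epic_criteria

epic = {"AC-1", "AC-2", "AC-3", "AC-4"}
ok = {"S1": {"AC-1", "AC-2"}, "S2": {"AC-3", "AC-4"}}
gap = {"S1": {"AC-1"}, "S2": {"AC-3", "AC-4"}}  # AC-2 uncovered
print(check_mece(epic, ok), check_mece(epic, gap))  # True False
```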
When a Knife Plan contains multiple Slices, you can generate a parallel execution schedule:
```bash
knife-parallel-schedule.sh
```
Output contents:
- Maximum Parallelism: Max number of Agents that can start simultaneously
- Layered Execution Schedule: Layer 0 (no deps) → Layer 1 → Layer N
- Critical Path: Serial dependency depth
- Launch Command Templates: Agent launch command for each Slice
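Because invariant 3 guarantees an acyclic dependency graph, a layered schedule falls out of Kahn-style topological layering: slices in the same layer can run in parallel, maximum parallelism is the widest layer, and the critical path is the number of layers. A sketch, not the actual script, with hypothetical slice names:

```python
def layered_schedule(deps):
    """deps: mapping of slice -> set of slices it depends on (must be a DAG).
    Returns layers; all slices within a layer can start simultaneously."""
    remaining = {s: set(d) for s, d in deps.items()}
    layers = []
    while remaining:
        ready = sorted(s for s, d in remaining.items() if not d)
        if not ready:
            raise ValueError("dependency cycle: graph is not topologically sortable")
        layers.append(ready)
        for s in ready:
            del remaining[s]
        for d in remaining.values():
            d.difference_update(ready)  # peel the finished layer off the graph
    return layers

deps = {"S1": set(), "S2": set(), "S3": {"S1"},
        "S4": {"S1", "S2"}, "S5": {"S3", "S4"}}
layers = layered_schedule(deps)
print(layers)                                     # [['S1', 'S2'], ['S3', 'S4'], ['S5']]
print("max parallelism:", max(map(len, layers)))  # 2
print("critical path:", len(layers))              # 3
```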
#### Autoflow (Beta): Control Plane + tmux/worktree Script Generation
To reduce manual overhead ("open windows / copy-paste / resume"), use the CLI to generate a control plane and helper scripts (safe-by-default; does not execute AI):
```bash
# Generate schedule + dashboard + runbook (derived cache; safe to delete/rebuild)
dev-playbooks autoflow --epic
```
Notes:
- Autoflow only generates guidance and scripts. It does not auto-run external AI CLIs, bypass permissions, or auto-merge.
- Test Owner and Coder must still run in isolated sessions/instances. For parallel work, use worktrees (a helper script is generated in Beta mode).
---
7 Gates: Full-Chain Judgeable Checkpoints
| Gate | What It Checks | Failure Consequence |
|------|----------------|---------------------|
| G0 | Is input ready? Are baseline artifacts complete? | Flow back to Bootstrap |
| G1 | Are all required files present? Is structure correct? | Block |
| G2 | Are all tasks complete? Does green evidence exist? | Block |
| G3 | Is slicing correct? Are anchors complete? (large requests) | Flow back to Knife |
| G4 | Are docs in sync? Are extension packs complete? | Block |
| G5 | Is risk covered? Is rollback strategy present? (high-risk) | Block |
| G6 | Is evidence complete? Is contract satisfied? Ready to archive? | Block |
Any failure blocks the flow. Not a warning. A block.
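The "block, not warn" semantics amount to failing fast on the first gate that returns false. A minimal sketch of that control flow; the two gate predicates are stand-ins, not the real DevBooks checks:

```python
class GateFailure(Exception):
    """A failed gate blocks the flow; it is never downgraded to a warning."""

def run_gates(gates, change):
    for name, check in gates:
        if not check(change):
            raise GateFailure(f"{name} failed: flow blocked")
    return True

# Stand-in predicates sketching two of the seven gates.
gates = [
    ("G1 structure", lambda c: all(f in c["files"] for f in
                                   ("proposal.md", "design.md", "tasks.md"))),
    ("G2 evidence",  lambda c: bool(c["evidence"])),
]

change = {"files": ["proposal.md", "design.md", "tasks.md"], "evidence": []}
try:
    run_gates(gates, change)
except GateFailure as e:
    print(e)  # G2 evidence failed: flow blocked
```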
---
Role Isolation: Prevent AI from Validating Itself
| Role | Responsibility | Hard Constraint |
|------|---------------|-----------------|
| Test Owner | Derive acceptance tests from design | Cannot see implementation code |
| Coder | Implement features per tasks | Cannot modify tests/ |
| Reviewer | Review readability and consistency | Cannot change tests or design |
Test Owner and Coder must execute in different contexts—not "different people," but "different conversations/instances." They can only exchange information through persisted artifacts.
---
Quick Start
```bash
npm install -g dev-playbooks
dev-playbooks init
```

Then, in your AI tool chat, type (the single entry point):

```text
/devbooks:delivery
```

> Note: `/devbooks:delivery` maps to the Router Skill `devbooks-start`. It runs "demand alignment → SSOT staged advancement," then routes `request_kind` and orchestrates the minimal sufficient closed loop.
>
> Optional: view entry guidance in the terminal (does not run AI): `dev-playbooks delivery`

```
Your request
  ↓
Start (demand alignment → SSOT staging → route request_kind → generate RUNBOOK)
  ↓
┌─────────────────────────────────┐
│ Small change → Execute directly │
│ Large request → Slice first     │
│ Uncertain → Research first      │
└─────────────────────────────────┘
  ↓
Gate checks (7 checkpoints, any failure blocks)
  ↓
Evidence archive (test logs, build outputs, approvals)
```

---
Directory Structure
```
project/
├── .devbooks/config.yaml # Config entry point
└── dev-playbooks/
├── constitution.md # Hard constraints (non-bypassable rules)
├── specs/ # Truth source (SSOT)
│ ├── ssot/ # Project SSOT pack (requirements layer)
│ │ ├── SSOT.md
│ │ ├── requirements.index.yaml
│ │ └── requirements.ledger.yaml # Derived cache (discardable/rebuildable)
│ ├── _meta/
│ │ ├── glossary.md # Unified language
│ │ ├── boundaries.md # Module boundaries
│ │ ├── capabilities.yaml # Capability registry
│ │ └── epics/ # Knife slice plans
│ └── ...
└── changes/ # Change packages
└── <change-id>/ # One change package
├── proposal.md # Why and what
├── design.md # How, acceptance criteria
├── tasks.md # Executable steps
├── completion.contract.yaml # Completion contract
├── verification.md # How to prove it's correct
└── evidence/ # Test logs, build outputs
```

---
- Brownfield Onboarding: Auto-index existing docs, extract judgeable constraints, establish minimal SSOT package
- Greenfield Projects: Guide completion of terms, boundaries, scenarios, decisions to establish baseline
- Daily Changes: Minimal sufficient loop with reproducible verification anchors + evidence archive
- Large Refactors: Knife slicing + migration patterns (Expand-Contract / Strangler Fig / Branch by Abstraction)
---
- Quick Start
- AI-Native Workflow
- Skill Reference
---
MIT