Milhouse CLI

AI coding orchestrator. Diagnoses, plans, and executes correct work with evidence-based verification.
Milhouse is neither Bart (Auto Vibe Coder) nor Ralph (Auto Loop Coder): he is a correctness-only QA, planner, and problem solver.
Milhouse doesn’t invent new architecture like Bart, and he doesn’t just execute and move on like Ralph.
Milhouse verifies with evidence, aligns code with the real environment, and turns issues into safe, one-commit tasks with a clear Definition of Done (DoD) and explicit dependencies.

Installation


- Node.js >= 18.0.0
- pnpm >= 9.0.0 (for development)
- Bun (for building binaries)

```bash
# Install dependencies
pnpm install

# Run in development mode
pnpm dev

# Run tests
pnpm test

# Build binaries
pnpm build
```

This project uses pnpm for package management and Bun for:

- Running TypeScript directly in development
- Building cross-platform binaries
- Running tests (bun:test)

```bash
# Build all platforms
pnpm build

# Build a specific platform
pnpm build:linux
pnpm build:mac-arm
pnpm build:mac-x64
pnpm build:windows
```
Install globally with npm:

```bash
npm install -g milhouse-cli
```

Three Modes


Just tell it what to do:

```bash
milhouse "add dark mode"
milhouse "fix the auth bug"
```


Work through a PRD:

```bash
milhouse              # uses PRD.md
milhouse --prd tasks.md
```


Multi-agent investigation and execution:

```bash
milhouse --scan --scope "frontend zustand"  # Creates isolated run
milhouse --validate                          # Validate issues
milhouse --plan                              # Generate tasks
milhouse --consolidate                       # Merge plans
milhouse --exec --exec-by-issue              # Execute grouped by issue (recommended!)
milhouse --verify                            # Verify results

# Or run the full pipeline (uses --exec-by-issue automatically)
milhouse --run
```


Investigation Pipeline

A six-phase pipeline with specialized AI agents:

| Phase | Agent | Description |
|-------|-------|-------------|
| scan | LI (Lead Investigator) | Scans codebase, identifies issues |
| validate | IV (Issue Validators) | Validates with probes |
| plan | PL (Planners) | Generates WBS per issue |
| consolidate | CO (Consolidator) | Merges into unified plan |
| exec | EX (Executors) | Executes tasks |
| verify | VE (Verifiers) | Runs verification gates |
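
The phase order in the table can be modelled as a fixed list. The sketch below is only an illustration: the phase names come from the table, while `PHASES` and `phasesFrom` are hypothetical names, not part of Milhouse's actual code.

```typescript
// Phase names are taken from the table; the helper itself is a hypothetical sketch.
const PHASES = ["scan", "validate", "plan", "consolidate", "exec", "verify"] as const;
type Phase = (typeof PHASES)[number];

// Return the phases still to run, starting at `from` (inclusive).
function phasesFrom(from: Phase): Phase[] {
  return PHASES.slice(PHASES.indexOf(from));
}

console.log(phasesFrom("plan").join(" -> ")); // plan -> consolidate -> exec -> verify
```

A resume-style run could call such a helper with whichever phase it needs to restart from; how `--resume` actually picks that phase is not described here.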
Each scan creates isolated state:

```bash
milhouse --scan --scope "frontend"   # Creates run-abc
milhouse --scan --scope "backend"    # Creates run-def

milhouse runs list                   # List all runs
milhouse runs switch run-abc         # Switch active run
milhouse runs info                   # Show current run
milhouse runs delete run-def         # Delete a run
```

Project Config

Optional. Stores rules the AI must follow.

```bash
milhouse --init      # auto-detects project settings
milhouse --config    # view config
milhouse --add-rule "use TypeScript strict mode"
```

Creates .milhouse/config.yaml:

```yaml
project:
  name: "my-app"
  language: "TypeScript"
  framework: "Next.js"

commands:
  test: "npm test"
  lint: "npm run lint"
  build: "npm run build"

rules:
  - "use server actions not API routes"
  - "follow error pattern in src/utils/errors.ts"

boundaries:
  never_touch:
    - "src/legacy/**"
    - "*.lock"
```
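
For readers who want to consume this file from their own tooling, the shape of the example config can be written down as a type. This is a sketch under assumptions: `MilhouseConfig`, `exampleConfig`, and `formatRules` are hypothetical names, the fields mirror only the example file above, and the real schema may contain more.

```typescript
// Hypothetical typing of the example YAML; field names mirror the example file only.
interface MilhouseConfig {
  project: { name: string; language: string; framework: string };
  commands: { test: string; lint: string; build: string };
  rules: string[];
  boundaries: { never_touch: string[] };
}

// The example config as a typed object, as a YAML parser might return it.
const exampleConfig: MilhouseConfig = {
  project: { name: "my-app", language: "TypeScript", framework: "Next.js" },
  commands: { test: "npm test", lint: "npm run lint", build: "npm run build" },
  rules: [
    "use server actions not API routes",
    "follow error pattern in src/utils/errors.ts",
  ],
  boundaries: { never_touch: ["src/legacy/**", "*.lock"] },
};

// Illustrative only: render the rules as a numbered list, one per line.
function formatRules(config: MilhouseConfig): string {
  return config.rules.map((rule, i) => `${i + 1}. ${rule}`).join("\n");
}

console.log(formatRules(exampleConfig));
```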

AI Engines

```bash
milhouse              # Claude Code (default)
milhouse --opencode   # OpenCode
milhouse --cursor     # Cursor
milhouse --codex      # Codex
milhouse --qwen       # Qwen-Code
milhouse --droid      # Factory Droid
```

Override the model for any engine with `--model`:

```bash
milhouse --model sonnet "add feature"   # use sonnet with Claude
milhouse --sonnet "add feature"         # shortcut for above
milhouse --opencode --model opencode/glm-4.7-free "task"
```

Task Sources

Markdown file (default):

```bash
milhouse --prd PRD.md
```

Markdown folder (for large projects):

```bash
milhouse --prd ./prd/
```

Reads all .md files in the folder and aggregates tasks.

YAML:

```bash
milhouse --yaml tasks.yaml
```

GitHub Issues:

```bash
milhouse --github owner/repo
milhouse --github owner/repo --github-label "ready"
```

Parallel Execution

Execute tasks grouped by issue (recommended):

```bash
milhouse --exec --exec-by-issue                    # Each issue in its own worktree
milhouse --exec --exec-by-issue --max-parallel 3   # 3 issues in parallel
```

How it works:

- Groups all tasks by their parent issue
- Each issue runs in an isolated worktree with a dedicated Claude agent
- Agent receives: issue details + validation report + WBS plan + all tasks
- Agent completes ALL tasks for that issue in one session
- Branches auto-merge back after completion

Benefits:

- Better context: the agent has full issue context, not just a single task
- Fewer context switches: one agent handles related tasks together
- Faster overall: ~5 minutes per issue vs ~5 minutes per task
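
The grouping step described above can be sketched as follows. This is a minimal illustration, not Milhouse's implementation: the `Task` shape, `groupTasksByIssue`, and `planWaves` are hypothetical, and splitting issues into fixed waves is a simplification (a real scheduler could start the next issue as soon as a slot frees up).

```typescript
// Hypothetical task shape: field names are assumptions for illustration.
interface Task {
  id: string;
  issueId: string;
  title: string;
}

// Group tasks by parent issue, preserving the order in which issues first appear.
function groupTasksByIssue(tasks: Task[]): Map<string, Task[]> {
  const groups = new Map<string, Task[]>();
  for (const task of tasks) {
    const group = groups.get(task.issueId);
    if (group) {
      group.push(task);
    } else {
      groups.set(task.issueId, [task]);
    }
  }
  return groups;
}

// Split issue IDs into waves of at most maxParallel issues (simplified scheduling).
function planWaves(issueIds: string[], maxParallel: number): string[][] {
  if (!Number.isInteger(maxParallel) || maxParallel < 1) {
    throw new Error("maxParallel must be a positive integer");
  }
  const waves: string[][] = [];
  for (let i = 0; i < issueIds.length; i += maxParallel) {
    waves.push(issueIds.slice(i, i + maxParallel));
  }
  return waves;
}

const tasks: Task[] = [
  { id: "T-1", issueId: "P-aaa", title: "Fix null check" },
  { id: "T-2", issueId: "P-bbb", title: "Add store selector" },
  { id: "T-3", issueId: "P-aaa", title: "Add regression test" },
  { id: "T-4", issueId: "P-ccc", title: "Update types" },
  { id: "T-5", issueId: "P-ddd", title: "Remove dead code" },
];

const groups = groupTasksByIssue(tasks);
const waves = planWaves(Array.from(groups.keys()), 3);
console.log(waves); // two waves: three issues, then one
```

With five tasks spread over four issues and a limit of 3, each agent would receive every task for its issue at once, which is the "better context" benefit listed above.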

Run tasks in parallel per task (legacy):

```bash
milhouse --parallel                    # 3 agents default
milhouse --parallel --max-parallel 5   # 5 agents
```

Each agent gets an isolated worktree and branch.

- Without `--create-pr`: auto-merges back with AI conflict resolution.
- With `--create-pr`: keeps branches, creates PRs.
- With `--no-merge`: keeps branches without merging.

Branch Workflow

```bash
milhouse --branch-per-task               # branch per task
milhouse --branch-per-task --create-pr   # + create PRs
milhouse --branch-per-task --draft-pr    # + draft PRs
```

Browser Automation

Milhouse supports browser automation via agent-browser for testing web UIs.

```bash
milhouse "add login form" --browser     # enable browser automation
milhouse "fix checkout" --no-browser    # disable browser automation
```

When enabled (and agent-browser is installed), the AI can:

- Open URLs and navigate pages
- Click elements and fill forms
- Take screenshots for verification
- Test web UI changes after implementation

Issue Filtering

Milhouse supports filtering issues by ID and severity level at any pipeline stage.

Filter by issue ID:

```bash
# Process only specific issues
milhouse --validate --issues P-xxx,P-yyy,P-zzz

# Exclude specific issues
milhouse --plan --exclude-issues P-xxx
```

Filter by severity:

```bash
# Process only CRITICAL and HIGH severity issues
milhouse --validate --severity CRITICAL,HIGH

# Process issues with severity HIGH or above
milhouse --run --min-severity HIGH
```


Severity levels in order of priority:

1. CRITICAL (highest priority)
2. HIGH
3. MEDIUM
4. LOW (lowest priority)

Filters can be combined (AND logic):

```bash
# Validate specific issues that are also HIGH+ severity
milhouse --validate --issues P-xxx,P-yyy --min-severity HIGH
```
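
To make the AND semantics concrete, here is a minimal sketch of how the four filter flags could combine. It is an illustration under assumptions: the `Issue` and `IssueFilter` types and the `filterIssues` function are hypothetical and not taken from Milhouse's source; only the flag names and the severity order come from this README.

```typescript
// Hypothetical types: the real Milhouse issue model is not shown in this README.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Issue {
  id: string;
  severity: Severity;
}

interface IssueFilter {
  issues?: string[];         // --issues
  excludeIssues?: string[];  // --exclude-issues
  severity?: Severity[];     // --severity
  minSeverity?: Severity;    // --min-severity
}

// Lower rank means higher priority, matching the documented order.
const SEVERITY_RANK: Record<Severity, number> = {
  CRITICAL: 0,
  HIGH: 1,
  MEDIUM: 2,
  LOW: 3,
};

// An issue is kept only if it passes every filter that was supplied (AND logic).
function filterIssues(all: Issue[], filter: IssueFilter): Issue[] {
  return all.filter((issue) => {
    if (filter.issues && !filter.issues.includes(issue.id)) return false;
    if (filter.excludeIssues && filter.excludeIssues.includes(issue.id)) return false;
    if (filter.severity && !filter.severity.includes(issue.severity)) return false;
    if (
      filter.minSeverity &&
      SEVERITY_RANK[issue.severity] > SEVERITY_RANK[filter.minSeverity]
    ) {
      return false;
    }
    return true;
  });
}

const sample: Issue[] = [
  { id: "P-xxx", severity: "CRITICAL" },
  { id: "P-yyy", severity: "MEDIUM" },
  { id: "P-zzz", severity: "HIGH" },
];

// Mirrors: milhouse --validate --issues P-xxx,P-yyy --min-severity HIGH
const kept = filterIssues(sample, { issues: ["P-xxx", "P-yyy"], minSeverity: "HIGH" });
console.log(kept.map((i) => i.id)); // only P-xxx passes both filters
```

In this sample, P-yyy is named in `--issues` but dropped because MEDIUM is below HIGH, and P-zzz is HIGH but dropped because it was not named.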


Options

| Flag | What it does |
|------|--------------|
| **Pipeline** | |
| `--scan` | Run Lead Investigator |
| `--scope FOCUS` | Focus scan on specific area |
| `--validate` | Validate issues with probes |
| `--plan` | Generate WBS |
| `--consolidate` | Merge into execution plan |
| `--exec` | Execute tasks |
| `--verify` | Run verification gates |
| `--run` | Run full pipeline |
| `--resume` | Resume from last phase |
| **Issue Filtering** | |
| `--issues IDS` | Comma-separated issue IDs to process |
| `--exclude-issues IDS` | Comma-separated issue IDs to exclude |
| `--severity LEVELS` | Filter by severity (CRITICAL,HIGH,MEDIUM,LOW) |
| `--min-severity LEVEL` | Minimum severity level to process |
| **Tasks** | |
| `--prd PATH` | task file or folder (auto-detected, default: PRD.md) |
| `--yaml FILE` | YAML task file |
| `--github REPO` | use GitHub issues |
| `--github-label TAG` | filter issues by label |
| **Engine** | |
| `--model NAME` | override model for any engine |
| `--sonnet` | shortcut for `--claude --model sonnet` |
| **Execution** | |
| `--parallel` | run tasks in parallel (legacy, per-task) |
| `--exec-by-issue` | execute tasks grouped by issue (recommended!) |
| `--max-parallel N` | max parallel agents/issues (default: 3) |
| `--no-merge` | skip auto-merge in parallel mode |
| `--branch-per-task` | branch per task |
| `--base-branch BRANCH` | base branch for PRs |
| `--create-pr` | create PRs |
| `--draft-pr` | draft PRs |
| `--worktrees` | force worktree isolation |
| `--exec-fail-fast` | stop on first task failure |
| **Testing** | |
| `--no-tests` | skip tests |
| `--no-lint` | skip lint |
| `--fast` | skip tests + lint |
| `--no-commit` | don't auto-commit |
| `--browser` | enable browser automation |
| `--no-browser` | disable browser automation |
| **General** | |
| `--max-iterations N` | stop after N tasks |
| `--max-retries N` | retries per task (default: 3) |
| `--retry-delay N` | delay between retries in seconds (default: 5) |
| `--dry-run` | preview only |
| `-v, --verbose` | debug output |
| `--init` | setup .milhouse/ config |
| `--config` | show config |
| `--add-rule "rule"` | add rule to config |

Requirements

- Node.js 18+ or Bun
- AI CLI: Claude Code, OpenCode, Cursor, Codex, Qwen-Code, or Factory Droid
- `gh` (optional, for GitHub issues / `--create-pr`)

Links

- GitHub
- Discord

License

MIT
