Long Task Harness for AI agents - task/feature-driven development with external memory
npm install agent-foreman> Stop AI agents from half-building features. Ship complete code in one session.


AI coding agents face three common failure modes:
1. Doing too much at once - Trying to complete everything in one session
2. Premature completion - Declaring victory before features actually work
3. Superficial testing - Not thoroughly validating implementations
agent-foreman provides a structured harness that enables AI agents to:
- Maintain external memory via structured files
- Work on one feature at a time with clear acceptance criteria
- Hand off cleanly between sessions via progress logs
- Track impact of changes on other features
---
``bash`
/plugin install agent-foreman # 1. Install
/agent-foreman:init Build auth API # 2. Initialize
/agent-foreman:run # 3. Let AI work
---
`bashQuick install (binary)
curl -fsSL https://raw.githubusercontent.com/mylukin/agent-foreman/main/scripts/install.sh | bash
Manual download: GitHub Releases
---
Usage
$3
`
/plugin marketplace add mylukin/agent-foreman
/plugin install agent-foreman
`| Command | Description |
|---------|-------------|
|
/agent-foreman:status | View project status and progress |
| /agent-foreman:init | Initialize harness with project goal |
| /agent-foreman:analyze | Analyze existing project structure |
| /agent-foreman:spec | Transform requirements into tasks |
| /agent-foreman:next | Get next priority task |
| /agent-foreman:run | Auto-complete all pending tasks |Transform requirements into tasks:
`
/agent-foreman:spec Build a user authentication system
``
Requirement → [PM→UX→Tech→QA] → Spec Files → BREAKDOWN Tasks → /run → Implementation
`
CLI Commands (standalone)
For standalone CLI usage without Claude Code:
| Command | Description |
|---------|-------------|
|
init [goal] | Initialize or upgrade the harness |
| next [feature_id] | Show next feature to work on |
| status | Show current project status |
| check [feature_id] | Verify code changes or task completion |
| done | Verify, mark complete, and auto-commit |
| fail | Mark a task as failed |
| impact | Analyze impact of changes |
| tdd [mode] | View or set TDD mode |
| agents | Show available AI agents |
| install | Install Claude Code plugin |
| uninstall | Uninstall Claude Code plugin |---
Workflow
`
next → implement → check → done → repeat
`| Step | Command | What Happens |
|------|---------|--------------|
| 1 |
next | Get task with acceptance criteria |
| 2 | implement | Write code to satisfy criteria |
| 3 | check | Verify implementation |
| 4 | done | Mark complete, auto-commit |---
Best Practices
1. One feature at a time - Complete before switching
2. Update status promptly - Mark passing when criteria met
3. Review impact - Run impact analysis after changes
4. Clean commits - One feature = one atomic commit
5. Read first - Always check feature list and progress log
---
Reference
Core Files
| File | Purpose |
|------|---------|
|
ai/tasks/index.json | Task index with status summary |
| ai/tasks/{module}/{id}.md | Individual task definitions |
| ai/progress.log | Session handoff audit log |
| ai/init.sh | Environment bootstrap script |
| CLAUDE.md | AI agent instructions |
Status Values
| Status | Meaning |
|--------|---------|
|
failing | Not yet implemented |
| passing | Acceptance criteria met |
| blocked | External dependency blocking |
| needs_review | May be affected by changes |
| failed | Verification failed |
| deprecated | No longer needed |
Why It Works
AI agents need the same tooling that makes human teams effective:
| Human Practice | AI Equivalent |
|----------------|---------------|
| Scrum board |
ai/tasks/index.json |
| Sprint notes | progress.log |
| CI/CD pipeline | init.sh check` |---
MIT
Lukin (@mylukin)
---
Inspired by Anthropic's blog post: Effective harnesses for long-running agents