A self-improving AI development agent based on the Two-Loop Paradigm

---
> "Ophan" references the biblical Ophanim ("wheels within wheels" from Ezekiel's vision), representing nested loops that adapt autonomously while observing and learning continuously.

Most AI coding agents plateau: they fix bugs and generate code but repeat the same mistakes. Ophan solves this by separating Guidelines (how to work) from Criteria (what good looks like).

## Install

```bash
npm install -g ophan
```

## Project Structure

After running `ophan init`, your project will have:

```
my-project/
├── OPHAN.md                    # Agent entry point
├── .ophan.yaml                 # Configuration
├── .ophan/
│   ├── guidelines/             # Agent CAN edit
│   │   ├── coding.md
│   │   ├── testing.md
│   │   ├── context.md          # Context compilation patterns
│   │   └── learnings.md
│   ├── criteria/               # Agent CANNOT edit (protected)
│   │   ├── quality.md
│   │   ├── security.md
│   │   └── context-quality.md  # Context agent metrics
│   ├── logs/                   # Task execution logs
│   ├── context-logs/           # Context usage logs
│   ├── digests/                # Outer loop reports
│   └── state.json              # Runtime state
└── [your project files]
```

## Key Concepts

### Guidelines

- Workflows and decision trees
- Data structures and templates
- Constraints and failure detection
- Agent can freely update these based on learnings

### Criteria

- Evaluation standards
- Analytical methods
- Comparative context
- Failure patterns
- Only humans can approve changes (prevents reward hacking)
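
The split between editable guidelines and protected criteria can be pictured as a path check applied before any agent write. The sketch below is illustrative only: `canWrite`, `normalizeRelative`, and the `Actor` type are hypothetical names, not Ophan's documented API, and the directory names are taken from the project structure listed earlier.

```typescript
// Hypothetical sketch: block agent writes to the protected criteria directory.
// None of these names come from Ophan itself; they only illustrate the idea.

type Actor = "agent" | "human";

/** Collapse ".", ".." and repeated slashes in a project-relative path. */
function normalizeRelative(p: string): string {
  const out: string[] = [];
  for (const part of p.replace(/\\/g, "/").split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") out.pop();
    else out.push(part);
  }
  return out.join("/");
}

/** Humans may write anywhere; the agent may write anywhere except criteria. */
function canWrite(actor: Actor, relativePath: string): boolean {
  if (actor === "human") return true;
  const normalized = normalizeRelative(relativePath);
  const isCriteria =
    normalized === ".ophan/criteria" || normalized.startsWith(".ophan/criteria/");
  return !isCriteria;
}
```

Normalizing before the check matters: without it, a path such as `.ophan/guidelines/../criteria/security.md` would slip past a plain prefix comparison.
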

The outer loop requires human oversight to approve criteria changes. This prevents the agent from lowering its own standards to achieve easier "success."

## Configuration

See `.ophan.yaml` for all options:

```yaml
model:
  name: claude-sonnet-4-20250514
  maxTokens: 4096

innerLoop:
  maxIterations: 5
  regenerationStrategy: informed # full | informed | incremental
  costLimit: 0.50

outerLoop:
  triggers:
    afterTasks: 10
  minOccurrences: 3
  minConfidence: 0.7
  lookbackDays: 30
  learnings:
    maxCount: 50
    retentionDays: 90
    promotionThreshold: 3
    similarityThreshold: 0.9

escalations:
  webhooks:
    - name: slack-alerts
      url: ${SLACK_WEBHOOK_URL}
      events: [escalation, digest]
```

## Commands

| Command | Description |
|---------|-------------|
| `ophan init` | Initialize Ophan in the current project |
| `ophan task "` | Run a task through the inner loop |
| `ophan review` | Run the outer loop (pattern detection) |
| `ophan status` | Show metrics and status |
| `ophan logs` | View recent task logs |
| `ophan ui` | Open the web UI for configuration and monitoring |
| `ophan approve` | Approve a criteria change proposal |
| `ophan context-stats` | View context usage statistics |

### Command Options

`ophan init`

- `-t, --template`: Template to use (base, typescript, python)
- `-f, --force`: Overwrite existing configuration
- `-y, --yes`: Skip confirmation prompts
- `-p, --project`: Path to the project directory

`ophan task`

- `-n, --dry-run`: Show what would be done without executing
- `-m, --max-iterations`: Override max iterations
- `-p, --project`: Path to the project directory

`ophan review`

- `-f, --force`: Run even if task threshold not reached
- `--auto`: Auto-approve guideline changes (criteria still require approval)
- `--non-interactive`: Skip interactive review, save proposals to pending
- `--pending`: Review pending proposals from previous runs
- `-p, --project`: Path to the project directory

`ophan context-stats`

- `-d, --days`: Number of days to analyze (default: 30)
- `--json`: Output as JSON
- `-p, --project`: Path to the project directory

`ophan logs`

- `-l, --limit`: Number of logs to show (default: 10)
- `-p, --project`: Path to the project directory
- `--json`: Output as JSON

`ophan ui`

- `-p, --port`: Port to run the server on (default: 4040)
- `--no-open`: Do not open browser automatically
- `--project`: Path to the project directory

## Web UI

Ophan includes a lightweight web dashboard for viewing status and editing configuration.

```bash
# Start the UI (opens browser automatically)
ophan ui

# Start on a different port
ophan ui --port 8080

# Start without opening browser
ophan ui --no-open
```

The UI provides:

- Dashboard: View task metrics, success rates, and costs
- Task Logs: Browse and search task execution history
- Configuration: Edit settings with form-based interface
- Guidelines/Criteria: View current guidelines and criteria files
- Digests: Read outer loop review reports

## Escalations & Webhooks

Ophan can send notifications when tasks escalate (hit max iterations, exceed cost limits, etc.) or when outer loop digests are generated.

### Event Types

- `escalation`: Task failed to converge
- `task_complete`: Task finished (success or failure)
- `digest`: Outer loop review completed

### Payload Format

```json
{
  "type": "escalation",
  "timestamp": "2024-01-15T10:30:00Z",
  "task": {
    "id": "task-20240115-103000-abc1",
    "description": "fix the login bug",
    "iterations": 5,
    "maxIterations": 5
  },
  "reason": "max_iterations",
  "context": {
    "lastError": "Test failed: expected 200, got 401",
    "suggestedAction": "Review task complexity or improve guidelines"
  },
  "project": {
    "name": "my-app",
    "path": "/Users/dev/my-app"
  }
}
```
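
A webhook receiver may want to validate an incoming body before acting on it. The sketch below transcribes the escalation example above into a TypeScript interface and adds a runtime check. The interface and the helpers `isEscalationPayload` and `summarize` are hypothetical names, and the payload shape for the other event types is not documented here, so only `escalation` is covered.

```typescript
// Shape transcribed from the escalation example payload; all names are illustrative.
interface EscalationPayload {
  type: "escalation";
  timestamp: string;
  task: { id: string; description: string; iterations: number; maxIterations: number };
  reason: string;
  context: { lastError: string; suggestedAction: string };
  project: { name: string; path: string };
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

/** Runtime check for the fields a receiver would rely on. */
function isEscalationPayload(value: unknown): value is EscalationPayload {
  if (!isRecord(value) || value.type !== "escalation") return false;
  const { task, context, project } = value;
  return (
    typeof value.timestamp === "string" &&
    typeof value.reason === "string" &&
    isRecord(task) &&
    typeof task.id === "string" &&
    typeof task.description === "string" &&
    typeof task.iterations === "number" &&
    typeof task.maxIterations === "number" &&
    isRecord(context) &&
    typeof context.lastError === "string" &&
    isRecord(project) &&
    typeof project.name === "string"
  );
}

/** Build a one-line summary suitable for a chat notification. */
function summarize(p: EscalationPayload): string {
  return (
    `[${p.project.name}] "${p.task.description}" escalated (${p.reason}) after ` +
    `${p.task.iterations}/${p.task.maxIterations} iterations: ${p.context.lastError}`
  );
}
```

Checking the body at the boundary keeps the rest of a receiver free of defensive property access.
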

### Environment Variables

Use `${VAR_NAME}` syntax for secrets:

```yaml
escalations:
  webhooks:
    - name: slack
      url: ${SLACK_WEBHOOK_URL}
      headers:
        Authorization: Bearer ${AUTH_TOKEN}
      events: [escalation]
```

## Prerequisites

Ophan uses Claude Code (subscription-based) for task execution:

1. Install Claude Code CLI
2. Authenticate with your Claude subscription

| Variable | Required | Description |
|----------|----------|-------------|
| Webhook URLs/tokens | No | As configured in `.ophan.yaml` |

## How It Works

### Context Agent

Ophan includes a self-improving context agent that learns which files are relevant for different tasks. After each task, it tracks:
- Hit Rate: % of provided files that were actually used (target: >70%)
- Miss Rate: % of used files that weren't provided (target: <20%)
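
Both metrics follow directly from two file lists: the files handed to the agent and the files it actually used. A minimal sketch, assuming each is available as an array of paths (the function and type names are hypothetical):

```typescript
interface ContextMetrics {
  hitRate: number;  // share of provided files that were used
  missRate: number; // share of used files that were not provided
}

/** Compute hit and miss rates as fractions in [0, 1]; duplicates are ignored. */
function contextMetrics(provided: string[], used: string[]): ContextMetrics {
  const providedSet = new Set(provided);
  const usedSet = new Set(used);
  const uniqueProvided = Array.from(providedSet);
  const uniqueUsed = Array.from(usedSet);
  const hits = uniqueProvided.filter((f) => usedSet.has(f)).length;
  const misses = uniqueUsed.filter((f) => !providedSet.has(f)).length;
  return {
    hitRate: uniqueProvided.length === 0 ? 0 : hits / uniqueProvided.length,
    missRate: uniqueUsed.length === 0 ? 0 : misses / uniqueUsed.length,
  };
}

/** Targets stated above: hit rate over 70%, miss rate under 20%. */
function meetsTargets(m: ContextMetrics): boolean {
  return m.hitRate > 0.7 && m.missRate < 0.2;
}
```

For example, providing four files of which two are used, while the agent also opens one file that was not provided, gives a hit rate of 50% and a miss rate of about 33%, so neither target is met.
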

Over time, the context agent proposes updates to context guidelines based on usage patterns. View statistics with `ophan context-stats`.

### Inner Loop

1. Context Building: Loads guidelines, criteria, and previous learnings
2. Agent Execution: Claude executes the task using available tools
3. Evaluation: Output is evaluated against criteria and dev tools (tests, lint, build)
4. Learning: If evaluation fails, learnings are extracted
5. Regeneration: Agent regenerates with updated understanding
6. Convergence: Repeats until evaluation passes or max iterations reached
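
The steps above can be sketched as a bounded retry loop. This is a simplified, synchronous illustration rather than Ophan's implementation: `runInnerLoop`, `LoopDeps`, and the `cost_limit` reason string are assumptions, the cost check is modeled on the `costLimit` setting, and real task execution would be asynchronous.

```typescript
interface Evaluation {
  passed: boolean;
  failures: string[]; // messages from criteria checks, tests, lint, or build
}

interface LoopResult {
  status: "converged" | "escalated";
  iterations: number;
  learnings: string[];
  reason?: "max_iterations" | "cost_limit";
}

// Injected so the loop can be exercised without a real model or toolchain.
interface LoopDeps {
  execute: (task: string, learnings: string[]) => { output: string; cost: number };
  evaluate: (output: string) => Evaluation;
}

function runInnerLoop(
  task: string,
  deps: LoopDeps,
  maxIterations = 5,
  costLimit = 0.5
): LoopResult {
  const learnings: string[] = [];
  let spent = 0;
  for (let i = 1; i <= maxIterations; i++) {
    // Execute with everything learned so far, then evaluate the result.
    const { output, cost } = deps.execute(task, learnings);
    spent += cost;
    const evaluation = deps.evaluate(output);
    if (evaluation.passed) {
      return { status: "converged", iterations: i, learnings };
    }
    // Learning step: carry failures into the next regeneration.
    learnings.push(...evaluation.failures);
    if (spent > costLimit) {
      return { status: "escalated", iterations: i, learnings, reason: "cost_limit" };
    }
  }
  return { status: "escalated", iterations: maxIterations, learnings, reason: "max_iterations" };
}
```

Passing `execute` and `evaluate` as parameters keeps the control flow testable on its own, independent of how the agent or the dev tools are invoked.
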

### Outer Loop

1. Log Analysis: Analyzes task logs from the lookback period
2. Pattern Detection: Identifies failure, iteration, and success patterns
3. Learning Consolidation: Deduplicates, promotes, and prunes learnings
4. Guideline Updates: Auto-applies updates from promoted learnings
5. Proposal Generation: Creates proposals for criteria changes (require approval)
6. Digest Generation: Writes summary report to `.ophan/digests/`

### Pattern Types

- Failure Patterns: Recurring errors (TypeScript, tests, lint)
- Iteration Patterns: Tasks consistently needing multiple iterations
- Success Patterns: Approaches that work well consistently
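
As a rough illustration of how recurring failures might be surfaced from task logs, the sketch below groups failed tasks by error category inside a lookback window and keeps groups that recur often enough. The `TaskLog` shape and the function name are hypothetical. `lookbackDays` and `minOccurrences` mirror the configuration keys shown earlier, and `minConfidence` is left out because its scoring is not described here.

```typescript
// Hypothetical log record; the real log format is not specified in this README.
interface TaskLog {
  id: string;
  timestamp: string; // ISO 8601
  success: boolean;
  errorCategory?: string; // e.g. "typescript", "tests", "lint"
}

interface FailurePattern {
  category: string;
  occurrences: number;
  taskIds: string[];
}

function detectFailurePatterns(
  logs: TaskLog[],
  now: Date,
  lookbackDays = 30,
  minOccurrences = 3
): FailurePattern[] {
  const cutoff = now.getTime() - lookbackDays * 24 * 60 * 60 * 1000;
  const byCategory = new Map<string, string[]>();

  for (const log of logs) {
    if (log.success || !log.errorCategory) continue;          // only categorized failures
    if (new Date(log.timestamp).getTime() < cutoff) continue; // outside the lookback window
    const ids = byCategory.get(log.errorCategory) ?? [];
    ids.push(log.id);
    byCategory.set(log.errorCategory, ids);
  }

  const patterns: FailurePattern[] = [];
  byCategory.forEach((taskIds, category) => {
    if (taskIds.length >= minOccurrences) {
      patterns.push({ category, occurrences: taskIds.length, taskIds });
    }
  });
  // Most frequent patterns first.
  return patterns.sort((a, b) => b.occurrences - a.occurrences);
}
```

Iteration and success patterns could be detected with the same grouping approach applied to different fields of the log record.
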

## Development

```bash
# Install dependencies
npm install

# Run tests
npm test

# Type check
npm run typecheck

# Build
npm run build

# Run CLI locally
npm run dev -- task "your task"
```

## Documentation

For detailed documentation, see:

- Architecture Overview — System design with Mermaid diagrams
- Inner Loop — Per-task execution engine
- Outer Loop — Pattern detection and learning consolidation
- Configuration Reference — Complete config options

## Implementation Phases

- Phase 1A: Core Infrastructure — CLI, config, types, scaffolding
- Phase 1B: Inner Loop — Task execution, Claude API, evaluation, regeneration
- Phase 1C: Outer Loop — Pattern detection, learning consolidation, proposals
- Phase 1D: Escalations — Webhook notifications
- Phase 1E: Polish — Testing, documentation
- Phase 1F: Web UI — Dashboard, config editor, log viewer

Test Coverage: 87 tests passing

## License

MIT