Reduce LLM token usage by up to 50% through intelligent context pruning. Auto-supersede duplicates, manually discard/distill content, and preserve critical state.
npm install @tuanhung303/opencode-acp



Reduce token usage by up to 50% through intelligent context management.
ACP optimizes LLM context windows by automatically pruning obsolete content (tool outputs, messages, and reasoning blocks) while preserving critical operational state.
---
```mermaid
flowchart TB
    subgraph Input["Input Layer"]
        U[User Message]
        T[Tool Outputs]
        M[Assistant Messages]
        R[Thinking Blocks]
    end
    subgraph Processing["ACP Processing"]
        direction TB
        Auto["Auto-Supersede"]
        Manual["Manual Pruning"]
        subgraph AutoStrategies["Auto-Supersede Strategies"]
            H["Hash-Based<br/>Duplicates"]
            F["File-Based<br/>Operations"]
            Todo["Todo-Based<br/>Updates"]
            URL["Source-URL<br/>Fetches"]
            SQ["State Query<br/>Dedup"]
        end
        subgraph ManualTools["Manual Tools"]
            D[Discard]
            Dist[Distill]
        end
    end
    subgraph Output["Optimized Context"]
        Clean["Clean Context<br/>~50% smaller"]
        L[LLM Provider]
    end
    U --> Processing
    T --> Auto
    M --> Manual
    R --> Manual
    Auto --> H
    Auto --> F
    Auto --> Todo
    Auto --> URL
    Auto --> SQ
    Manual --> D
    Manual --> Dist
    H --> Clean
    F --> Clean
    Todo --> Clean
    URL --> Clean
    SQ --> Clean
    D --> Clean
    Dist --> Clean
    Clean --> L
    style Input fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    style Processing fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    style AutoStrategies fill:#fff3e0,stroke:#e65100,stroke-width:1px
    style ManualTools fill:#e8f5e9,stroke:#1b5e20,stroke-width:1px
    style Output fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
```
---
```bash
npm install @tuanhung303/opencode-acp
```

Add to your OpenCode config:

```jsonc
// opencode.jsonc
{
  "plugin": ["@tuanhung303/opencode-acp@latest"],
}
```
ACP handles most pruning automatically. The following tools give agents granular control over context:
```typescript
// Discard completed work
context_prune({ action: "discard", targets: [["a1b2c3"]] })

// Distill large outputs
context_prune({
  action: "distill",
  targets: [["d4e5f6", "Found 15 TypeScript files"]],
})

// Batch operations
context_prune({
  action: "discard",
  targets: [["hash1"], ["hash2"], ["hash3"]],
})
```
---
| Document | Purpose |
| --------------------------------------------------------------- | ------------------------------------------ |
| Validation Guide | 43 comprehensive test cases |
| Test Harness | Ready-to-run test scripts |
| Todo Write Testing Guide | Testing todowrite & stuck task detection |
| Context Architecture | Memory management strategies |
| Decision Tree | Visual pruning flowcharts |
| Limitations Report | What cannot be pruned |
| Changelog | Version history and migration guides |
---
ACP provides the `context_prune` tool for intelligent context management:

```typescript
context_prune({
  action: "discard" | "distill" | "replace",
  targets: [string, string?, string?][] // Format depends on action
})
```
| Type | Format | Example |
| ------------------- | --------------------------- | ---------------------------------------------- |
| Tool outputs | 6 hex chars | `44136f`, `01cb91` |
| Thinking blocks | 6 hex chars | `abc123` |
| Messages | 6 hex chars | `def456` |
| Pattern replace | `[start, end, replacement]` | `["Start marker:", "End marker.", "[pruned]"]` |
```typescript
// Prune multiple items at once
context_prune({
  action: "discard",
  targets: [
    ["44136f"], // Tool output
    ["abc123"], // Thinking block
    ["def456"], // Message
  ],
})

// Distill with shared summary
context_prune({
  action: "distill",
  targets: [
    ["44136f", "Research phase complete"],
    ["01cb91", "Research phase complete"],
  ],
})

// Pattern replace - replace content between markers
context_prune({
  action: "replace",
  targets: [
    ["Detailed findings from analysis:", "End of detailed findings.", "[analysis complete]"],
    ["Debug output started:", "Debug output ended.", "[debug pruned]"],
  ],
})
```
- Matched content must be ≥ 30 characters
- Start OR end pattern must be > 15 characters
- Literal matching only (no regex)
- Exactly one match per pattern
- No overlapping patterns
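These rules could be enforced by a validator along these lines — a hypothetical sketch of the documented constraints, not ACP's actual code:

```typescript
// Hypothetical validator for the pattern-replace constraints above.
// This is NOT ACP's implementation; it only illustrates the documented rules.
type ReplaceTarget = [start: string, end: string, replacement: string];

function validateReplaceTarget(content: string, [start, end]: ReplaceTarget): string | null {
  // Start OR end pattern must be > 15 characters
  if (start.length <= 15 && end.length <= 15) return "pattern too short";

  // Literal matching only: each marker must occur exactly once
  const count = (s: string, needle: string) => s.split(needle).length - 1;
  if (count(content, start) !== 1) return "start must match exactly once";
  if (count(content, end) !== 1) return "end must match exactly once";

  const from = content.indexOf(start);
  const to = content.indexOf(end);
  if (to < from) return "end marker precedes start marker";

  // Matched span must be >= 30 characters
  const matched = content.slice(from, to + end.length);
  if (matched.length < 30) return "matched span shorter than 30 chars";

  return null; // valid
}
```

Running this against the pattern-replace example from the previous section returns `null` (valid), while a too-short marker pair is rejected.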
---
ACP automatically removes redundant content through multiple strategies:
Duplicate tool calls with identical arguments are automatically deduplicated.
```
BEFORE:                                AFTER:
1. read(package.json)  #a1b2c3         ...other work...
2. ...other work...                    read(package.json)  #d4e5f6
3. read(package.json)  #d4e5f6
                                       First call superseded (hash match)
Tokens: ~15,000                        Tokens: ~10,000 (-33%)
```
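The supersede behavior above boils down to keying each call by a hash of its name and arguments and keeping only the latest entry per key. A minimal illustration of the assumed mechanics (not ACP's internals):

```typescript
// Sketch of hash-based supersede: identical tool calls share a key,
// and only the latest output survives. Assumed mechanics, not ACP source.
import { createHash } from "node:crypto";

type ToolCall = { tool: string; args: unknown; output: string };

function callKey(call: ToolCall): string {
  return createHash("sha256")
    .update(call.tool + JSON.stringify(call.args))
    .digest("hex")
    .slice(0, 6); // short hex id, like the hashes shown in ACP output
}

function supersedeByHash(history: ToolCall[]): ToolCall[] {
  const latest = new Map<string, ToolCall>();
  for (const call of history) latest.set(callKey(call), call); // later wins
  // Keep original order, retaining only the surviving (latest) entries
  return history.filter((call) => latest.get(callKey(call)) === call);
}
```

With the three-call history from the diagram, the first `read(package.json)` is dropped and the later one kept.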
File operations automatically supersede previous operations on the same file.
```
BEFORE:                                AFTER:
1. read(config.ts)                     edit(config.ts)
2. write(config.ts)
3. edit(config.ts)                     Previous operations pruned
Tokens: ~18,000                        Tokens: ~6,000 (-67%)
```
Todo operations automatically supersede previous todo states.
```
BEFORE:                                AFTER:
1. todowrite: pending                  todowrite: completed
2. todowrite: in_progress
3. todowrite: completed                Previous states auto-pruned
Tokens: ~4,500                         Tokens: ~1,500 (-67%)
```
Identical URL fetches are deduplicated; only the latest response is retained.
State queries (ls, find, pwd, git status) are deduplicated; only the latest results matter.
New context_prune tool calls supersede previous context operations, preventing context management overhead from accumulating.
Only the latest snapshot per file is retained. Previous snapshots are automatically pruned.
Failed tool attempts are automatically removed when the operation succeeds on retry.
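Retry pruning can be illustrated with a small sketch, under the assumed semantics that once an operation succeeds, identical earlier failed attempts are dropped (not ACP's actual code):

```typescript
// Sketch of retry pruning: failed attempts at an operation are removed
// once the same operation has succeeded. Assumed semantics, not ACP source.
type Attempt = { op: string; ok: boolean; output: string };

function pruneFailedRetries(history: Attempt[]): Attempt[] {
  // Collect operations that eventually succeeded
  const succeeded = new Set(history.filter((a) => a.ok).map((a) => a.op));
  // Keep successes, and failures only for operations that never succeeded
  return history.filter((a) => a.ok || !succeeded.has(a.op));
}
```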
---
These tools are exempt from pruning to ensure operational continuity:
```
context_info, task, todowrite, todoread, context_prune, batch, write, edit, plan_enter, plan_exit
```
Additional tools can be protected via configuration:
```jsonc
{
  "commands": {
    "protectedTools": ["my_custom_tool"],
  },
}
```
---
ACP uses its own config file with multiple levels:
```
Priority: Defaults → Global → Config Dir → Project
```

- Global: `~/.config/opencode/acp.jsonc`
- Config Dir: `$OPENCODE_CONFIG_DIR/acp.jsonc`
- Project: `.opencode/acp.jsonc`
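The priority chain amounts to a last-wins merge over the four layers. A simplified sketch (shallow merge for illustration; the real plugin may merge nested objects and arrays such as `protectedTools` differently):

```typescript
// Sketch of layered config resolution: later layers override earlier ones.
// Shallow merge only - an illustration of the priority order, not ACP source.
type Config = Record<string, unknown>;

function resolveConfig(...layers: Config[]): Config {
  // Priority: Defaults -> Global -> Config Dir -> Project (last wins)
  return layers.reduce((acc, layer) => ({ ...acc, ...layer }), {});
}
```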
```jsonc
{
  "$schema": "https://raw.githubusercontent.com/tuanhung303/opencode-agent-context-pruning/master/acp.schema.json",
  "enabled": true,
  "debug": false,
  "pruneNotification": "minimal",
  "commands": {
    "enabled": true,
    "protectedTools": [], // Additional tools to protect (merged with defaults)
  },
  "protectedFilePatterns": [
    "**/.env",
    "**/.env.*",
    "**/credentials.json",
    "**/secrets.json",
    "**/*.pem",
    "**/*.key",
    "**/package.json",
    "**/tsconfig.json",
    "**/pyproject.toml",
    "**/Cargo.toml",
  ],
  "tools": {
    "settings": {
      "protectedTools": [], // Merged with built-in protected tools
      "enableAssistantMessagePruning": true,
      "enableReasoningPruning": true,
      "enableVisibleAssistantHashes": true,
    },
    "discard": { "enabled": true },
    "distill": { "enabled": true, "showDistillation": false },
    "todoReminder": {
      "enabled": true,
      "initialTurns": 5,
      "repeatTurns": 4,
      "stuckTaskTurns": 12,
    },
    "automataMode": { "enabled": true, "initialTurns": 8 },
  },
  "strategies": {
    "purgeErrors": { "enabled": false, "turns": 4 },
    "aggressivePruning": {
      // All enabled by default - see Aggressive Pruning section
    },
  },
}
```
All aggressive pruning options are enabled by default for up to 50% token savings.
#### Pruning Presets
Use presets for quick configuration:
```jsonc
{
  "strategies": {
    "aggressivePruning": {
      "preset": "balanced", // Options: "compact", "balanced", "verbose"
    },
  },
}
```
| Preset | Description | Use Case |
| ------------ | ----------------------------------------- | -------------------------------- |
| compact | Maximum cleanup, all options enabled | Long sessions, token-constrained |
| balanced | Good defaults, preserves user code blocks | Most use cases (default) |
| verbose | Minimal cleanup, preserves everything | Debugging, audit trails |
#### Individual Options
Override preset values with individual flags:
```jsonc
{
  "strategies": {
    "aggressivePruning": {
      "preset": "balanced",
      "pruneToolInputs": true, // Strip verbose inputs on supersede
      "pruneStepMarkers": true, // Remove step markers entirely
      "pruneSourceUrls": true, // Dedup URL fetches
      "pruneFiles": true, // Mask file attachments
      "pruneSnapshots": true, // Keep only latest snapshot
      "pruneRetryParts": true, // Prune failed retries on success
      "pruneUserCodeBlocks": false, // Keep user code blocks (balanced default)
      "truncateOldErrors": false, // Keep full errors (balanced default)
      "aggressiveFilePrune": true, // One-file-one-view
      "stateQuerySupersede": true, // Dedup state queries (ls, git status)
    },
  },
}
```
---
| Metric | Without ACP | With ACP | Savings |
| ------------------- | ------------ | ----------- | ------- |
| Typical Session | ~80k tokens | ~40k tokens | 50% |
| Long Session | ~150k tokens | ~75k tokens | 50% |
| File-Heavy Work | ~100k tokens | ~35k tokens | 65% |
Cache Impact: ~65% cache hit rate with ACP vs ~85% without. The token savings typically outweigh the cache miss cost, especially in long sessions.
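To see why, compare effective input cost under an illustrative pricing assumption — cached input tokens billed at 10% of the uncached rate (actual provider pricing varies):

```typescript
// Illustrative cost model. The 10% cached-token discount is an assumption
// for the sketch, not ACP data or any specific provider's pricing.
function inputCost(tokens: number, cacheHitRate: number, cachedDiscount = 0.1): number {
  const cached = tokens * cacheHitRate;
  const uncached = tokens - cached;
  // Cost expressed in "full-price token" units
  return uncached + cached * cachedDiscount;
}

// Without ACP: ~80k tokens at ~85% cache hit -> ~18,800 units
const withoutAcp = inputCost(80_000, 0.85);
// With ACP: ~40k tokens at ~65% cache hit -> ~16,600 units
const withAcp = inputCost(40_000, 0.65);
```

Under these assumptions the pruned session is still cheaper despite the lower cache hit rate, and the gap widens as sessions grow.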
---
Run the comprehensive test suite:
```typescript
// Load test todos
todowrite({ /* copy from docs/VALIDATION_GUIDE.md */ })
```
See Validation Guide for detailed test procedures.
---
## Pruning Workflow
Complete example: execute tool → find hash → prune.

Step 1: Run a tool

```typescript
read({ filePath: "src/config.ts" })
// Output includes: a1b2c3
```

Step 2: Find the hash in the output

```
... file contents ...
a1b2c3
```

Step 3: Prune when no longer needed

```typescript
context_prune({ action: "discard", targets: [["a1b2c3"]] })
// Response confirms the discarded read output
// Available: Tools(5), Messages(2), Reasoning(1)
```

Batch multiple targets:

```typescript
context_prune({ action: "discard", targets: [["a1b2c3"], ["d4e5f6"], ["g7h8i9"]] })
```

Distill with summary:

```typescript
context_prune({
  action: "distill",
  targets: [["abc123", "Auth: chose JWT over sessions"]],
})
```

---
## Architecture Overview
```mermaid
flowchart TD
    subgraph OpenCode["OpenCode Core"]
        direction TB
        A[User Message] --> B[Session]
        B --> C[Transform Hook]
        C --> D[toModelMessages]
        D --> E[LLM Provider]
    end
    subgraph ACP["ACP Plugin"]
        direction TB
        C --> F[syncToolCache]
        F --> G[injectHashes]
        G --> H[Apply Strategies]
        H --> I[prune]
        I --> C
    end
    style OpenCode fill:#F4F7F9,stroke:#5A6B8A,stroke-width:1.5px
    style ACP fill:#E8F5F2,stroke:#9AC4C0,stroke-width:1.5px
```

ACP hooks into OpenCode's message flow to reduce context size before sending to the LLM:
1. Sync Tool Cache - Updates internal tool state tracking
2. Inject Hashes - Makes content addressable for pruning
3. Apply Strategies - Runs auto-supersede mechanisms
4. Prune - Applies manual and automatic pruning rules
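The four stages above can be pictured as a simple pipeline of message transforms. The stage bodies here are placeholders, not ACP source — only the composition pattern and ordering are the point:

```typescript
// Sketch of the four-stage transform pipeline. Stage names come from the
// list above; the implementations are placeholders, not ACP internals.
type Message = { role: string; content: string };
type Stage = (msgs: Message[]) => Message[];

const pipeline = (stages: Stage[]): Stage =>
  (msgs) => stages.reduce((acc, stage) => stage(acc), msgs);

// Placeholder stages in the documented order
const syncToolCache: Stage = (m) => m;    // 1. update tool state tracking
const injectHashes: Stage = (m) => m;     // 2. make content addressable
const applyStrategies: Stage = (m) => m;  // 3. run auto-supersede
const prune: Stage = (m) =>               // 4. drop pruned content
  m.filter((msg) => msg.content !== "[pruned]");

const transform = pipeline([syncToolCache, injectHashes, applyStrategies, prune]);
```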
---
## Commands

| Command | Description |
| ------------ | ------------------------------- |
| `/acp` | Show ACP statistics and version |
| `/acp stats` | Show ACP statistics and version |

---
## Advanced Features

### Todo Reminder

Monitors `todowrite` usage and prompts when tasks are neglected:

```jsonc
{
  "tools": {
    "todoReminder": {
      "enabled": true,
      "initialTurns": 8, // First reminder after 8 turns without todo update
      "repeatTurns": 4, // Subsequent reminders every 4 turns
      "stuckTaskTurns": 12, // Threshold for stuck task detection
    },
  },
}
```
Reminder Behavior:

- First reminder: fires after `initialTurns` (8) turns without `todowrite`
- Repeat reminders: fire every `repeatTurns` (4) turns thereafter
- Auto-reset: each `todowrite` call resets the counter to 0
- Deduplication: only ONE reminder exists in context at a time; new reminders replace old ones
- Stuck task detection: tasks in `in_progress` for `stuckTaskTurns` (12) turns are flagged with guidance
- Prunable outputs: the reminder lists prunable tool outputs to help with cleanup

Reminder Sequence:
```
Turn 0:  todowrite() called (resets counter)
Turn 8:  First reminder (if no todowrite since turn 0)
Turn 12: Repeat reminder
Turn 16: Repeat reminder
...
```
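The schedule reduces to a small predicate — a sketch of the documented behavior, with `initialTurns` and `repeatTurns` as parameters (not ACP's actual implementation):

```typescript
// Sketch of the reminder schedule: first reminder after initialTurns,
// then every repeatTurns; todowrite resets the counter externally.
// Assumed logic mirroring the documented sequence, not ACP source.
function shouldRemind(
  turnsSinceTodoWrite: number,
  initialTurns = 8,
  repeatTurns = 4,
): boolean {
  if (turnsSinceTodoWrite < initialTurns) return false;
  return (turnsSinceTodoWrite - initialTurns) % repeatTurns === 0;
}
```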
### Automata Mode
Autonomous reflection triggered by the "automata" keyword:

```jsonc
{
  "tools": {
    "automataMode": {
      "enabled": true,
      "initialTurns": 8, // Turns before first reflection
    },
  },
}
```
### Stuck Task Detection

Identifies tasks stuck in `in_progress` for too long:

```jsonc
{
  "tools": {
    "todoReminder": {
      "stuckTaskTurns": 12, // Threshold for stuck detection
    },
  },
}
```
---

## Limitations

- Subagents: ACP is disabled for subagent sessions
- Cache invalidation: pruning mid-conversation invalidates prompt caches
- Protected tools: some tools cannot be pruned by design
---
## Troubleshooting

### 400 errors in thinking mode

Cause: using Anthropic/DeepSeek/Kimi thinking mode with an outdated ACP version or missing reasoning sync.

Fix:

1. Update to ACP v3.0.0+: `npm install @tuanhung303/opencode-acp@latest`
2. Ensure your config has thinking-compatible settings
3. See Thinking Mode Compatibility for details
### Plugin commands not recognized

Symptoms: commands like `/acp` return "Unknown command"

Fix:

1. Verify the plugin is in `opencode.jsonc`: `"plugin": ["@tuanhung303/opencode-acp@latest"]`
2. Run `npm run build && npm link` in the plugin directory
3. Restart OpenCode
### Token savings lower than expected

Check:

- Is aggressive pruning enabled in config? See Configuration
- Are you using protected tools excessively? (`task`, `write`, `edit` can't be pruned)
- Is your session >100 turns? Consider starting a fresh session

---
## Provider Compatibility

### Thinking Mode Compatibility

ACP is fully compatible with extended thinking mode APIs that require the `reasoning_content` field. The `context_prune` tool automatically syncs reasoning content to prevent 400 Bad Request errors.

Supported providers: Anthropic, DeepSeek, Kimi
Not required: OpenAI, Google

See the detailed technical documentation for implementation details and the root cause of the original compatibility issue.
---
## npm Package

Package: `@tuanhung303/opencode-acp`
License: MIT
Repository: https://github.com/tuanhung303/opencode-agent-context-pruning

### Installation Methods

```bash
# Via npm
npm install @tuanhung303/opencode-acp

# Via OpenCode config
# Add to opencode.jsonc: "plugin": ["@tuanhung303/opencode-acp@latest"]

# Via URL (for agents)
curl -s https://raw.githubusercontent.com/tuanhung303/opencode-acp/master/README.md
```
### CI/CD

- CI: every PR triggers linting, type checking, and unit tests
- CD: merges to `main` auto-publish to npm

---
## Contributing

1. Fork the repository
2. Create a feature branch
3. Run tests: `npm test`
4. Submit a pull request

---

## License

MIT © tuanhung303
---
## Known Pitfalls for Agents

> Read this section before modifying ACP code. These are hard-won lessons from debugging production issues.
### Always fetch messages before early returns

❌ WRONG:

```typescript
async function executeContextToolDiscard(ctx, toolCtx, hashes) {
  const { state, logger } = ctx
  // Validate hashes...
  if (validHashes.length === 0) {
    // Early return without fetching messages
    const currentParams = getCurrentParams(state, [], logger) // BUG: empty array
    return "No valid hashes"
  }
  // Only fetch messages in success path
  const messages = await client.session.messages(...)
}
```
✅ CORRECT:

```typescript
async function executeContextToolDiscard(ctx, toolCtx, hashes) {
  const { client, state, logger } = ctx
  // ALWAYS fetch messages first - required for thinking mode API compatibility
  const messagesResponse = await client.session.messages({
    path: { id: toolCtx.sessionID },
  })
  const messages = messagesResponse.data || messagesResponse
  // ALWAYS initialize session - syncs reasoning_content
  await ensureSessionInitialized(client, state, toolCtx.sessionID, logger, messages)
  // Now validate hashes...
  if (validHashes.length === 0) {
    const currentParams = getCurrentParams(state, messages, logger) // Use actual messages
    return "No valid hashes"
  }
}
```

Why? Anthropic's thinking mode API requires `reasoning_content` on all assistant messages with tool calls. Skipping `ensureSessionInitialized` causes 400 errors.

---
### ensureSessionInitialized is mandatory

This function syncs `reasoning_content` from message parts to `msg.info`. Without it:

```
error, status code: 400, message: thinking is enabled but reasoning_content is missing
in assistant tool call message at index 2
```

Rule: call `ensureSessionInitialized` at the START of every `context_prune` tool function, before any early returns.

---
### Never fully remove reasoning_content

❌ WRONG:

```typescript
// Completely removing reasoning_content breaks the API
state.prune.reasoningPartIds.push(partId)
// No replacement content -> field removed -> API error
```

✅ CORRECT:

```typescript
// Convert discard to distill with a minimal placeholder
if (reasoningHashes.length > 0) {
  const minimalSummaries = reasoningHashes.map(() => "…")
  await executeContextReasoningDistill(
    ctx,
    toolCtx,
    reasoningHashes.map((h, i) => [h, minimalSummaries[i]]),
  )
}
```

Why? The `reasoning_content` field must exist (even if minimal) for thinking mode providers.

---
### Test all code paths

Always test `context_prune` tool changes with:

1. Valid existing hashes (success path)
2. Non-existing hashes like `"zzzzzz"` (no-op path)
3. A mix of valid and invalid hashes

The no-op path is where most bugs hide because it's less tested.
---
### Provider compatibility matrix

| Provider | Thinking Mode | reasoning_content Required | Notes |
| --------- | ----------------- | -------------------------- | -------------------- |
| Anthropic | Extended thinking | Yes | Strict validation |
| DeepSeek | DeepThink | Yes | Similar to Anthropic |
| Kimi | K1 thinking | Yes | Similar to Anthropic |
| OpenAI | N/A | No | No thinking mode |
| Google | N/A | No | No thinking mode |

Always test with Anthropic when modifying context/pruning logic; it has the strictest API validation.
---
### Transform hook timing

```
User Message → Transform Hook → ensureReasoningContentSync → API Call
                     ↑
         Context tool runs HERE
```

If a `context_prune` tool returns early (no-op), the transform hook may not have synced `reasoning_content` yet. That's why each tool function must call `ensureSessionInitialized` itself.

---
### Debugging checklist

When you see `reasoning_content is missing` errors:

1. Check the error index: "at index 2" means the 3rd message in the array
2. Verify message fetch: is the function fetching messages before early returns?
3. Verify session init: is `ensureSessionInitialized` called in ALL code paths?
4. Check the no-op path: does the early return path have proper initialization?
5. Test with an invalid hash: `context_prune({ action: "discard", targets: [["zzzzzz"]] })`

---
## For LLM Agents

Quick pruning reference and safety guidelines. If you're an AI assistant reading this:

> Quick Pruning Reference:
>
> - Light pressure (<50%): no action
> - Moderate (50-75%): prune old tool outputs using `context_prune({ action: "discard", targets: [[hash]] })`

Ask the user: "Would you like me to run a context optimization check?"