# ccprune

Prune early messages from Claude Code session transcripts with AI summarization.
Next time your Claude Code context is running low, just quit Claude Code and run `npx ccprune` - it resumes you back into your last thread, now compacted with an intelligent rolling summary. Run it again the next time context runs low; the summaries stack, so context just keeps rolling forward.
> Fork of claude-prune with enhanced features: token-based pruning, AI summarization enabled by default, and improved UX.
- Zero-Config Default: Just run ccprune - auto-detects latest session, keeps 55K tokens (~70K after Claude Code adds system context)
- Token-Based Pruning: Prunes based on actual token count, not message count
- Smart Threshold: Automatically skips pruning if session is under 55K tokens
- AI Summarization: Automatically generates a summary of pruned content (enabled by default)
- Summary Synthesis: Re-pruning synthesizes old summary + new pruned content into one cohesive summary
- Small Session Warning: Prompts for confirmation when auto-selecting sessions with < 5 messages
- Safe by Default: Always preserves session summaries and metadata
- Auto Backup: Creates timestamped backups before modifying files
- Restore Support: Easily restore from backups with the restore command
- Dry-Run Preview: Preview changes and summary before committing
## Installation

```bash
# Using npx (Node.js)
npx ccprune
```

```bash
# Using npm
npm install -g ccprune

# Using bun
bun install -g ccprune
```

## Quick Start
1. Quit Claude Code - press `Ctrl+C` or type `/quit`
2. Run prune from the same project directory:
```bash
npx ccprune
```

That's it! ccprune auto-detects your latest session, prunes old messages (keeping a summary), and resumes automatically.
## Setup (Recommended)
For fast, high-quality summarization, set up a Gemini API key:
1. Get a free key from Google AI Studio
2. Add to your shell profile (~/.zshrc or ~/.bashrc):
```bash
export GEMINI_API_KEY=your_key
```
3. Restart your terminal or run `source ~/.zshrc`

With `GEMINI_API_KEY` set, ccprune automatically uses Gemini 2.5 Flash for fast summarization without chunking.

Note: If `GEMINI_API_KEY` is not set, ccprune automatically falls back to Claude Code CLI for summarization (no additional setup required).

## Usage
```bash
# Zero-config: auto-detects latest session, keeps 55K tokens
ccprune

# Pick from available sessions interactively
ccprune --pick

# Explicit session ID (if you need a specific session)
ccprune <sessionId>

# Explicit token limit
ccprune --keep 55000
ccprune --keep-tokens 80000

# Subcommands
ccprune restore [--dry-run]
```

### Arguments
- `sessionId`: (Optional) UUID of the Claude Code session. Auto-detects the latest if omitted.

### Subcommands
| Subcommand | Description |
|------------|-------------|
| `restore` | Restore a session from the latest backup |
| `restore --dry-run` | Preview restore without making changes |

### Options
| Option | Description |
|--------|-------------|
| `--pick` | Interactively select from available sessions |
| `-n, --no-resume` | Skip automatic session resume |
| `--yolo` | Resume with `--dangerously-skip-permissions` |
| `--resume-model` | Model for resumed session (opus, sonnet, haiku, opusplan) |
| `-k, --keep` | Number of tokens to retain (default: 55000) |
| `--keep-tokens` | Number of tokens to retain (alias for `-k`) |
| `--dry-run` | Preview changes and summary without modifying files |
| `--no-summary` | Skip AI summarization of pruned messages |
| `--summary-model` | Model for summarization (haiku, sonnet, or full name) |
| `--summary-timeout` | Timeout for summarization in milliseconds (default: 360000) |
| `--gemini` | Use Gemini 3 Pro for summarization |
| `--gemini-flash` | Use Gemini 2.5 Flash for summarization |
| `--claude-code` | Use Claude Code CLI for summarization (chunks large transcripts) |
| `--prune-tools` | Replace all non-protected tool outputs with placeholders |
| `--prune-tools-ai` | Use AI to identify which tool outputs to prune |
| `--prune-tools-dedup` | Deduplicate identical tool calls, keep only the most recent |
| `--prune-tools-max` | Maximum savings: dedup + AI analysis combined |
| `--prune-tools-keep` | Comma-separated tools to never prune (default: Edit,Write,TodoWrite,TodoRead,AskUserQuestion) |
| `-h, --help` | Show help information |
| `-V, --version` | Show version number |

If no session ID is provided, ccprune auto-detects the most recently modified session. If no keep option is specified, it defaults to 55,000 tokens (~70K actual context after Claude Code adds the system prompt and CLAUDE.md).
Summarization priority (see the sketch below):

1. `--claude-code` flag: Force Claude Code CLI (chunks transcripts >30K chars)
2. `--gemini` or `--gemini-flash` flags: Use Gemini API
3. Auto-detect: If `GEMINI_API_KEY` is set, uses Gemini 2.5 Flash
4. Fallback: Claude Code CLI (no API key needed)
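The same priority order, as a small TypeScript sketch. This is purely illustrative: `pickSummarizationBackend` and the `SummaryFlags` fields are hypothetical names, not ccprune's internal API.

```typescript
type Backend = "claude-code" | "gemini-pro" | "gemini-flash";

interface SummaryFlags {
  claudeCode?: boolean;  // --claude-code
  gemini?: boolean;      // --gemini
  geminiFlash?: boolean; // --gemini-flash
}

// Resolve the summarization backend in the documented priority order.
function pickSummarizationBackend(flags: SummaryFlags, env = process.env): Backend {
  if (flags.claudeCode) return "claude-code";    // 1. explicit --claude-code wins
  if (flags.gemini) return "gemini-pro";         // 2. explicit Gemini flags
  if (flags.geminiFlash) return "gemini-flash";
  if (env.GEMINI_API_KEY) return "gemini-flash"; // 3. auto-detect via API key
  return "claude-code";                          // 4. fallback, no API key needed
}
```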
### Examples

```bash
# Simplest: auto-detect, prune, and resume automatically
npx ccprune

# Prune only (don't resume)
npx ccprune -n

# Resume in yolo mode (--dangerously-skip-permissions)
npx ccprune --yolo

# Resume with a specific model (e.g., Opus 4.5)
npx ccprune --resume-model opus

# Combine yolo mode with Opus
npx ccprune --yolo --resume-model opus

# Pick from available sessions interactively
npx ccprune --pick

# Keep 55K tokens (default)
npx ccprune --keep 55000

# Keep 80K tokens (less aggressive pruning)
npx ccprune --keep-tokens 80000

# Preview what would be pruned (shows summary preview too)
npx ccprune --dry-run

# Skip summarization for faster pruning
npx ccprune --no-summary

# Use Claude Code CLI with haiku model (faster/cheaper)
npx ccprune --claude-code --summary-model haiku

# Use Gemini 3 Pro for summarization
npx ccprune --gemini

# Use Gemini 2.5 Flash (default when GEMINI_API_KEY is set)
npx ccprune --gemini-flash

# Force Claude Code CLI for summarization
npx ccprune --claude-code

# Target a specific session by ID
npx ccprune 03953bb8-6855-4e53-a987-e11422a03fc6 --keep 55000

# Restore from the latest backup
npx ccprune restore 03953bb8-6855-4e53-a987-e11422a03fc6
```

## Tool Output Pruning (Default)
Tool pruning runs automatically to reduce tokens before summarization:
1. Dedup: Identical tool calls are deduplicated (keeps only most recent)
2. AI analysis: Intelligently prunes irrelevant outputs using your summarization backend
```bash
# Default behavior (dedup + AI) - runs automatically
ccprune

# Disable automatic tool pruning
ccprune --skip-tool-pruning

# Explicit modes for specific behavior:
ccprune --prune-tools        # Simple: replace ALL outputs (no AI)
ccprune --prune-tools-dedup  # Dedup only (no AI)
ccprune --prune-tools-ai     # AI only (no dedup)
ccprune --prune-tools-max    # Explicit dedup + AI (same as default)

# Custom protected tools
ccprune --prune-tools-keep "Edit,Write,Bash"
```

Protected tools (never pruned by default):
- `Edit`, `Write` - file modification context
- `TodoWrite`, `TodoRead` - task tracking
- `AskUserQuestion` - user interaction

Modes explained:

- Default (no flags): Runs dedup first (free), then AI analysis - maximum savings
- Simple (`--prune-tools`): Replaces all non-protected tool outputs with `[Pruned: {tool} output - {bytes} bytes]`
- AI (`--prune-tools-ai`): Uses your summarization backend (Gemini or Claude Code CLI) to intelligently identify which outputs are no longer relevant
- Dedup (`--prune-tools-dedup`): Keeps only the most recent output when the same tool is called with identical input, annotating it with `[{total} total calls]` (see the sketch below)
- Skip (`--skip-tool-pruning`): Disables automatic tool pruning entirely
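A minimal TypeScript sketch of the dedup idea described above, assuming a simplified tool-call shape; `dedupToolCalls` and `ToolCall` are hypothetical names, not ccprune's actual implementation.

```typescript
interface ToolCall {
  tool: string;
  input: string;  // serialized tool input
  output: string;
}

// Keep only the most recent output for identical (tool, input) pairs and
// annotate the survivor with how many calls it replaces.
function dedupToolCalls(calls: ToolCall[]): ToolCall[] {
  const byKey = new Map<string, { last: ToolCall; total: number }>();
  for (const call of calls) {
    const key = `${call.tool}\u0000${call.input}`;
    const prev = byKey.get(key);
    byKey.set(key, { last: call, total: (prev?.total ?? 0) + 1 });
  }
  return [...byKey.values()].map(({ last, total }) =>
    total > 1 ? { ...last, output: `${last.output} [${total} total calls]` } : last
  );
}
```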
## How It Works

```
BEFORE AFTER FIRST PRUNE AFTER RE-PRUNE
────── ──────────────── ──────────────
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ msg 1 (old) │─┐ │ [SUMMARY] │─┐ │ [NEW SUMMARY] │ ◄─ synthesized
│ msg 2 (old) │ │ │ "Previously.."│ │ │ (old+middle) │
│ ... │ ├──► ├───────────────┤ │ ├───────────────┤
│ msg N (old) │─┘ │ msg N+1 (kept)│ ├───────► │ msg X (kept) │
├───────────────┤ │ msg N+2 (kept)│ │ │ msg Y (kept) │
│ msg N+1 (new) │─────►│ msg N+3 (kept)│─┘ │ msg Z (kept) │
│ msg N+2 (new) │ └───────────────┘ └───────────────┘
│ msg N+3 (new) │
└───────────────┘ ▲ ▲
│ │
old msgs become old summary + middle
summary, recent kept synthesized, recent kept
```

1. Locates Session File: Finds `$CLAUDE_CONFIG_DIR/projects/{project-path}/{sessionId}.jsonl`
2. Counts Tokens: Uses Claude's cumulative usage data from the last message: input_tokens + cache_read_input_tokens + cache_creation_input_tokens. This matches Claude Code's UI display exactly
3. Early Exit: If total tokens ≤ threshold (55K default), skips pruning and auto-resumes
4. Preserves Critical Data: Always keeps the first line (file-history-snapshot or session metadata)
5. Token-Based Cutoff: Scans right-to-left, accumulating tokens until adding the next message would exceed the threshold (see the sketch after this list)
6. Content Extraction: Extracts text from messages, including tool_result outputs and thinking blocks. Tool calls become [Used tool: ToolName] placeholders to provide context without verbose tool I/O
7. Orphan Cleanup: Removes tool_result blocks in kept messages that reference tool_use blocks from pruned messages
8. AI Summarization: Generates a structured summary with sections: Overview, What Was Accomplished, Files Modified, Key Technical Details, Current State & Pending Work
9. Summary Synthesis: Re-pruning synthesizes old summary + new pruned content into one cohesive summary
- Gemini (default with API key): Handles large transcripts natively without chunking
- Claude Code CLI (fallback): May chunk transcripts >30K characters (see Claude Code CLI Summarization below)
10. Safe Backup: Creates timestamped backup in prune-backup/ before modifying
11. Auto-Resume: Optionally resumes the Claude Code session after pruning
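A compact TypeScript sketch of steps 2 and 5 (token counting and the right-to-left cutoff). It is an illustration under simplifying assumptions: the `SessionMessage` shape and the per-message `estimatedTokens` field are hypothetical, not ccprune's internals.

```typescript
interface SessionMessage {
  estimatedTokens: number;  // per-message estimate (assumed precomputed)
  usage?: {
    input_tokens: number;
    cache_read_input_tokens: number;
    cache_creation_input_tokens: number;
  };
}

// Step 2: total context as Claude Code's UI reports it, from the last message's usage.
function totalContextTokens(messages: SessionMessage[]): number {
  const u = messages[messages.length - 1]?.usage;
  return u
    ? u.input_tokens + u.cache_read_input_tokens + u.cache_creation_input_tokens
    : 0;
}

// Step 5: scan right-to-left, accumulating tokens; everything before the cutoff is pruned.
// (ccprune's lenient boundary additionally keeps one extra message at the cutoff.)
function cutoffIndex(messages: SessionMessage[], keepTokens: number): number {
  let kept = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (kept + messages[i].estimatedTokens > keepTokens) return i + 1;
    kept += messages[i].estimatedTokens;
  }
  return 0; // everything fits; nothing to prune
}
```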
## Claude Code CLI Summarization

When using the `--claude-code` flag (or when `GEMINI_API_KEY` is not set), ccprune uses the Claude Code CLI for summarization with these specific behaviors:

Chunking for Large Transcripts (see the sketch below):
- Transcripts >30,000 characters are automatically split into chunks
- Each chunk is summarized independently
- Chunk summaries are then combined into a final unified summary
- Why: Ensures reliable summarization even for very long sessions
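A rough TypeScript sketch of this chunk-then-combine flow. The 30,000-character threshold comes from the behavior described above, but the fixed-size splitting and the `summarize` helper are assumptions for illustration.

```typescript
// Hypothetical helper that asks the Claude Code CLI to summarize one piece of text.
declare function summarize(text: string): Promise<string>;

const CHUNK_LIMIT = 30_000; // characters, per the threshold documented above

async function summarizeTranscript(transcript: string): Promise<string> {
  if (transcript.length <= CHUNK_LIMIT) return summarize(transcript);

  // Split into fixed-size chunks (the real splitter may respect message boundaries).
  const chunks: string[] = [];
  for (let i = 0; i < transcript.length; i += CHUNK_LIMIT) {
    chunks.push(transcript.slice(i, i + CHUNK_LIMIT));
  }

  // Summarize each chunk independently, then combine into one unified summary.
  const partials = await Promise.all(chunks.map((chunk) => summarize(chunk)));
  return summarize(`Combine these partial summaries into one:\n\n${partials.join("\n\n")}`);
}
```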
Model Selection:
- Default: Uses your Claude Code CLI default model
- Override with `--summary-model haiku` or `--summary-model sonnet`
- Supports full model names (e.g., `claude-3-5-sonnet-20241022`)

Timeout & Retries:
- Default timeout: 360 seconds (6 minutes)
- Override with `--summary-timeout`
- Automatic retries: up to 2 attempts on failure

When to Use:
- No API key required (uses existing Claude Code subscription)
- Handles extremely large transcripts via chunking
- Works offline (if Claude Code CLI works offline)
Trade-offs:
- Slower than Gemini API (spawns subprocess)
- Chunking may lose some context coherence for very large sessions
- Requires Claude Code CLI to be installed and authenticated
## File Structure
Claude Code stores sessions in:
```
~/.claude/projects/{project-path-with-hyphens}/{sessionId}.jsonl
```

For example, a project at `/Users/alice/my-app` becomes:

```
~/.claude/projects/-Users-alice-my-app/{sessionId}.jsonl
```
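If you want to locate a session directory yourself, here is a small TypeScript sketch of the mapping, assuming only that slashes become hyphens as in the example above (Claude Code may normalize other characters as well); `sessionDir` is a hypothetical helper, not part of ccprune.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Map an absolute project path to its Claude Code projects directory.
// Assumption: every "/" becomes "-", matching the example above.
function sessionDir(projectPath: string, configDir = join(homedir(), ".claude")): string {
  return join(configDir, "projects", projectPath.replace(/\//g, "-"));
}

// sessionDir("/Users/alice/my-app")
//   -> "/Users/alice/.claude/projects/-Users-alice-my-app"
```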
## Environment Variables
### CLAUDE_CONFIG_DIR

By default, ccprune looks for session files in `~/.claude`. If Claude Code is configured to use a different directory, you can specify it with the `CLAUDE_CONFIG_DIR` environment variable:

```bash
CLAUDE_CONFIG_DIR=/custom/path/to/claude ccprune
```

### GEMINI_API_KEY
When set, ccprune automatically uses Gemini 2.5 Flash for summarization (recommended). Get your free API key from Google AI Studio.
```bash
export GEMINI_API_KEY=your_api_key_here
ccprune # automatically uses Gemini 2.5 Flash
```

Use `--gemini` for Gemini 3 Pro, or `--claude-code` to force Claude Code CLI.

## Migrating from claude-prune
If you were using the original claude-prune package, ccprune v3.x has these changes:

```bash
# claude-prune v1.x (message-count based, summary was opt-in)
claude-prune -k 10 --summarize-pruned

# ccprune v2.x (percentage-based, summary enabled by default)
ccprune                    # defaults to 20% of messages
ccprune --keep-percent 25  # keep latest 25% of messages

# ccprune v3.x (token-based, summary enabled by default)
ccprune                      # defaults to 55K tokens
ccprune -k 55000             # keep 55K tokens
ccprune --keep-tokens 80000  # keep 80K tokens
```

Key changes in v3.x:
- Token-based pruning: `-k` now means tokens, not message count
- Removed: `-p, --keep-percent` flag (replaced by the token-based approach)
- Auto-skip: sessions under 55K tokens are not pruned
- Lenient boundary: includes one extra message at the boundary to preserve context
- Summary is enabled by default (use `--no-summary` to disable)
- Re-pruning synthesizes old summary + new pruned content into one summary

Key changes in v4.x:
- Accurate token counting: uses Claude's cumulative usage data (`input_tokens` + `cache_read` + `cache_creation`) to match the Claude Code UI
- Proportional scaling: per-message tokens are scaled to match the total context for accurate pruning (see the sketch below)
- `--resume-model`: specify which model to use when auto-resuming (opus, sonnet, haiku, opusplan)
- 55K default: results in ~70K total context after Claude Code adds the system prompt, CLAUDE.md, and other overhead
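A sketch of what proportional scaling means here, as one possible interpretation of the bullet above; `scaleToContextTotal` is a hypothetical name, not the package's actual code.

```typescript
// Scale rough per-message token estimates so they sum to the context total
// reported by Claude's cumulative usage data.
function scaleToContextTotal(perMessageEstimates: number[], contextTotal: number): number[] {
  const estimatedSum = perMessageEstimates.reduce((a, b) => a + b, 0);
  if (estimatedSum === 0) return perMessageEstimates;
  const factor = contextTotal / estimatedSum;
  return perMessageEstimates.map((tokens) => Math.round(tokens * factor));
}
```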
## Development

```bash
# Clone and install
git clone https://github.com/nicobailon/claude-prune.git
cd claude-prune
bun install

# Run tests
bun run test

# Build
bun run build

# Test locally
./dist/index.js --help
```

This project is a fork of claude-prune by Danny Aziz. Thanks for the original implementation!
## License

MIT