# ccprune

Prune early messages from Claude Code session transcripts with AI summarization.
Next time your Claude Code context is running low, just quit Claude Code and run `npx ccprune` - it resumes you back into your last thread, now compacted with an intelligent rolling summary. Run it again the next time context runs low; the summaries stack, so context just keeps rolling forward.
> Fork of claude-prune with enhanced features: token-based pruning, AI summarization enabled by default, and improved UX.
- Zero-Config Default: Just run ccprune - auto-detects latest session, keeps 55K tokens (~70K after Claude Code adds system context)
- Token-Based Pruning: Prunes based on actual token count, not message count
- Smart Threshold: Automatically skips pruning if session is under 55K tokens
- AI Summarization: Automatically generates a summary of pruned content (enabled by default)
- Summary Synthesis: Re-pruning synthesizes old summary + new pruned content into one cohesive summary
- Small Session Warning: Prompts for confirmation when auto-selecting sessions with < 5 messages
- Safe by Default: Always preserves session summaries and metadata
- Auto Backup: Creates timestamped backups before modifying files
- Restore Support: Easily restore from backups with the restore command
- Dry-Run Preview: Preview changes and summary before committing
## Installation

```bash
# Using npx (Node.js)
npx ccprune
```

```bash
# Using npm
npm install -g ccprune

# Using bun
bun install -g ccprune
```

## Quick Start
1. Quit Claude Code - press `Ctrl+C` or type `/quit`
2. Run prune from the same project directory:
```bash
npx ccprune
```

That's it! ccprune auto-detects your latest session, prunes old messages (keeping a summary), and resumes automatically.
## Setup (Recommended)
For fast, high-quality summarization, set up a Gemini API key:
1. Get a free key from Google AI Studio
2. Add to your shell profile (~/.zshrc or ~/.bashrc):
```bash
export GEMINI_API_KEY=your_key
```
3. Restart your terminal or run `source ~/.zshrc`

With `GEMINI_API_KEY` set, ccprune automatically uses Gemini 2.5 Flash for fast summarization without chunking.

Note: If `GEMINI_API_KEY` is not set, ccprune automatically falls back to Claude Code CLI for summarization (no additional setup required).

## Usage
```bash
# Zero-config: auto-detects latest session, keeps 55K tokens
ccprune

# Pick from available sessions interactively
ccprune --pick

# Explicit session ID (if you need a specific session)
ccprune <sessionId>

# Explicit token limit
ccprune --keep 55000
ccprune --keep-tokens 80000

# Subcommands
ccprune restore [--dry-run]
```

### Arguments
- `sessionId`: (Optional) UUID of the Claude Code session. Auto-detects the latest if omitted.

### Subcommands
| Subcommand | Description |
|------------|-------------|
| `restore` | Restore a session from the latest backup |
| `restore --dry-run` | Preview restore without making changes |

### Options
| Option | Description |
|--------|-------------|
| `--pick` | Interactively select from available sessions |
| `-n, --no-resume` | Skip automatic session resume |
| `--yolo` | Resume with `--dangerously-skip-permissions` |
| `--resume-model` | Model for resumed session (opus, sonnet, haiku, opusplan) |
| `-k, --keep` | Number of tokens to retain (default: 55000) |
| `--keep-tokens` | Number of tokens to retain (alias for `-k`) |
| `--dry-run` | Preview changes and summary without modifying files |
| `--no-summary` | Skip AI summarization of pruned messages |
| `--summary-model` | Model for summarization (haiku, sonnet, or full name) |
| `--summary-timeout` | Timeout for summarization in milliseconds (default: 360000) |
| `--gemini` | Use Gemini 3 Pro for summarization |
| `--gemini-flash` | Use Gemini 2.5 Flash for summarization |
| `--claude-code` | Use Claude Code CLI for summarization (chunks large transcripts) |
| `--prune-tools` | Replace all non-protected tool outputs with placeholders |
| `--prune-tools-ai` | Use AI to identify which tool outputs to prune |
| `--prune-tools-dedup` | Deduplicate identical tool calls, keep only the most recent |
| `--prune-tools-max` | Maximum savings: dedup + AI analysis combined |
| `--prune-tools-keep` | Comma-separated tools to never prune (default: Edit,Write,TodoWrite,TodoRead,AskUserQuestion) |
| `-h, --help` | Show help information |
| `-V, --version` | Show version number |

If no session ID is provided, ccprune auto-detects the most recently modified session. If no keep option is specified, it defaults to 55,000 tokens (~70K actual context after Claude Code adds the system prompt and CLAUDE.md).
Summarization priority (see the sketch below):

1. `--claude-code` flag: Force Claude Code CLI (chunks transcripts >30K chars)
2. `--gemini` or `--gemini-flash` flags: Use Gemini API
3. Auto-detect: If `GEMINI_API_KEY` is set, uses Gemini 2.5 Flash
4. Fallback: Claude Code CLI (no API key needed)
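The same priority order, as a small TypeScript sketch. This is purely illustrative: `pickSummarizationBackend` and the `SummaryFlags` fields are hypothetical names, not ccprune's internal API.

```typescript
type Backend = "claude-code" | "gemini-pro" | "gemini-flash";

interface SummaryFlags {
  claudeCode?: boolean;  // --claude-code
  gemini?: boolean;      // --gemini
  geminiFlash?: boolean; // --gemini-flash
}

// Resolve the summarization backend in the documented priority order.
function pickSummarizationBackend(flags: SummaryFlags, env = process.env): Backend {
  if (flags.claudeCode) return "claude-code";    // 1. explicit --claude-code wins
  if (flags.gemini) return "gemini-pro";         // 2. explicit Gemini flags
  if (flags.geminiFlash) return "gemini-flash";
  if (env.GEMINI_API_KEY) return "gemini-flash"; // 3. auto-detect via API key
  return "claude-code";                          // 4. fallback, no API key needed
}
```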
### Examples

```bash
# Simplest: auto-detect, prune, and resume automatically
npx ccprune

# Prune only (don't resume)
npx ccprune -n

# Resume in yolo mode (--dangerously-skip-permissions)
npx ccprune --yolo

# Resume with a specific model (e.g., Opus 4.5)
npx ccprune --resume-model opus

# Combine yolo mode with Opus
npx ccprune --yolo --resume-model opus

# Pick from available sessions interactively
npx ccprune --pick

# Keep 55K tokens (default)
npx ccprune --keep 55000

# Keep 80K tokens (less aggressive pruning)
npx ccprune --keep-tokens 80000

# Preview what would be pruned (shows summary preview too)
npx ccprune --dry-run

# Skip summarization for faster pruning
npx ccprune --no-summary

# Use Claude Code CLI with haiku model (faster/cheaper)
npx ccprune --claude-code --summary-model haiku

# Use Gemini 3 Pro for summarization
npx ccprune --gemini

# Use Gemini 2.5 Flash (default when GEMINI_API_KEY is set)
npx ccprune --gemini-flash

# Force Claude Code CLI for summarization
npx ccprune --claude-code

# Target a specific session by ID
npx ccprune 03953bb8-6855-4e53-a987-e11422a03fc6 --keep 55000

# Restore from the latest backup
npx ccprune restore 03953bb8-6855-4e53-a987-e11422a03fc6
```

## Tool Output Pruning (Default)
Tool pruning runs automatically to reduce tokens before summarization:
1. Dedup: Identical tool calls are deduplicated (keeps only most recent)
2. AI analysis: Intelligently prunes irrelevant outputs using your summarization backend
```bash
# Default behavior (dedup + AI) - runs automatically
ccprune

# Disable automatic tool pruning
ccprune --skip-tool-pruning

# Explicit modes for specific behavior:
ccprune --prune-tools        # Simple: replace ALL outputs (no AI)
ccprune --prune-tools-dedup  # Dedup only (no AI)
ccprune --prune-tools-ai     # AI only (no dedup)
ccprune --prune-tools-max    # Explicit dedup + AI (same as default)

# Custom protected tools
ccprune --prune-tools-keep "Edit,Write,Bash"
```

Protected tools (never pruned by default):
- `Edit`, `Write` - file modification context
- `TodoWrite`, `TodoRead` - task tracking
- `AskUserQuestion` - user interaction

Modes explained:

- Default (no flags): Runs dedup first (free), then AI analysis - maximum savings
- Simple (`--prune-tools`): Replaces all non-protected tool outputs with `[Pruned: {tool} output - {bytes} bytes]`
- AI (`--prune-tools-ai`): Uses your summarization backend (Gemini or Claude Code CLI) to intelligently identify which outputs are no longer relevant
- Dedup (`--prune-tools-dedup`): Keeps only the most recent output when the same tool is called with identical input, annotating it with `[{total} total calls]` (see the sketch below)
- Skip (`--skip-tool-pruning`): Disables automatic tool pruning entirely
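A minimal TypeScript sketch of the dedup idea described above, assuming a simplified tool-call shape; `dedupToolCalls` and `ToolCall` are hypothetical names, not ccprune's actual implementation.

```typescript
interface ToolCall {
  tool: string;
  input: string;  // serialized tool input
  output: string;
}

// Keep only the most recent output for identical (tool, input) pairs and
// annotate the survivor with how many calls it replaces.
function dedupToolCalls(calls: ToolCall[]): ToolCall[] {
  const byKey = new Map<string, { last: ToolCall; total: number }>();
  for (const call of calls) {
    const key = `${call.tool}\u0000${call.input}`;
    const prev = byKey.get(key);
    byKey.set(key, { last: call, total: (prev?.total ?? 0) + 1 });
  }
  return [...byKey.values()].map(({ last, total }) =>
    total > 1 ? { ...last, output: `${last.output} [${total} total calls]` } : last
  );
}
```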
## How It Works

```
BEFORE AFTER FIRST PRUNE AFTER RE-PRUNE
────── ──────────────── ──────────────
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ msg 1 (old) │─┐ │ [SUMMARY] │─┐ │ [NEW SUMMARY] │ ◄─ synthesized
│ msg 2 (old) │ │ │ "Previously.."│ │ │ (old+middle) │
│ ... │ ├──► ├───────────────┤ │ ├───────────────┤
│ msg N (old) │─┘ │ msg N+1 (kept)│ ├───────► │ msg X (kept) │
├───────────────┤ │ msg N+2 (kept)│ │ │ msg Y (kept) │
│ msg N+1 (new) │─────►│ msg N+3 (kept)│─┘ │ msg Z (kept) │
│ msg N+2 (new) │ └───────────────┘ └───────────────┘
│ msg N+3 (new) │
└───────────────┘ ▲ ▲
│ │
old msgs become old summary + middle
summary, recent kept synthesized, recent kept
```

1. Locates Session File: Finds `$CLAUDE_CONFIG_DIR/projects/{project-path}/{sessionId}.jsonl`
2. Counts Tokens: Uses Claude's cumulative usage data from the last message: input_tokens + cache_read_input_tokens + cache_creation_input_tokens. This matches Claude Code's UI display exactly
3. Early Exit: If total tokens ≤ threshold (55K default), skips pruning and auto-resumes
4. Preserves Critical Data: Always keeps the first line (file-history-snapshot or session metadata)
5. Token-Based Cutoff: Scans right-to-left, accumulating tokens until adding the next message would exceed the threshold (see the sketch after this list)
6. Content Extraction: Extracts text from messages, including tool_result outputs and thinking blocks. Tool calls become [Used tool: ToolName] placeholders to provide context without verbose tool I/O
7. Orphan Cleanup: Removes tool_result blocks in kept messages that reference tool_use blocks from pruned messages
8. AI Summarization: Generates a structured summary with sections: Overview, What Was Accomplished, Files Modified, Key Technical Details, Current State & Pending Work
9. Summary Synthesis: Re-pruning synthesizes old summary + new pruned content into one cohesive summary
- Gemini (default with API key): Handles large transcripts natively without chunking
- Claude Code CLI (fallback): May chunk transcripts >30K characters (see Claude Code CLI Summarization below)
10. Safe Backup: Creates timestamped backup in prune-backup/ before modifying
11. Auto-Resume: Optionally resumes the Claude Code session after pruning
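A compact TypeScript sketch of steps 2 and 5 (token counting and the right-to-left cutoff). It is an illustration under simplifying assumptions: the `SessionMessage` shape and the per-message `estimatedTokens` field are hypothetical, not ccprune's internals.

```typescript
interface SessionMessage {
  estimatedTokens: number;  // per-message estimate (assumed precomputed)
  usage?: {
    input_tokens: number;
    cache_read_input_tokens: number;
    cache_creation_input_tokens: number;
  };
}

// Step 2: total context as Claude Code's UI reports it, from the last message's usage.
function totalContextTokens(messages: SessionMessage[]): number {
  const u = messages[messages.length - 1]?.usage;
  return u
    ? u.input_tokens + u.cache_read_input_tokens + u.cache_creation_input_tokens
    : 0;
}

// Step 5: scan right-to-left, accumulating tokens; everything before the cutoff is pruned.
// (ccprune's lenient boundary additionally keeps one extra message at the cutoff.)
function cutoffIndex(messages: SessionMessage[], keepTokens: number): number {
  let kept = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (kept + messages[i].estimatedTokens > keepTokens) return i + 1;
    kept += messages[i].estimatedTokens;
  }
  return 0; // everything fits; nothing to prune
}
```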
## Claude Code CLI Summarization

When using the `--claude-code` flag (or when `GEMINI_API_KEY` is not set), ccprune uses the Claude Code CLI for summarization with these specific behaviors:

Chunking for Large Transcripts (see the sketch below):
- Transcripts >30,000 characters are automatically split into chunks
- Each chunk is summarized independently
- Chunk summaries are then combined into a final unified summary
- Why: Ensures reliable summarization even for very long sessions
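A rough TypeScript sketch of this chunk-then-combine flow. The 30,000-character threshold comes from the behavior described above, but the fixed-size splitting and the `summarize` helper are assumptions for illustration.

```typescript
// Hypothetical helper that asks the Claude Code CLI to summarize one piece of text.
declare function summarize(text: string): Promise<string>;

const CHUNK_LIMIT = 30_000; // characters, per the threshold documented above

async function summarizeTranscript(transcript: string): Promise<string> {
  if (transcript.length <= CHUNK_LIMIT) return summarize(transcript);

  // Split into fixed-size chunks (the real splitter may respect message boundaries).
  const chunks: string[] = [];
  for (let i = 0; i < transcript.length; i += CHUNK_LIMIT) {
    chunks.push(transcript.slice(i, i + CHUNK_LIMIT));
  }

  // Summarize each chunk independently, then combine into one unified summary.
  const partials = await Promise.all(chunks.map((chunk) => summarize(chunk)));
  return summarize(`Combine these partial summaries into one:\n\n${partials.join("\n\n")}`);
}
```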
Model Selection:
- Default: Uses your Claude Code CLI default model
- Override with `--summary-model haiku` or `--summary-model sonnet`
- Supports full model names (e.g., `claude-3-5-sonnet-20241022`)

Timeout & Retries:
- Default timeout: 360 seconds (6 minutes)
- Override with `--summary-timeout`
- Automatic retries: up to 2 attempts on failure

When to Use:
- No API key required (uses existing Claude Code subscription)
- Handles extremely large transcripts via chunking
- Works offline (if Claude Code CLI works offline)
Trade-offs:
- Slower than Gemini API (spawns subprocess)
- Chunking may lose some context coherence for very large sessions
- Requires Claude Code CLI to be installed and authenticated
## File Structure
Claude Code stores sessions in:
```
~/.claude/projects/{project-path-with-hyphens}/{sessionId}.jsonl
```

For example, a project at `/Users/alice/my-app` becomes:

```
~/.claude/projects/-Users-alice-my-app/{sessionId}.jsonl
```
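If you want to locate a session directory yourself, here is a small TypeScript sketch of the mapping, assuming only that slashes become hyphens as in the example above (Claude Code may normalize other characters as well); `sessionDir` is a hypothetical helper, not part of ccprune.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Map an absolute project path to its Claude Code projects directory.
// Assumption: every "/" becomes "-", matching the example above.
function sessionDir(projectPath: string, configDir = join(homedir(), ".claude")): string {
  return join(configDir, "projects", projectPath.replace(/\//g, "-"));
}

// sessionDir("/Users/alice/my-app")
//   -> "/Users/alice/.claude/projects/-Users-alice-my-app"
```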
## Environment Variables
### CLAUDE_CONFIG_DIR

By default, ccprune looks for session files in `~/.claude`. If Claude Code is configured to use a different directory, you can specify it with the `CLAUDE_CONFIG_DIR` environment variable:

```bash
CLAUDE_CONFIG_DIR=/custom/path/to/claude ccprune
```

### GEMINI_API_KEY
When set, ccprune automatically uses Gemini 2.5 Flash for summarization (recommended). Get your free API key from Google AI Studio.
```bash
export GEMINI_API_KEY=your_api_key_here
ccprune # automatically uses Gemini 2.5 Flash
```

Use `--gemini` for Gemini 3 Pro, or `--claude-code` to force Claude Code CLI.

## Migrating from claude-prune
If you were using the original claude-prune package, ccprune v3.x has these changes:

```bash
# claude-prune v1.x (message-count based, summary was opt-in)
claude-prune -k 10 --summarize-pruned

# ccprune v2.x (percentage-based, summary enabled by default)
ccprune                    # defaults to 20% of messages
ccprune --keep-percent 25  # keep latest 25% of messages

# ccprune v3.x (token-based, summary enabled by default)
ccprune                      # defaults to 55K tokens
ccprune -k 55000             # keep 55K tokens
ccprune --keep-tokens 80000  # keep 80K tokens
```

Key changes in v3.x:
- Token-based pruning: `-k` now means tokens, not message count
- Removed: `-p, --keep-percent` flag (replaced by the token-based approach)
- Auto-skip: sessions under 55K tokens are not pruned
- Lenient boundary: includes one extra message at the boundary to preserve context
- Summary is enabled by default (use `--no-summary` to disable)
- Re-pruning synthesizes old summary + new pruned content into one summary

Key changes in v4.x:
- Accurate token counting: uses Claude's cumulative usage data (`input_tokens` + `cache_read` + `cache_creation`) to match the Claude Code UI
- Proportional scaling: per-message tokens are scaled to match the total context for accurate pruning (see the sketch below)
- `--resume-model`: specify which model to use when auto-resuming (opus, sonnet, haiku, opusplan)
- 55K default: results in ~70K total context after Claude Code adds the system prompt, CLAUDE.md, and other overhead
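A sketch of what proportional scaling means here, as one possible interpretation of the bullet above; `scaleToContextTotal` is a hypothetical name, not the package's actual code.

```typescript
// Scale rough per-message token estimates so they sum to the context total
// reported by Claude's cumulative usage data.
function scaleToContextTotal(perMessageEstimates: number[], contextTotal: number): number[] {
  const estimatedSum = perMessageEstimates.reduce((a, b) => a + b, 0);
  if (estimatedSum === 0) return perMessageEstimates;
  const factor = contextTotal / estimatedSum;
  return perMessageEstimates.map((tokens) => Math.round(tokens * factor));
}
```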
## Development

```bash
# Clone and install
git clone https://github.com/nicobailon/claude-prune.git
cd claude-prune
bun install

# Run tests
bun run test

# Build
bun run build

# Test locally
./dist/index.js --help
```

This project is a fork of claude-prune by Danny Aziz. Thanks for the original implementation!
## License

MIT