Comprehensive benchmarking and performance analysis tools for Claude Code versions managed by CVM.

- `--version` Spawn Benchmarks: Measure Claude startup time using the `--version` flag
- Interactive PTY Benchmarks: Measure full interactive startup time with terminal signals
- Multi-Run Comparison: Compare multiple benchmark runs to verify consistency
- HTML Reports: Beautiful Chart.js-powered performance visualization
- Session Cleanup: Automatic cleanup of test sessions for fair benchmarking
- Viability Detection: Identify minimum viable Claude Code versions

## Installation

```bash
# Install via NPM
npm install -g @clawd/cvm-benchmark
```

### Manual installation

```bash
# Clone into a local directory
git clone https://github.com/clawd/cvm-benchmark.git
cd cvm-benchmark
npm install

# Link to CVM
ln -s $(pwd)/index.js ~/.cvm/plugins/benchmark.js

# Verify
cvm plugins
```

## Usage

### As a CVM plugin

```bash
# Benchmark a specific version (interactive startup test)
cvm benchmark 2.0.42

# Benchmark all installed versions
cvm benchmark --all

# Compare multiple benchmark runs
cvm benchmark --compare 1 2

# Check loaded plugins
cvm plugins
```

### Standalone

```bash
# Run comprehensive benchmark suite
node index.js all

# Compare benchmark runs
node index.js compare 1 2

# Run specific benchmarks
node lib/benchmark-version.js
node lib/benchmark-interactive.js 2.0.42
node lib/benchmark-interactive-all.js
node lib/comprehensive-suite.js 3
```

## Benchmark Types

### `--version` Spawn Benchmarks

- Spawns Claude with the `--version` flag
- Measures process spawn and execution time
- Fast, simple performance indicator
- Ideal for quick version comparison
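
A minimal sketch of this kind of spawn measurement, assuming a Node.js script that runs the binary synchronously and times each run with a high-resolution clock. `timeVersionSpawn` is a hypothetical helper, not part of the plugin, and `process.execPath` (the running Node binary) stands in for a Claude executable so the example runs anywhere.

```javascript
const { spawnSync } = require('child_process');

/**
 * Time how long `<command> --version` takes to spawn and exit.
 * Hypothetical helper for illustration; not the plugin's implementation.
 * Returns per-run wall-clock times in milliseconds plus their mean.
 */
function timeVersionSpawn(command, runs = 3) {
  const times = [];
  for (let i = 0; i < runs; i++) {
    const start = process.hrtime.bigint();
    const result = spawnSync(command, ['--version'], { encoding: 'utf8' });
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    // spawnSync reports launch failures (e.g. ENOENT) via result.error
    if (result.error) throw result.error;
    times.push(elapsedMs);
  }
  const mean = times.reduce((sum, t) => sum + t, 0) / times.length;
  return { command, runs, times, mean };
}

// Assumption: the Node binary is used as a stand-in for a Claude binary.
const summary = timeVersionSpawn(process.execPath, 3);
console.log(`${summary.runs} runs, mean ${summary.mean.toFixed(1)} ms`);
```

Averaging several runs smooths out one-off effects such as a cold file-system cache on the first spawn.
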

### Interactive PTY Benchmarks

- Spawns Claude in a pseudo-terminal (PTY)
- Detects the ready state via terminal signals:
  - `ESC[?2004h` - Bracketed paste mode
  - `ESC[?1004h` - Focus events
  - `>` - Prompt character
- No timeout-based detection (signal-based only)
- Measures real interactive startup time
- Cleans up session files after each run

Trust Prompt Handling:

- Benchmark runs with `cwd: process.cwd()` (the directory where the script runs)
- Older versions (0.2.x, 1.0.x) show a "Do you trust the files" security prompt
- Auto-accepts the trust prompt by sending the Enter key
- Each version spawns fresh in the benchmark directory

Version Requirement Detection:

- Detects versions < 1.0.24 that show a "needs update" error
- Extracts the minimum version requirement (expected: 1.0.24)
- Returns `result: 'error_detected'` with the full error message
- Warns if the minimum version changes from 1.0.24

## Output Structure

All benchmark data is stored in `~/.cvm/benchmarks/`:

```
~/.cvm/benchmarks/
├── benchmarks-all-3run.json              # --version benchmarks for all versions
├── benchmark-startup-{version}.json      # Individual interactive benchmarks
├── STARTUP_COMPARISON.html               # Generated performance report
├── run-1/                                # Multi-run comparison data
│   ├── version/
│   │   └── benchmarks-all-3run.json
│   ├── interactive/
│   │   ├── benchmark-startup-0-2-9.json
│   │   ├── benchmark-startup-2-0-42.json
│   │   └── ...
│   └── metadata.json
└── run-2/
    └── ...
```

## Reports

### Individual Run Report

Generated from individual benchmark runs, showing:

- `--version` spawn times across all versions
- Interactive startup times across all versions
- Version viability markers (1.0.24+)
- Performance trends and outliers

### Multi-Run Comparison Report

Overlays multiple benchmark runs to show:

- Measurement consistency across runs
- Performance variance
- Reliability of benchmark data

## Version States

The benchmark tool detects three version states:

1. `error_detected`: Pre-0.2.103 versions that show an error before the UI
2. `ui_then_exit`: Versions 0.2.103-1.0.23 that show the UI with an error but close immediately
3. `ready`: Versions 1.0.24+ that are actually interactive

Minimum viable version: 1.0.24
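
The sketch below shows one way these states could be derived, assuming the version thresholds listed above and the three terminal signals used for interactive benchmarks. `compareVersions`, `expectedState`, and `detectSignals` are hypothetical names for illustration, not part of the plugin's API, and the plain substring match for the prompt character is a deliberate simplification.

```javascript
// Thresholds taken from the version states listed above.
const UI_FIRST_SHOWN = '0.2.103';
const MIN_VIABLE = '1.0.24';

/** Compare two dotted numeric versions; negative if a < b, positive if a > b. */
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

/** Map a version string to the state the benchmark is expected to report. */
function expectedState(version) {
  if (compareVersions(version, UI_FIRST_SHOWN) < 0) return 'error_detected';
  if (compareVersions(version, MIN_VIABLE) < 0) return 'ui_then_exit';
  return 'ready';
}

/** Check accumulated PTY output for the three readiness signals. */
function detectSignals(output) {
  return {
    bracketedPaste: output.includes('\x1b[?2004h'), // ESC[?2004h
    focusEvents: output.includes('\x1b[?1004h'),    // ESC[?1004h
    prompt: output.includes('>'),                   // simplified prompt check
  };
}

console.log(expectedState('0.2.9'));   // error_detected
console.log(expectedState('1.0.23'));  // ui_then_exit
console.log(expectedState('2.0.42'));  // ready
console.log(detectSignals('\x1b[?2004h\x1b[?1004h> '));
```

Versions are compared numerically per component rather than as strings, since a string comparison would place `0.2.9` after `0.2.103`.
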

## Performance Data

Example benchmark results:

```json
{
  "version": "2.0.42",
  "results": [
    {
      "time": 980,
      "result": "ready",
      "reason": "all terminal signals received and process stable",
      "signals": {
        "bracketedPaste": true,
        "focusEvents": true,
        "prompt": true
      }
    }
  ]
}
```

## API

### Plugin Interface

```javascript
module.exports = {
  name: 'benchmark',
  version: '0.1.0',
  description: 'Benchmark and analyze Claude Code performance',
  commands: [/* ... */],
  hooks: {
    afterInstall: (version) => { /* ... */ }
  }
};
```

### Programmatic Usage

```javascript
const benchmarkVersion = require('@clawd/cvm-benchmark/lib/benchmark-version');
const benchmarkInteractive = require('@clawd/cvm-benchmark/lib/benchmark-interactive');
const compareRuns = require('@clawd/cvm-benchmark/lib/compare-runs');
const comprehensiveSuite = require('@clawd/cvm-benchmark/lib/comprehensive-suite');

// Run benchmarks (wrapped in an async function: CommonJS has no top-level await)
async function main() {
  await benchmarkVersion.run({ runs: 3 });
  await benchmarkInteractive.run('2.0.42', 3);
  await comprehensiveSuite.runAll({ runNumber: 3 });
  compareRuns.compare(['1', '2', '3']);
}

main().catch(console.error);
```

## Requirements

- Node.js >= 14.0.0
- CVM installed with at least one Claude Code version
- `node-pty` for interactive benchmarks

## Development

```bash
# Clone the repo
git clone https://github.com/clawd/cvm-benchmark.git
cd cvm-benchmark

# Install dependencies
npm install

# Run tests
npm test

# Run benchmarks
node lib/benchmark-interactive.js 2.0.42
```

## License

MIT

## Related Projects

- @clawd/cvm - Claude Version Manager
- Claude Code - Official CLI for Claude

Built by the CVM team to enable comprehensive performance testing across all Claude Code versions.
---
Status: Production-ready, actively used for benchmarking 249 Claude Code versions (0.2.x → 2.0.x)