Voice input for Claude Code — real-time local speech-to-text via whisper.cpp

Voice input for Claude Code. Speak instead of typing.
A local, open-source MCP server that streams your voice to whisper.cpp for real-time transcription and sends the text to Claude Code. Everything runs on your machine — no cloud APIs, no network calls.
> macOS only. Sotto uses osascript and the Cocoa framework for its floating status indicator. Linux and Windows are not supported.
```
You speak → sotto streams audio to whisper-stream for live transcription
         → a floating indicator shows status and live text
         → silence detected or you click stop → text returned to Claude
         → Claude treats it as your message and responds
```

- macOS (Apple Silicon recommended, Intel works too)
- Node.js >= 18
- whisper-cpp — local speech-to-text with live streaming
Install system dependencies:

```bash
brew install whisper-cpp
```

Then install sotto and run setup:

```bash
npm install -g sotto
sotto-setup
```

The setup command will:

1. Verify whisper-stream is installed (ships with whisper-cpp)
2. Download the Whisper Base English model (~150MB) to `~/.local/share/sotto/models/`
3. Create a default config at `~/.config/sotto/config.json`

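
To see what a finished setup should leave behind, the three steps can be restated as a small check. This is only a sketch and not part of sotto: `Probe`, `missingPieces`, and `fakeProbe` are hypothetical names, and the model file name is taken from the default `modelPath` in the Configuration table.

```typescript
// Illustrative only: not part of sotto.
interface Probe {
  commandExists(name: string): boolean; // e.g. backed by a PATH lookup
  fileExists(path: string): boolean;    // e.g. backed by fs.existsSync
}

// Locations relative to the home directory, as listed in this README.
const MODEL_FILE = ".local/share/sotto/models/ggml-base.en.bin";
const CONFIG_FILE = ".config/sotto/config.json";

// Return a human-readable list of whatever setup has not produced yet.
function missingPieces(home: string, probe: Probe): string[] {
  const missing: string[] = [];
  if (!probe.commandExists("whisper-stream")) {
    missing.push("whisper-stream (brew install whisper-cpp)");
  }
  if (!probe.fileExists(`${home}/${MODEL_FILE}`)) {
    missing.push("Whisper model (run sotto-setup)");
  }
  if (!probe.fileExists(`${home}/${CONFIG_FILE}`)) {
    missing.push("config file (run sotto-setup)");
  }
  return missing;
}

// Example with a fake probe: whisper-stream is installed, nothing else is.
const fakeProbe: Probe = {
  commandExists: (name) => name === "whisper-stream",
  fileExists: () => false,
};
console.log(missingPieces("/Users/me", fakeProbe));
```

With the fake probe above, the model and the config file are reported as missing. In a real script the probe would be backed by the file system and a `PATH` lookup.
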
Then register with Claude Code:

```bash
# Available in all projects (recommended for most users)
claude mcp add sotto -s user -- sotto
```

Use user scope if you want voice input everywhere. Use local scope if you only want sotto in a specific project.
On first use, macOS will prompt you to grant microphone access to your terminal app (Terminal, iTerm2, etc.) in System Settings > Privacy & Security > Microphone.
Usage
In Claude Code, type:

```
/sotto:listen
```

A floating indicator appears at the bottom of your screen showing:

- Recording status (listening / transcribing)
- Live transcription text as you speak
- A stop button to end recording early
Recording stops automatically after silence is detected, or when you click the stop button. Your speech is transcribed and sent to Claude as text.
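
The stop conditions can be pictured as a small decision function. This is a sketch under stated assumptions, not sotto's actual logic: `stopReason`, `RecordingState`, and `StopPolicy` are hypothetical names, the `silenceTimeout` value is invented for illustration (this README does not state one), and treating `maxDuration` as a hard cap is inferred from its "Max recording seconds" description.

```typescript
// Illustrative only: not sotto's actual source.
type StopReason = "stop-clicked" | "silence" | "max-duration";

interface RecordingState {
  elapsedSec: number;   // time since recording started
  silentForSec: number; // length of the current run of silence
  stopClicked: boolean; // user pressed the stop button
}

interface StopPolicy {
  maxDuration: number;    // the maxDuration setting (default 30)
  silenceTimeout: number; // hypothetical: not documented in this README
}

// Return why recording should end, or null to keep recording.
function stopReason(state: RecordingState, policy: StopPolicy): StopReason | null {
  if (state.stopClicked) return "stop-clicked";
  if (state.silentForSec >= policy.silenceTimeout) return "silence";
  if (state.elapsedSec >= policy.maxDuration) return "max-duration";
  return null;
}

const policy: StopPolicy = { maxDuration: 30, silenceTimeout: 2 };
console.log(stopReason({ elapsedSec: 4, silentForSec: 0.5, stopClicked: false }, policy)); // null
console.log(stopReason({ elapsedSec: 9, silentForSec: 2.5, stopClicked: false }, policy)); // silence
```

The stop button is checked first so that a click always wins, even in the middle of speech.
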
Configuration
Edit `~/.config/sotto/config.json`:

| Setting | Default | Env Var | Description |
|---|---|---|---|
| `modelPath` | `~/.local/share/sotto/models/ggml-base.en.bin` | `WHISPER_MODEL_PATH` | Path to GGML model |
| `language` | `en` | `WHISPER_LANGUAGE` | Language code |
| `maxDuration` | `30` | `WHISPER_MAX_DURATION` | Max recording seconds |

Environment variables take precedence over the config file.

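
The precedence rule (environment variable, then config file, then default) can be sketched as a merge. This is a minimal illustration, not sotto's actual loader: `resolveConfig`, `SottoConfig`, and `Env` are hypothetical names, and ignoring a non-numeric `WHISPER_MAX_DURATION` is an assumption of this sketch.

```typescript
// Illustrative only: not sotto's actual source.
interface SottoConfig {
  modelPath: string;
  language: string;
  maxDuration: number; // seconds
}

// Defaults copied from the settings table.
const DEFAULTS: SottoConfig = {
  modelPath: "~/.local/share/sotto/models/ggml-base.en.bin",
  language: "en",
  maxDuration: 30,
};

type Env = Record<string, string | undefined>;

// Merge defaults, config-file values, and environment variables,
// with the environment winning.
function resolveConfig(file: Partial<SottoConfig>, env: Env): SottoConfig {
  const merged: SottoConfig = { ...DEFAULTS, ...file };
  const modelPath = env["WHISPER_MODEL_PATH"];
  const language = env["WHISPER_LANGUAGE"];
  const maxDuration = env["WHISPER_MAX_DURATION"];
  if (modelPath) merged.modelPath = modelPath;
  if (language) merged.language = language;
  if (maxDuration) {
    const seconds = Number(maxDuration);
    // Assumption of this sketch: values that are not positive numbers are ignored.
    if (Number.isFinite(seconds) && seconds > 0) merged.maxDuration = seconds;
  }
  return merged;
}

// The file asks for German, the environment for French: the environment wins.
const config = resolveConfig({ language: "de" }, { WHISPER_LANGUAGE: "fr" });
console.log(config.language); // fr
```

Settings that appear in neither place fall back to the defaults from the table, so `config.maxDuration` above is still 30.
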
Troubleshooting
| Problem | Solution |
|---|---|
| "whisper-stream is not installed" | Run `brew install whisper-cpp` |
| "Model not found" | Run sotto-setup |
| "Microphone access denied" | Grant mic access to your terminal in System Settings > Privacy & Security > Microphone |
| No speech detected | Make sure your microphone is working and you're speaking loudly enough |
| Transcription is slow | The base model takes about 3 s to transcribe a 5 s clip on Apple Silicon. Try the tiny model for faster results. |

Development

```bash
git clone https://github.com/sourabhbgp/sotto.git
cd sotto
npm install
npm run build
npm test
```

License

MIT