MCP App Server for live speech transcription
An MCP App Server for live speech transcription using the Web Speech API.
Add to your MCP client configuration (stdio transport):
```json
{
  "mcpServers": {
    "transcript": {
      "command": "npx",
      "args": [
        "-y",
        "--silent",
        "--registry=https://registry.npmjs.org/",
        "@modelcontextprotocol/server-transcript",
        "--stdio"
      ]
    }
  }
}
```
To test local modifications, use this configuration (replace ~/code/ext-apps with your clone path):
```json
{
  "mcpServers": {
    "transcript": {
      "command": "bash",
      "args": [
        "-c",
        "cd ~/code/ext-apps/examples/transcript-server && npm run build >&2 && node dist/index.js --stdio"
      ]
    }
  }
}
```
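
If you keep several clones around, the local configuration above can be generated rather than edited by hand. The following is a minimal sketch, not part of this package: `StdioServerEntry` and `localTranscriptEntry` are hypothetical names, and the clone path is assumed to contain no spaces.

```typescript
// Shape of one entry under "mcpServers" in an MCP client configuration.
interface StdioServerEntry {
  command: string;
  args: string[];
}

// Build the local-development entry for a given clone path (hypothetical helper).
function localTranscriptEntry(clonePath: string): StdioServerEntry {
  const serverDir = `${clonePath}/examples/transcript-server`;
  return {
    command: "bash",
    args: [
      "-c",
      // Build output is redirected to stderr (>&2) so that stdout carries
      // only the messages of the stdio transport.
      `cd ${serverDir} && npm run build >&2 && node dist/index.js --stdio`,
    ],
  };
}

// Print a configuration equivalent to the JSON shown above.
const localConfig = {
  mcpServers: { transcript: localTranscriptEntry("~/code/ext-apps") },
};
console.log(JSON.stringify(localConfig, null, 2));
```

Paste the printed JSON into your MCP client configuration.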

Features
- Live Transcription: Real-time speech-to-text using the browser's Web Speech API
- Transitional Model Context: Streams interim transcriptions to the model via `ui/update-model-context`, allowing the model to see what the user is saying as they speak
- Audio Level Indicator: Visual feedback showing microphone input levels
- Send to Host: Button to send completed transcriptions as a `ui/message` to the MCP host
- Start/Stop Control: Toggle listening on and off
- Clear Transcript: Reset the transcript area
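
Two of the features above, interim streaming and the audio level indicator, reduce to small pure functions. The sketch below shows one plausible shape for them and is not taken from `src/mcp-app.ts`. All names (`TranscriptChunk`, `splitTranscript`, `modelContextText`, `rmsLevel`) are hypothetical, and using RMS for the level meter is an assumption. In a browser, the chunks would be derived from `SpeechRecognitionEvent` results and the samples could come from `AnalyserNode.getFloatTimeDomainData`.

```typescript
// One recognized chunk of speech, as a plain object so the sketch runs
// outside a browser (hypothetical type, not from the project).
interface TranscriptChunk {
  text: string;
  isFinal: boolean; // true once the recognizer has committed this chunk
}

interface TranscriptView {
  finalText: string; // committed text, suitable for sending to the host
  interimText: string; // provisional text that may still change
}

// Join parts with single spaces and collapse stray whitespace.
function tidy(parts: string[]): string {
  return parts.join(" ").replace(/\s+/g, " ").trim();
}

// Separate committed text from provisional text.
function splitTranscript(chunks: readonly TranscriptChunk[]): TranscriptView {
  const finals: string[] = [];
  const interims: string[] = [];
  for (const chunk of chunks) {
    (chunk.isFinal ? finals : interims).push(chunk.text);
  }
  return { finalText: tidy(finals), interimText: tidy(interims) };
}

// Text a UI might stream to the model while the user is still speaking.
function modelContextText(view: TranscriptView): string {
  return tidy([view.finalText, view.interimText]);
}

// Root-mean-square of audio samples in [-1, 1]; returns 0 for an empty buffer.
// Assumption: the indicator is driven by RMS; the project may compute it differently.
function rmsLevel(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumOfSquares = 0;
  samples.forEach((sample) => {
    sumOfSquares += sample * sample;
  });
  return Math.sqrt(sumOfSquares / samples.length);
}
```

Keeping these functions free of DOM types makes them easy to unit test. The browser-specific code then only has to map recognition results into `TranscriptChunk` objects and pass the resulting text to whatever sends `ui/update-model-context`.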

Prerequisites
- Node.js 18+
- Chrome, Edge, or Safari (Web Speech API support)
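
Browser support can be checked at runtime before starting a session. The sketch below is a hedged illustration rather than the project's own code. `findRecognitionCtor` and `RecognitionCtor` are hypothetical names, and the lookup assumes the recognizer is exposed as either `SpeechRecognition` or the prefixed `webkitSpeechRecognition`.

```typescript
// Minimal constructor type; the real SpeechRecognition interface is richer.
type RecognitionCtor = new () => unknown;

// Look up the speech recognizer constructor on a global-like object.
// Returns undefined when neither the standard nor the prefixed name exists.
function findRecognitionCtor(
  scope: Record<string, unknown>,
): RecognitionCtor | undefined {
  for (const key of ["SpeechRecognition", "webkitSpeechRecognition"]) {
    const candidate = scope[key];
    if (typeof candidate === "function") {
      return candidate as RecognitionCtor;
    }
  }
  return undefined;
}

// In a browser, pass the global object; in Node.js this reports no support.
const recognitionCtor = findRecognitionCtor(
  globalThis as unknown as Record<string, unknown>,
);
console.log(
  recognitionCtor ? "Web Speech API available" : "Web Speech API not available",
);
```

Taking the scope as a parameter keeps the check testable with a stub object instead of a real browser.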
```bash
npm install
```
```bash
# Development mode (with hot reload)
npm run dev
```
Usage
The server exposes a single tool:
`transcribe`
Opens a live speech transcription interface.
Parameters: None
Example:
```json
{
  "name": "transcribe",
  "arguments": {}
}
```
How It Works
1. Click Start to begin listening
2. Speak into your microphone
3. Watch your speech appear as text in real time (interim text is streamed to model context via `ui/update-model-context`)
4. Click Send to send the transcript as a `ui/message` to the host (clears the model context)
5. Click Clear to reset the transcript

Architecture
```
transcript-server/
├── server.ts          # MCP server with transcribe tool
├── server-utils.ts    # HTTP transport utilities
├── mcp-app.html       # Transcript UI entry point
├── src/
│   ├── mcp-app.ts     # App logic, Web Speech API integration
│   ├── mcp-app.css    # Transcript UI styles
│   └── global.css     # Base styles
└── dist/              # Built output (single HTML file)
```
Notes
- Microphone Permission: Requires `allow="microphone"` on the sandbox iframe (configured via `permissions: { microphone: {} }` in the resource `_meta.ui`)

Future Enhancements
- Language selection dropdown
- Whisper-based offline transcription (see TRANSCRIPTION.md)
- Export transcript to file
- Timestamps toggle