MCP server for narrative audio generation - enabling LLMs to create immersive audio experiences
```bash
npm install mcp-mia-narrative
```

Model Context Protocol server for immersive audio narrative generation.
Enable any LLM to create audio companions for its responses, transforming terminal interactions into multimodal experiences where users can close their eyes and be guided through the conversation.
When an LLM uses this MCP server, it can:
- 🎭 Generate audio narrations with personality-rich voices
- 🌊 Create immersive summaries of conversation moments
- 💭 Provide contemplative audio checkpoints during long sessions
- 🔊 Transform text responses into intimate audio experiences
- 🎯 Allow users to engage with AI through both text and voice
This MCP server requires the mia-narrative CLI to be installed and configured:
```bash
cd cli/
npm install
npm run build
npm link
npm run setup   # Downloads voice models (~380MB)
```
Also requires:
- Node.js v18+
- FFmpeg for audio processing
- mpg123 for audio playback (or afplay on macOS)
```bash
cd mcp-mia-narrative/
npm install
npm run build
```
Add to your MCP settings (e.g., Claude Desktop config):
```json
{
  "mcpServers": {
    "mia-narrative": {
      "command": "node",
      "args": ["/absolute/path/to/mcp-mia-narrative/dist/index.js"]
    }
  }
}
```
Or use with any MCP client:
```bash
node dist/index.js
```
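For programmatic use, a minimal sketch is shown below. It assumes the official `@modelcontextprotocol/sdk` TypeScript client; the client name, server path, and tool arguments are placeholders, and exact call shapes may differ between SDK versions.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  // Spawn the server over stdio (same entry point as the config above; path is a placeholder).
  const transport = new StdioClientTransport({
    command: "node",
    args: ["/absolute/path/to/mcp-mia-narrative/dist/index.js"],
  });

  const client = new Client({ name: "example-client", version: "0.1.0" });
  await client.connect(transport);

  // Discover what the server exposes.
  const { tools } = await client.listTools();
  console.log(tools.map((t) => t.name));

  // Call the contextual-audio tool (arguments are illustrative).
  const result = await client.callTool({
    name: "generate_contextual_audio",
    arguments: {
      conversationContext: "We just mapped out the plan for the next sprint.",
      voiceId: "miette",
      tone: "contemplative",
      autoplay: true,
    },
  });
  console.log(result);

  await client.close();
}

main().catch(console.error);
```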
The server provides three tools and two prompts:
Parameters:
- text or file: Content to narrate
- voiceId: Which voice to use (default: mia)
- engine: piper, system, or elevenlabs (default: piper)
- speed: Speech rate 0.5-2.0 (default: 1.0)
- pitch: Pitch adjustment 0.5-2.0 (default: 1.0)
- reverb: Reverb effect 0-1.0 (default: 0.2)
- autoplay: Auto-play after generation (default: true)
Example:
```typescript
{
  text: "We've explored the concept of structural tension...",
  voiceId: "miette",
  speed: 0.9,
  reverb: 0.3,
  autoplay: true
}
```
Parameters (generate_contextual_audio):
- conversationContext: What just happened in the conversation
- voiceId: Voice for narration (default: miette)
- tone: intimate, professional, dramatic, or contemplative (default: intimate)
- autoplay: Auto-play (default: true)
Example:
```typescript
{
  conversationContext: "We just explored how MCPs enable multimodal AI interactions. The key insight was that audio companions transform terminal sessions into immersive experiences where users can close their eyes and absorb ideas differently.",
  voiceId: "zephyr",
  tone: "contemplative",
  autoplay: true
}
```
Parameters:
- filepath: Path to text file
- voiceId: Voice to use (default: mia)
- speed: Reading speed (default: 0.95)
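Example (the file path below is only a placeholder):

```typescript
{
  // Hypothetical path for illustration
  filepath: "./notes/session-summary.txt",
  voiceId: "mia",
  speed: 0.95
}
```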
Arguments:
- context: What was just discussed
- voice: Preferred voice
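Example (the wording is illustrative):

```typescript
{
  // Illustrative values, not output from a real session
  context: "We compared two ways to structure the MCP tools and settled on the simpler, flatter design.",
  voice: "miette"
}
```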
Usage Pattern:
1. LLM completes text response
2. LLM uses this prompt to craft audio companion text
3. LLM calls generate_contextual_audio with the crafted text
Arguments:
- journey_summary: Summary of conversation progress
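Example (illustrative summary text):

```typescript
{
  // Illustrative value, not output from a real session
  journey_summary: "Started with a rough idea for an audio companion, explored the tool surface, and ended with a working configuration."
}
```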
This MCP enables a new mode of human-AI interaction:
Text Channel (Primary)
- Detailed, scannable, reference-able
- Code, links, structured data
- Quick back-and-forth
Audio Channel (Companion)
- Immersive, emotional, experiential
- Synthesis and reflection
- Intimate connection
Users can:
- Read detailed responses when focused
- Listen to audio companions when eyes-closed ideating
- Experience both modalities based on their state and needs
An LLM with this MCP might work like this:
```
1. User asks about a complex topic
2. LLM provides detailed text response
3. LLM identifies key narrative thread
4. LLM calls generate_contextual_audio with:
   - Distilled essence of the discussion
   - Reference to user's journey
   - Warm, conversational synthesis
5. User hears audio render and play
6. User can close eyes and absorb the moment
```
- For Technical Content: Mia, Jeremy
- For Conversation: Miette, Atlas
- For Stories: Seraphine, Echo
- For Reflection: Zephyr, ResoNova
- For Drama: Seraphine, Echo
```bash
npm run dev     # Run with tsx
npm run build   # Compile TypeScript
npm run watch   # Watch mode
```
"mia-narrative: command not found"
- Ensure the CLI is built and linked: cd cli && npm run build && npm link
"Audio generated but could not autoplay"
- Install mpg123: brew install mpg123 (macOS) or apt-get install mpg123 (Linux)
"Voice models not found"
- Run setup: cd cli && npm run setup
- Immersive Learning: Audio summaries help visual learners absorb complex topics
- Eyes-Free Ideation: Users can close eyes during creative brainstorming
- Ambient Guidance: Audio companions during long coding or writing sessions
- Conversation Milestones: Reflective checkpoints in extended dialogues
- Accessibility: Alternative modality for consuming AI responses
MIT