MCP server for narrative audio generation - enabling LLMs to create immersive audio experiences
```bash
npm install mcp-mia-narrative
```

Model Context Protocol server for immersive audio narrative generation.
Enable any LLM to create audio companions for its responses, transforming terminal interactions into multimodal experiences where users can close their eyes and be guided through the conversation.
When an LLM uses this MCP server, it can:
- 🎭 Generate audio narrations with personality-rich voices
- 🌊 Create immersive summaries of conversation moments
- 💭 Provide contemplative audio checkpoints during long sessions
- 🔊 Transform text responses into intimate audio experiences
- 🎯 Allow users to engage with AI through both text and voice
This MCP server requires the mia-narrative CLI to be installed and configured:
```bash
cd cli/
npm install
npm run build
npm link
npm run setup   # Downloads voice models (~380MB)
```
Also requires:
- Node.js v18+
- FFmpeg for audio processing
- mpg123 for audio playback (or afplay on macOS)
```bash
cd mcp-mia-narrative/
npm install
npm run build
```
Add to your MCP settings (e.g., Claude Desktop config):
```json
{
  "mcpServers": {
    "mia-narrative": {
      "command": "node",
      "args": ["/absolute/path/to/mcp-mia-narrative/dist/index.js"]
    }
  }
}
```
Or use with any MCP client:
```bash
node dist/index.js
```
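For programmatic use, a minimal sketch is shown below. It assumes the official `@modelcontextprotocol/sdk` TypeScript client; the client name, server path, and tool arguments are placeholders, and exact call shapes may differ between SDK versions.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  // Spawn the server over stdio (same entry point as the config above; path is a placeholder).
  const transport = new StdioClientTransport({
    command: "node",
    args: ["/absolute/path/to/mcp-mia-narrative/dist/index.js"],
  });

  const client = new Client({ name: "example-client", version: "0.1.0" });
  await client.connect(transport);

  // Discover what the server exposes.
  const { tools } = await client.listTools();
  console.log(tools.map((t) => t.name));

  // Call the contextual-audio tool (arguments are illustrative).
  const result = await client.callTool({
    name: "generate_contextual_audio",
    arguments: {
      conversationContext: "We just mapped out the plan for the next sprint.",
      voiceId: "miette",
      tone: "contemplative",
      autoplay: true,
    },
  });
  console.log(result);

  await client.close();
}

main().catch(console.error);
```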
The server provides three tools and two prompts:
Parameters:
- text or file: Content to narrate
- voiceId: Which voice to use (default: mia)
- engine: piper, system, or elevenlabs (default: piper)
- speed: Speech rate 0.5-2.0 (default: 1.0)
- pitch: Pitch adjustment 0.5-2.0 (default: 1.0)
- reverb: Reverb effect 0-1.0 (default: 0.2)
- autoplay: Auto-play after generation (default: true)
Example:
```typescript
{
  text: "We've explored the concept of structural tension...",
  voiceId: "miette",
  speed: 0.9,
  reverb: 0.3,
  autoplay: true
}
```
Parameters (generate_contextual_audio):
- conversationContext: What just happened in the conversation
- voiceId: Voice for narration (default: miette)
- tone: intimate, professional, dramatic, or contemplative (default: intimate)
- autoplay: Auto-play (default: true)
Example:
```typescript
{
  conversationContext: "We just explored how MCPs enable multimodal AI interactions. The key insight was that audio companions transform terminal sessions into immersive experiences where users can close their eyes and absorb ideas differently.",
  voiceId: "zephyr",
  tone: "contemplative",
  autoplay: true
}
```
Parameters:
- filepath: Path to text file
- voiceId: Voice to use (default: mia)
- speed: Reading speed (default: 0.95)
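Example (the file path below is only a placeholder):

```typescript
{
  // Hypothetical path for illustration
  filepath: "./notes/session-summary.txt",
  voiceId: "mia",
  speed: 0.95
}
```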
Arguments:
- context: What was just discussed
- voice: Preferred voice
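Example (the wording is illustrative):

```typescript
{
  // Illustrative values, not output from a real session
  context: "We compared two ways to structure the MCP tools and settled on the simpler, flatter design.",
  voice: "miette"
}
```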
Usage Pattern:
1. LLM completes text response
2. LLM uses this prompt to craft audio companion text
3. LLM calls generate_contextual_audio with the crafted text
Arguments:
- journey_summary: Summary of conversation progress
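Example (illustrative summary text):

```typescript
{
  // Illustrative value, not output from a real session
  journey_summary: "Started with a rough idea for an audio companion, explored the tool surface, and ended with a working configuration."
}
```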
This MCP enables a new mode of human-AI interaction:
Text Channel (Primary)
- Detailed, scannable, reference-able
- Code, links, structured data
- Quick back-and-forth
Audio Channel (Companion)
- Immersive, emotional, experiential
- Synthesis and reflection
- Intimate connection
Users can:
- Read detailed responses when focused
- Listen to audio companions when eyes-closed ideating
- Experience both modalities based on their state and needs
An LLM with this MCP might work like this:
```
1. User asks about a complex topic
2. LLM provides detailed text response
3. LLM identifies key narrative thread
4. LLM calls generate_contextual_audio with:
   - Distilled essence of the discussion
   - Reference to user's journey
   - Warm, conversational synthesis
5. User hears audio render and play
6. User can close eyes and absorb the moment
```
- For Technical Content: Mia, Jeremy
- For Conversation: Miette, Atlas
- For Stories: Seraphine, Echo
- For Reflection: Zephyr, ResoNova
- For Drama: Seraphine, Echo
```bash
npm run dev     # Run with tsx
npm run build   # Compile TypeScript
npm run watch   # Watch mode
```
"mia-narrative: command not found"
- Ensure the CLI is built and linked: cd cli && npm run build && npm link
"Audio generated but could not autoplay"
- Install mpg123: brew install mpg123 (macOS) or apt-get install mpg123 (Linux)
"Voice models not found"
- Run setup: cd cli && npm run setup
- Immersive Learning: Audio summaries help visual learners absorb complex topics
- Eyes-Free Ideation: Users can close eyes during creative brainstorming
- Ambient Guidance: Audio companions during long coding or writing sessions
- Conversation Milestones: Reflective checkpoints in extended dialogues
- Accessibility: Alternative modality for consuming AI responses
MIT