StoryCanvas

Transform books and text into multimedia content using Google Gemini AI.

StoryCanvas is an interactive CLI tool that converts text files (TXT, PDF, EPUB) into illustrated videos with AI-generated images, narration, and background music. It leverages the Google Gemini ecosystem including Imagen for image generation, Gemini TTS for narration, and Veo for AI video generation.

Features

- Multiple Input Formats: Support for TXT, PDF, EPUB, and Markdown files
- Project Gutenberg Integration: Search and download classic books directly
- AI Image Generation: Create character and scene illustrations using Imagen 4 or Nano Banana
- TTS Narration: Generate spoken narration with 30+ voice options
- Video Creation: Choose between image slideshow or Veo AI video generation
- Background Music: Mix in royalty-free background music
- YouTube Metadata: Auto-generate titles, descriptions, and tags

Installation

```bash
npm install -g storycanvas
```

Or run directly with npx:

```bash
npx storycanvas
```

Requirements

- Node.js 22 or later
- Google Gemini API key

Quick Start

1. Run the setup wizard:

   ```bash
   storycanvas onboard
   ```

   This will guide you through:
   - API key configuration
   - Model selection
   - Output directory setup

2. Create multimedia from a book:

   ```bash
   storycanvas create --file my-book.epub
   ```

3. Or download from Project Gutenberg:

   ```bash
   storycanvas create --gutenberg 74  # Tom Sawyer
   ```

Commands

`storycanvas onboard`

Interactive setup wizard for first-time configuration. Sets up your API key, preferred models, and output directories.

`storycanvas create`

Create multimedia content from text.

```bash
# Interactive mode
storycanvas create

# From local file
storycanvas create --file book.epub
storycanvas create --file article.pdf
storycanvas create --file story.txt

# From Project Gutenberg
storycanvas create --gutenberg 74

# Specify stages
storycanvas create --file book.txt --stages illustrations,video,music

# Use Veo AI video instead of slideshow
storycanvas create --file book.txt --mode veo
```

Options:

- `--f, --file`: Path to input file
- `--g, --gutenberg`: Project Gutenberg book ID
- `--s, --stages`: Comma-separated stages (illustrations, narration, video, music, metadata)
- `--m, --mode`: Video mode (slideshow or veo)
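
The `--stages` value is a comma-separated list, so it has to be split and checked against the known stage names at some point. The following is a minimal sketch of that step, not StoryCanvas's actual implementation: `KNOWN_STAGES` and `parseStages` are hypothetical names, and trimming whitespace, ignoring case, and dropping duplicates are assumptions about reasonable behavior.

```typescript
// Stage names as listed in the Options section above.
const KNOWN_STAGES = ["illustrations", "narration", "video", "music", "metadata"] as const;
type Stage = (typeof KNOWN_STAGES)[number];

// Hypothetical helper: turn "illustrations,video,music" into a validated list.
function parseStages(value: string): Stage[] {
  const stages: Stage[] = [];
  for (const raw of value.split(",")) {
    const name = raw.trim().toLowerCase();
    if (name === "") continue; // tolerate stray commas (assumption)
    const match = KNOWN_STAGES.find((s) => s === name);
    if (match === undefined) {
      throw new Error(`Unknown stage: ${name}`);
    }
    if (!stages.includes(match)) stages.push(match); // drop duplicates (assumption)
  }
  return stages;
}
```

Calling `parseStages("illustrations,video,music")` yields the three named stages in the order given, while an unrecognized name throws instead of being silently skipped.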

`storycanvas books`

Browse and download books from Project Gutenberg.

```bash
# Interactive mode
storycanvas books

# Search for books
storycanvas books --search "alice wonderland"

# Download by ID
storycanvas books --download 11

# List downloaded books
storycanvas books --list
```

`storycanvas doctor`

Run diagnostics to check your setup.

```bash
storycanvas doctor
```

Checks:

- Node.js version
- FFmpeg availability
- API key validity
- Configuration status
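
To illustrate the first of these checks, here is a small sketch of a Node.js version test against the stated minimum of 22. It shows the idea only and is not the code `storycanvas doctor` runs; `meetsMinimumNode` is a hypothetical helper name.

```typescript
// Hypothetical helper: does a version string such as "v22.3.0" or "22.3.0"
// satisfy the minimum major version from the Requirements section?
function meetsMinimumNode(version: string, minMajor: number = 22): boolean {
  const majorText = version.replace(/^v/, "").split(".")[0] ?? "";
  const major = Number.parseInt(majorText, 10);
  // parseInt returns NaN for unparseable input, which fails the integer test.
  return Number.isInteger(major) && major >= minMajor;
}
```

In a real check, the argument would typically be `process.versions.node`, which Node.js exposes as a string without the leading `v`.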

`storycanvas config`

View and manage configuration.

```bash
# Show current config
storycanvas config --show

# Edit interactively
storycanvas config --edit

# Reset to defaults
storycanvas config --reset

# Show config file path
storycanvas config --path
```

Configuration

Configuration is stored in `~/.storycanvasrc`. You can edit it manually or use `storycanvas config --edit`.

```json
{
  "apiKey": "your-gemini-api-key",
  "models": {
    "text": "gemini-2.5-flash",
    "image": "imagen-4.0-fast-generate-001",
    "tts": "gemini-2.5-flash-preview-tts",
    "video": "veo-3.1-fast"
  },
  "image": {
    "maxCharacterImages": 30,
    "maxSceneImages": 50,
    "aspectRatio": "9:16",
    "personGeneration": "allow_adult"
  },
  "video": {
    "mode": "slideshow",
    "fps": 0.5,
    "resolution": "1080p"
  },
  "tts": {
    "enabled": true,
    "voice": "Kore"
  },
  "audio": {
    "musicVolume": 0.3,
    "narrationVolume": 1.0
  },
  "directories": {
    "output": "./storycanvas-output",
    "music": "./music",
    "books": "./books"
  }
}
```
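
If you edit the file by hand, a quick sanity check on the numeric and enum fields can catch typos early. The sketch below covers only the `video` and `audio` sections and is not StoryCanvas's own validation: `MediaSettings`, `findConfigProblems`, and `secondsPerImage` are hypothetical names, the 0 to 1 volume range is an assumption based on the sample values, and reading `fps` as slideshow frames per second is likewise an assumption.

```typescript
// Hypothetical subset of the config shape shown above.
interface MediaSettings {
  video: { mode: string; fps: number };
  audio: { musicVolume: number; narrationVolume: number };
}

// Return a list of human-readable problems; an empty list means no issues found.
function findConfigProblems(cfg: MediaSettings): string[] {
  const problems: string[] = [];
  if (cfg.video.mode !== "slideshow" && cfg.video.mode !== "veo") {
    problems.push(`video.mode must be "slideshow" or "veo", got "${cfg.video.mode}"`);
  }
  if (!(cfg.video.fps > 0)) {
    problems.push("video.fps must be a positive number");
  }
  // Assumed range: volumes are fractions between 0 (silent) and 1 (full).
  if (!(cfg.audio.musicVolume >= 0 && cfg.audio.musicVolume <= 1)) {
    problems.push("audio.musicVolume must be between 0 and 1");
  }
  if (!(cfg.audio.narrationVolume >= 0 && cfg.audio.narrationVolume <= 1)) {
    problems.push("audio.narrationVolume must be between 0 and 1");
  }
  return problems;
}

// Under the frames-per-second reading, each image stays on screen for 1 / fps seconds.
function secondsPerImage(fps: number): number {
  return 1 / fps;
}

const sampleSettings: MediaSettings = {
  video: { mode: "slideshow", fps: 0.5 },
  audio: { musicVolume: 0.3, narrationVolume: 1.0 },
};
```

With the sample values, `findConfigProblems(sampleSettings)` returns an empty list, and `secondsPerImage(0.5)` gives 2, so a slideshow at `fps: 0.5` would hold each image for two seconds under that reading.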

Available Models

Text

- `gemini-2.5-flash` (default, fast)
- `gemini-2.5-pro` (enhanced reasoning)

Image

- `imagen-4.0-fast-generate-001` (default, fast)
- `imagen-4.0-ultra-generate-001` (highest quality)
- `imagen-4.0-generate-001` (standard)
- `gemini-2.5-flash-image` (Nano Banana, native Gemini)
- `gemini-3-pro-image-preview` (Nano Banana Pro)

TTS

- `gemini-2.5-flash-preview-tts` (default)
- `gemini-2.5-pro-preview-tts`

Video

- `veo-3.1-fast` (default, faster)
- `veo-3.1` (higher quality)

Pipeline Stages

1. Input Processing: Extract text from TXT/PDF/EPUB or download from Gutenberg
2. Illustration Generation: Create character and scene images with AI
3. Narration: Generate TTS audio from the text
4. Video Creation: Combine images into slideshow or generate with Veo
5. Background Music: Mix in audio tracks
6. Metadata Generation: Create YouTube-ready title, description, and tags
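
Because later stages build on earlier ones (the video needs the images, the music mix needs the video), a subset chosen with `--stages` presumably still runs in the order listed above. A minimal sketch of that ordering step follows; `PIPELINE_ORDER` and `orderStages` are hypothetical names, and treating input processing as an implicit first step that always runs is an assumption.

```typescript
type StageName = "illustrations" | "narration" | "video" | "music" | "metadata";

// Stages 2 to 6 from the list above, in pipeline order.
const PIPELINE_ORDER: readonly StageName[] = [
  "illustrations",
  "narration",
  "video",
  "music",
  "metadata",
];

// Hypothetical helper: reorder a user-selected subset into pipeline order.
function orderStages(selected: readonly StageName[]): StageName[] {
  const wanted = new Set<StageName>(selected);
  return PIPELINE_ORDER.filter((stage) => wanted.has(stage));
}
```

For example, `orderStages(["music", "illustrations", "video"])` returns the stages as illustrations, video, music, regardless of the order in which they were requested.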

Background Music

Place your royalty-free music files in the `./music` directory (or configure a different path). Supported formats: MP3, M4A, WAV, AAC, OGG.
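
As an illustration of the format list, the sketch below filters file names down to the supported audio types. It is not taken from StoryCanvas: `MUSIC_EXTENSIONS` and `isSupportedMusicFile` are hypothetical names, and matching on the file extension case-insensitively is an assumption.

```typescript
// Extensions corresponding to the supported formats listed above.
const MUSIC_EXTENSIONS: readonly string[] = [".mp3", ".m4a", ".wav", ".aac", ".ogg"];

// Hypothetical helper: check a file name against the supported extensions.
function isSupportedMusicFile(fileName: string): boolean {
  const lower = fileName.toLowerCase();
  return MUSIC_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

// Example: keep only the usable tracks from a directory listing.
const exampleListing = ["calm-piano.mp3", "README.txt", "Forest.WAV", "cover.png"];
const usableTracks = exampleListing.filter(isSupportedMusicFile);
```

Here `usableTracks` ends up containing `calm-piano.mp3` and `Forest.WAV`; the text and image files are skipped.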

License

MIT

Credits

Built with:
- Google Gemini API
- @clack/prompts for terminal UI
- fluent-ffmpeg for video processing
