Enterprise-grade AI integration bridge connecting Claude Code, Gemini CLI, and Google AI Studio with intelligent routing and advanced multimodal processing capabilities
npm install claude-gemini-multimodal-bridge
- Optimally integrates Claude's reasoning power, Gemini CLI's search capabilities, and AI Studio's generation power. Ahead of the 2026 AI trend: "Specialized AI Collaboration".
- Complete with a single npm install. Tedious setup is automated.
- Follows the Anthropic Model Context Protocol. Enterprise-grade reliability with a 95% self-healing rate.
- Uses the gemini-2.5-flash and gemini-3-flash models.
```mermaid
flowchart TD
    A[Claude Code] --> B[CGMB]
    B --> C[Gemini CLI]
    B --> D[Claude Code]
    B --> E[AI Studio]
```
| Layer | Specialization | Timeout |
|:-----:|:---------------|:-------:|
| Gemini CLI | Web search, real-time information | 30s |
| Claude Code | Complex reasoning, code analysis | 300s |
| AI Studio | Image generation, audio synthesis, OCR | 120s |
---
Quick Start
Prerequisites:
- Node.js ≥ 22.0.0
- Claude Code CLI installed
- Gemini CLI (auto-installed)
Install the package globally:
```bash
npm install -g claude-gemini-multimodal-bridge
```
> The postinstall script automatically:
> - Installs Gemini CLI
> - Sets up Claude Code MCP integration
> - Creates .env template
> - Verifies system requirements
Create a `.env` file in your working directory:
```bash
AI_STUDIO_API_KEY=your_api_key_here
```
Get API key: https://aistudio.google.com/app/apikey
Launch Gemini CLI:
```bash
gemini
```
Then ask Claude Code:
```
I installed CGMB via NPM. Please check my current environment for the cgmb command and help me use it.
```
---
Usage Examples
CGMB integrates seamlessly with Claude Code. Just use the "CGMB" keyword:
```bash
# Image generation
"CGMB generate an image of a futuristic city"

# Document analysis (use absolute paths)
"CGMB analyze the document at /full/path/to/report.pdf"

# URL analysis
"CGMB analyze https://example.com/document.pdf"

# Web search
"CGMB search for the latest AI news"

# Audio generation
"CGMB create audio saying 'Welcome to our podcast'"

# OCR-enabled PDF analysis
"CGMB analyze this scanned PDF document with OCR"
```
1. Include "CGMB" in your Claude Code request
2. CGMB automatically routes to the optimal AI layer:
   - Gemini CLI: Web search, latest information
   - AI Studio: Images, audio, file processing
   - Claude Code: Complex reasoning, code analysis
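
The routing step above can be pictured with a small keyword-based sketch. This is only an illustration, not CGMB's actual router: `routeRequest`, the `Layer` names, and both hint lists are hypothetical, and the real implementation may use entirely different signals. The timeouts are the ones listed in the layer table.

```typescript
// Hypothetical sketch of keyword-based routing; not CGMB's real router.
type Layer = "gemini-cli" | "ai-studio" | "claude-code";

interface Route {
  layer: Layer;
  timeoutMs: number;
}

// Timeouts from the layer table in this README (30s, 300s, 120s).
const TIMEOUT_MS: Record<Layer, number> = {
  "gemini-cli": 30_000,
  "claude-code": 300_000,
  "ai-studio": 120_000,
};

// Illustrative hint lists (assumptions made for this sketch).
const AI_STUDIO_HINTS = ["image", "audio", "pdf", "ocr", "document"];
const GEMINI_CLI_HINTS = ["search", "latest", "news"];

function routeRequest(request: string): Route | null {
  const text = request.toLowerCase();
  // Requests without the trigger keyword are not routed at all.
  if (!text.includes("cgmb")) return null;

  // Default layer: complex reasoning and code analysis.
  let layer: Layer = "claude-code";
  if (AI_STUDIO_HINTS.some((hint) => text.includes(hint))) {
    layer = "ai-studio";
  } else if (GEMINI_CLI_HINTS.some((hint) => text.includes(hint))) {
    layer = "gemini-cli";
  }
  return { layer, timeoutMs: TIMEOUT_MS[layer] };
}
```

Under these assumptions, `routeRequest("CGMB search for the latest AI news")` selects the Gemini CLI layer with a 30-second timeout, and a request without the keyword returns `null`.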
---
Models Used
| Purpose | Model ID | Layer |
|:-------:|:---------|:-----:|
| Web Search | gemini-3-flash | Gemini CLI |
| Image Generation | gemini-2.5-flash-image | AI Studio |
| Audio Generation | gemini-2.5-flash-preview-tts | AI Studio |
| Document Processing | gemini-2.5-flash | AI Studio |
| OCR/Text Extraction | gemini-2.5-flash | AI Studio |
| General Multimodal | gemini-2.0-flash-exp | AI Studio |
---
Performance
- Authentication overhead reduction
- Search cache hit rate
- Automatic error recovery rate
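
To make the search cache hit rate concrete, here is a minimal sketch of a time-to-live cache that tracks its own hit rate. It assumes that `CACHE_TTL` is expressed in seconds; `TtlCache` and its methods are hypothetical names for illustration and do not describe CGMB's internal cache.

```typescript
// Hypothetical TTL cache with hit-rate tracking; not CGMB's internal cache.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;
  private hits = 0;
  private lookups = 0;

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlSeconds: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000;
    this.now = now;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    this.lookups += 1;
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired: drop the entry and count the lookup as a miss.
      this.entries.delete(key);
      return undefined;
    }
    this.hits += 1;
    return entry.value;
  }

  // Fraction of lookups served from the cache; 0 before any lookup.
  hitRate(): number {
    return this.lookups === 0 ? 0 : this.hits / this.lookups;
  }
}
```

A cache created with `new TtlCache<string>(3600)` would keep each search result for one hour, matching the sample `CACHE_TTL=3600` value under the seconds assumption.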
---
PDF Processing & OCR
- ✅ Supports both text-based and scanned PDFs
- ✅ Automatic OCR detection
- ✅ Native OCR processing via Gemini File API
- ✅ Multi-language support
```
PDF Input → Upload → OCR Processing → Content Analysis → Output Results
```
Supported PDF types:
- Text-based PDFs
- Scanned PDFs (OCR processing)
- Image-based PDFs (OCR conversion)
- Mixed content
- Complex layouts (tables, charts, formatted content)
---
File Organization
Generated content is automatically organized:
```
output/
├── images/      # Generated images
├── audio/       # Generated audio files
└── documents/   # Processed documents
```
Access via Claude Code:
- `get_generated_file`: Retrieve specific files
- `list_generated_files`: List all generated files
- `get_file_info`: Get file metadata
---
Configuration
```bash
# Required
AI_STUDIO_API_KEY=your_api_key_here

# Optional
GEMINI_API_KEY=your_api_key_here
ENABLE_CACHING=true
CACHE_TTL=3600
LOG_LEVEL=info
```
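
As a rough illustration of how these variables could be read and validated, the sketch below turns an environment map into a typed config object. `loadConfig` and `CgmbConfig` are hypothetical names, the fallback values simply mirror the sample values above (the real defaults are not documented here), and `CACHE_TTL` is assumed to be in seconds.

```typescript
// Hypothetical config loader; names and fallbacks are assumptions.
interface CgmbConfig {
  aiStudioApiKey: string;
  geminiApiKey?: string;
  enableCaching: boolean;
  cacheTtlSeconds: number;
  logLevel: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): CgmbConfig {
  const aiStudioApiKey = env.AI_STUDIO_API_KEY;
  if (!aiStudioApiKey) {
    // The only variable the README marks as required.
    throw new Error("AI_STUDIO_API_KEY is required");
  }

  // Fallbacks mirror the sample values, not documented defaults.
  const cacheTtlSeconds = Number(env.CACHE_TTL ?? "3600");
  if (!Number.isInteger(cacheTtlSeconds) || cacheTtlSeconds < 0) {
    throw new Error("CACHE_TTL must be a non-negative integer");
  }

  return {
    aiStudioApiKey,
    geminiApiKey: env.GEMINI_API_KEY,
    enableCaching: (env.ENABLE_CACHING ?? "true").toLowerCase() === "true",
    cacheTtlSeconds,
    logLevel: env.LOG_LEVEL ?? "info",
  };
}
```

In a Node.js process this could be called as `loadConfig(process.env)` after the `.env` file has been loaded, so that a missing key fails fast at startup.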
CGMB automatically configures Claude Code MCP integration:
- Config path: `~/.claude-code/mcp_servers.json`
- Direct Node.js execution
- Safe merge without overwriting existing servers
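
The "safe merge" idea can be sketched as a merge in which existing entries always win. The shape of a server entry (`command`, `args`) and the `mergeServers` helper are assumptions for illustration; the actual layout of `mcp_servers.json` and CGMB's merge logic may differ.

```typescript
// Hypothetical sketch of a non-destructive merge of MCP server entries.
interface McpServerEntry {
  command: string;
  args: string[];
}

type McpServers = Record<string, McpServerEntry>;

function mergeServers(existing: McpServers, additions: McpServers): McpServers {
  // Spread order matters: later keys win, so anything the user already
  // configured is preserved and only missing servers are added.
  return { ...additions, ...existing };
}
```

With this ordering, re-running the setup never replaces an entry that is already present under the same name; it only fills in servers that are missing.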
---
Windows Environment
CGMB fully supports Windows in v1.1.0:
| Feature | Status |
|---------|:------:|
| CLI | ✅ All commands work |
| MCP Integration | ✅ MCP tool calls work correctly |
| Path Resolution | ✅ Automatically handles C:\path\to\file format |
| Gemini CLI | ✅ Full compatibility with Windows version |
```powershell
# Absolute paths recommended
cgmb analyze "C:\Users\name\Documents\report.pdf"

# Set environment variable (PowerShell)
$env:AI_STUDIO_API_KEY = "your_api_key_here"

# Set environment variable (Command Prompt)
set AI_STUDIO_API_KEY=your_api_key_here
```
---
Linux / WSL Environment
CGMB works fully on Linux and WSL:
| Feature | Status |
|---------|:------:|
| CLI | ✅ All commands work |
| MCP Integration | ✅ MCP tool calls work correctly |
| Path Resolution | ✅ Supports /mnt/ WSL paths and Unix paths |
| Gemini CLI | ✅ Full compatibility with Linux version |
```bash
# Use Unix path format
cgmb analyze /home/user/documents/report.pdf

# WSL environment example
cgmb analyze /mnt/c/Users/name/Documents/report.pdf

# Set environment variables
export AI_STUDIO_API_KEY="your_api_key_here"
export CGMB_CHAT_MODEL="gemini-2.5-flash"
```
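
The relationship between the Windows path and the WSL path in the examples can be sketched as a small conversion function. `toWslPath` is a hypothetical helper written for illustration; it is not CGMB's actual path resolver, which handles these formats internally.

```typescript
// Hypothetical helper: map a Windows drive path to its /mnt/<drive> WSL form.
function toWslPath(input: string): string {
  // Matches "C:\..." or "C:/..." and captures the drive letter and the rest.
  const match = /^([A-Za-z]):[\\/](.*)$/.exec(input);
  if (match === null) {
    // Not a drive-letter path (for example already a Unix path): leave as-is.
    return input;
  }
  const drive = match[1].toLowerCase();
  const rest = match[2].replace(/\\/g, "/");
  return `/mnt/${drive}/${rest}`;
}
```

For example, `toWslPath("C:\\Users\\name\\Documents\\report.pdf")` yields `/mnt/c/Users/name/Documents/report.pdf`, the same path used in the WSL example above, while Unix paths pass through unchanged.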
---
Troubleshooting
```bash
export CGMB_DEBUG=true
export LOG_LEVEL=debug
cgmb serve --debug
```
If OCR results are inaccurate:
- Use high-resolution scanned PDFs (300+ DPI)
- Ensure clear, high-contrast text
- Avoid skewed or rotated documents

If large documents time out:
- Split large PDFs before processing (limit: 50MB, 1,000 pages)
- Extend timeout: `export AI_STUDIO_TIMEOUT=180000`
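
A pre-flight check against the documented limits can catch oversized PDFs before they are uploaded. This is a sketch under assumptions: `pdfLimitProblems` is a hypothetical helper, and "50MB" is interpreted here as 50 × 1024 × 1024 bytes, which the README does not specify.

```typescript
// Hypothetical pre-flight check against the documented PDF limits.
const MAX_PDF_BYTES = 50 * 1024 * 1024; // "50MB", read as mebibytes (assumption)
const MAX_PDF_PAGES = 1000;

// Returns a list of human-readable problems; an empty list means the
// document is within both limits and does not need to be split.
function pdfLimitProblems(sizeBytes: number, pageCount: number): string[] {
  const problems: string[] = [];
  if (sizeBytes > MAX_PDF_BYTES) {
    problems.push(`file is ${sizeBytes} bytes, limit is ${MAX_PDF_BYTES}`);
  }
  if (pageCount > MAX_PDF_PAGES) {
    problems.push(`document has ${pageCount} pages, limit is ${MAX_PDF_PAGES}`);
  }
  return problems;
}
```

A caller could split the PDF whenever the returned list is non-empty, rather than waiting for the request to time out.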
---
API Costs
CGMB uses pay-per-use APIs:
- Google AI Studio API Pricing Details
---
Project Structure
```
src/
├── core/          # Main MCP server and layer management
├── layers/        # AI layer implementations
├── auth/          # Authentication system
├── tools/         # Processing tools
├── workflows/     # Workflow implementations
├── utils/         # Utilities and helpers
└── mcp-servers/   # Custom MCP servers
```