OpenAI-compatible proxy for Google Vertex AI (Claude + Gemini) with automatic failover, retries, and prompt caching

A proxy server that lets you use Google Vertex AI models (Claude, Gemini, Imagen) with OpenClaw, Clawdbot, and other OpenAI-compatible tools.
```
┌─────────────┐     ┌──────────────────┐     ┌─────────────────┐
│  OpenClaw   │────▶│   Vertex Proxy   │────▶│  Vertex AI API  │
│  Clawdbot   │◀────│  (This Server)   │◀────│  Claude/Gemini  │
└─────────────┘     └──────────────────┘     └─────────────────┘
```

- 🤖 Multi-model support: Claude (Opus, Sonnet, Haiku), Gemini, Imagen
- 🔄 Format conversion: Translates between OpenAI ↔ Anthropic API formats
- 📡 Streaming: Full SSE streaming support
- 🏷️ Model aliases: Create friendly names like `my-assistant` → `claude-opus-4-5`
- 🔀 Fallback chains: Automatic failover when models are unavailable
- 🌍 Dynamic region fallback: Automatically tries us-east5 → us-central1 → europe-west1
- 📏 Context management: Auto-truncate messages to fit model limits
- 🔐 Google ADC: Uses Application Default Credentials (no API keys needed)
- 🔧 Daemon mode: Run as a background service with `start`/`stop`/`restart`
- 📝 Logging: Built-in log management with the `logs` command
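
To make the two fallback features above concrete, here is a minimal Python sketch of how a request might walk a model fallback chain and, for each model, a list of regions. It is an illustration only, not the proxy's actual code: `call_with_fallback`, `send`, and `ModelUnavailable` are hypothetical names, and the region order is simply the one given in the feature list.

```python
from typing import Callable, List, Sequence

# Region order taken from the feature list above.
REGIONS = ["us-east5", "us-central1", "europe-west1"]


class ModelUnavailable(Exception):
    """Hypothetical error: this model/region pair cannot serve the request."""


def call_with_fallback(
    model: str,
    fallbacks: Sequence[str],
    send: Callable[[str, str], str],
    regions: Sequence[str] = REGIONS,
) -> str:
    """Try each model in order; for each model, try each region in order."""
    errors: List[str] = []
    for candidate in [model, *fallbacks]:
        for region in regions:
            try:
                return send(candidate, region)
            except ModelUnavailable as exc:
                errors.append(f"{candidate} in {region}: {exc}")
    raise ModelUnavailable("all models and regions failed: " + "; ".join(errors))
```

Here `send` stands in for whatever actually performs the upstream call. Keeping it as a parameter makes the ordering easy to test without network access.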
```bash
npm install -g vertex-ai-proxy
```

```bash
# Authenticate
gcloud auth application-default login
```

```bash
# Start the proxy
vertex-ai-proxy start --project YOUR_PROJECT_ID

# Check status
vertex-ai-proxy status
```

## CLI Commands
```bash
# Start as background daemon
vertex-ai-proxy start
vertex-ai-proxy start --port 8001 --project your-project

# Stop the daemon
vertex-ai-proxy stop

# Restart
vertex-ai-proxy restart

# Check status (running, uptime, request count, health)
vertex-ai-proxy status

# View logs
vertex-ai-proxy logs           # Last 50 lines
vertex-ai-proxy logs -n 100    # Last 100 lines
vertex-ai-proxy logs -f        # Follow (tail -f style)
```

```bash
# List all available models
vertex-ai-proxy models

# Show detailed model info
vertex-ai-proxy models info claude-opus-4-5@20251101

# Show all details including pricing
vertex-ai-proxy models list --all

# Check which models are enabled in your Vertex AI project
vertex-ai-proxy models fetch

# Enable a model in your config
vertex-ai-proxy models enable claude-opus-4-5@20251101

# Enable with an alias
vertex-ai-proxy models enable claude-opus-4-5@20251101 --alias opus

# Disable a model
vertex-ai-proxy models disable gemini-2.5-flash
```

```bash
# Show current configuration
vertex-ai-proxy config

# Interactive configuration setup
vertex-ai-proxy config set

# Set default model
vertex-ai-proxy config set-default claude-sonnet-4-5@20250514

# Add a model alias
vertex-ai-proxy config add-alias fast claude-haiku-4-5@20251001

# Remove an alias
vertex-ai-proxy config remove-alias fast

# Set fallback chain
vertex-ai-proxy config set-fallback claude-opus-4-5@20251101 claude-sonnet-4-5@20250514 gemini-2.5-pro

# Export configuration for OpenClaw
vertex-ai-proxy config export
vertex-ai-proxy config export -o openclaw-snippet.json
```

```bash
# Check Google Cloud setup (auth, ADC, project)
vertex-ai-proxy check

# Configure OpenClaw integration
vertex-ai-proxy setup-openclaw

# Install as systemd service
vertex-ai-proxy install-service --user   # User service (no sudo)
vertex-ai-proxy install-service          # System service (requires sudo)
```

## Prerequisites
- Google Cloud CLI (`gcloud`) installed
- GCP Project with Vertex AI enabled
- Claude Access: Enable in Model Garden (search "Claude" → click Enable)
## Configuration

```bash
# Required
export GOOGLE_CLOUD_PROJECT="your-project-id"

# Optional (with defaults)
export VERTEX_PROXY_PORT="8001"
export VERTEX_PROXY_REGION="us-east5"            # For Claude
export VERTEX_PROXY_GOOGLE_REGION="us-central1"  # For Gemini/Imagen
```
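
If you launch the proxy from a wrapper script, it can help to validate these variables first. The sketch below shows one way to read them in Python with the defaults listed above. `load_settings` and the keys of the returned dictionary are hypothetical names chosen for illustration, not part of the proxy.

```python
import os
from typing import Mapping

# Defaults copied from the environment variable list above.
DEFAULTS = {
    "VERTEX_PROXY_PORT": "8001",
    "VERTEX_PROXY_REGION": "us-east5",
    "VERTEX_PROXY_GOOGLE_REGION": "us-central1",
}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return proxy settings, failing early if the required project ID is missing."""
    project = env.get("GOOGLE_CLOUD_PROJECT")
    if not project:
        raise ValueError("GOOGLE_CLOUD_PROJECT is required")

    def optional(name: str) -> str:
        # Fall back to the documented default when the variable is unset.
        return env.get(name, DEFAULTS[name])

    return {
        "project_id": project,
        "port": int(optional("VERTEX_PROXY_PORT")),
        "claude_region": optional("VERTEX_PROXY_REGION"),
        "google_region": optional("VERTEX_PROXY_GOOGLE_REGION"),
    }
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.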
Create `~/.vertex-proxy/config.yaml`:

```yaml
# Google Cloud Settings
project_id: "your-project-id"
default_region: "us-east5"
google_region: "us-central1"

# Model Aliases (optional)
model_aliases:
  my-best: "claude-opus-4-5@20251101"
  my-fast: "claude-haiku-4-5@20251001"
  my-cheap: "gemini-2.5-flash-lite"
  # OpenAI compatibility
  gpt-4: "claude-opus-4-5@20251101"
  gpt-4o: "claude-sonnet-4-5@20250514"
  gpt-4o-mini: "claude-haiku-4-5@20251001"

# Fallback Chains (optional)
fallback_chains:
  claude-opus-4-5@20251101:
    - "claude-sonnet-4-5@20250514"
    - "gemini-2.5-pro"

# Context Management
auto_truncate: true
reserve_output_tokens: 4096
```
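
The `auto_truncate` and `reserve_output_tokens` settings suggest a budget of roughly "context window minus reserved output tokens" for the input. The following Python sketch shows one plausible way such truncation could work. It is not the proxy's implementation: the four-characters-per-token estimate and the rule of always keeping system messages are assumptions made for this example, and `truncate_messages` is a hypothetical name.

```python
from typing import Dict, List

Message = Dict[str, str]


def estimate_tokens(text: str) -> int:
    """Very rough estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def truncate_messages(
    messages: List[Message],
    context_window: int,
    reserve_output_tokens: int = 4096,
) -> List[Message]:
    """Drop the oldest non-system messages until the input fits the budget."""
    budget = context_window - reserve_output_tokens
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    used = sum(estimate_tokens(m["content"]) for m in system)
    kept: List[Message] = []
    # Walk from newest to oldest so the most recent turns survive.
    for msg in reversed(rest):
        cost = estimate_tokens(msg["content"])
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    return system + list(reversed(kept))
```

A real implementation would use the model's tokenizer rather than a character count, but the budgeting logic would look similar.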
The proxy stores runtime data in `~/.vertex_proxy/`:

- `proxy.log` - Request/error logs
- `proxy.pid` - Daemon PID file
- `stats.json` - Runtime statistics (uptime, request count)

## Clawdbot Integration

Clawdbot normally uses Anthropic's API directly, but you can route it through the Vertex AI Proxy by setting up a "fake" auth profile. This lets you use your Google Cloud credits and take advantage of Vertex AI's infrastructure.
#### Step 1: Start the Proxy
```bash
# Start the proxy daemon
vertex-ai-proxy start --project YOUR_GCP_PROJECT

# Verify it's running
vertex-ai-proxy status
```

#### Step 2: Configure Clawdbot
Add to your Clawdbot config (`~/.clawdbot/clawdbot.json` or equivalent):

```json
{
  "models": {
    "mode": "merge",
    "providers": {
      "vertex": {
        "baseUrl": "http://localhost:8001/v1",
        "apiKey": "vertex-proxy-fake-key",
        "api": "anthropic-messages",
        "models": [
          {
            "id": "claude-opus-4-5@20251101",
            "name": "Claude Opus 4.5 (Vertex)",
            "input": ["text", "image"],
            "contextWindow": 200000,
            "maxTokens": 8192
          },
          {
            "id": "claude-sonnet-4-5@20250514",
            "name": "Claude Sonnet 4.5 (Vertex)",
            "input": ["text", "image"],
            "contextWindow": 200000,
            "maxTokens": 8192
          },
          {
            "id": "claude-haiku-4-5@20251001",
            "name": "Claude Haiku 4.5 (Vertex)",
            "input": ["text", "image"],
            "contextWindow": 200000,
            "maxTokens": 8192
          }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "vertex/claude-sonnet-4-5@20250514"
      }
    }
  }
}
```

#### Step 3: Using Model Aliases
You can use the built-in aliases for convenience:
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "vertex/sonnet"
      }
    },
    "my-agent": {
      "model": {
        "primary": "vertex/opus"
      }
    }
  }
}
```

The proxy automatically maps:

- `opus` → `claude-opus-4-5@20251101`
- `sonnet` → `claude-sonnet-4-5@20250514`
- `haiku` → `claude-haiku-4-5@20251001`
- `gpt-4` → `claude-opus-4-5@20251101`
- `gpt-4o` → `claude-sonnet-4-5@20250514`

#### Why Use Vertex AI Proxy with Clawdbot?
1. Cost management: Use Google Cloud credits and billing
2. Enterprise features: VPC Service Controls, audit logging
3. Region control: Run in specific regions for compliance
4. Automatic failover: Built-in region fallback for reliability
5. No separate API key: Uses your existing GCP authentication

## OpenClaw Integration

Run the setup script to automatically configure OpenClaw:

```bash
# After installing vertex-ai-proxy
npx vertex-ai-proxy setup-openclaw
```

Add to your `~/.openclaw/openclaw.json`:

```json
{
  "env": {
    "GOOGLE_CLOUD_PROJECT": "your-project-id",
    "GOOGLE_CLOUD_LOCATION": "us-east5"
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "vertex/claude-opus-4-5@20251101"
      },
      "models": {
        "vertex/claude-opus-4-5@20251101": { "alias": "opus" },
        "vertex/claude-sonnet-4-5@20250514": { "alias": "sonnet" },
        "vertex/claude-haiku-4-5@20251001": { "alias": "haiku" }
      }
    }
  },
  "models": {
    "mode": "merge",
    "providers": {
      "vertex": {
        "baseUrl": "http://localhost:8001/v1",
        "apiKey": "vertex-proxy",
        "api": "anthropic-messages",
        "models": [
          {
            "id": "claude-opus-4-5@20251101",
            "name": "Claude Opus 4.5 (Vertex)",
            "input": ["text", "image"],
            "contextWindow": 200000,
            "maxTokens": 8192
          },
          {
            "id": "claude-sonnet-4-5@20250514",
            "name": "Claude Sonnet 4.5 (Vertex)",
            "input": ["text", "image"],
            "contextWindow": 200000,
            "maxTokens": 8192
          },
          {
            "id": "claude-haiku-4-5@20251001",
            "name": "Claude Haiku 4.5 (Vertex)",
            "input": ["text", "image"],
            "contextWindow": 200000,
            "maxTokens": 8192
          },
          {
            "id": "gemini-3-pro",
            "name": "Gemini 3 Pro (Vertex)",
            "input": ["text", "image", "audio", "video"],
            "contextWindow": 1000000,
            "maxTokens": 8192
          },
          {
            "id": "gemini-2.5-pro",
            "name": "Gemini 2.5 Pro (Vertex)",
            "input": ["text", "image"],
            "contextWindow": 1000000,
            "maxTokens": 8192
          },
          {
            "id": "gemini-2.5-flash",
            "name": "Gemini 2.5 Flash (Vertex)",
            "input": ["text", "image"],
            "contextWindow": 1000000,
            "maxTokens": 8192
          }
        ]
      }
    }
  }
}
```

```bash
# Install and enable as systemd service
sudo npx vertex-ai-proxy install-service

# Or use the daemon commands
vertex-ai-proxy start
openclaw gateway restart
```

## API Endpoints

| Endpoint | Description |
|----------|-------------|
| `GET /` | Health check and server info |
| `GET /health` | Simple health check with stats |
| `GET /v1/models` | List available models |
| `POST /v1/chat/completions` | OpenAI-compatible chat (recommended) |
| `POST /v1/messages` | Anthropic Messages API |
| `POST /v1/images/generations` | Image generation (Imagen) |

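
The chat endpoint in the table above can also be called from Python using only the standard library. The sketch below is a minimal example, assuming the proxy listens on the default port 8001 and returns a standard OpenAI-style body with `choices[0].message.content`. `build_chat_request` and `chat` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8001/v1"  # Default port used throughout this README.


def build_chat_request(model: str, prompt: str, stream: bool = False) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-format chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model: str, prompt: str) -> str:
    """Send the request and return the first choice's text (needs a running proxy)."""
    with urllib.request.urlopen(build_chat_request(model, prompt)) as resp:
        payload = json.load(resp)
    # Assumption: the response follows the OpenAI chat completion shape.
    return payload["choices"][0]["message"]["content"]
```

With the proxy running, `chat("claude-opus-4-5@20251101", "Hello!")` should return the model's reply as a string.
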
Chat Completion (OpenAI format):
```bash
curl http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-5@20251101",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
```

Chat Completion (Anthropic format):
```bash
curl http://localhost:8001/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-5@20251101",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Image Generation:
```bash
curl http://localhost:8001/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "imagen-4.0-generate-001",
    "prompt": "A cute robot learning to paint",
    "n": 1,
    "size": "1024x1024"
  }'
```

## Available Models
### Claude Models

| Model | ID | Context | Price (per 1M tokens) |
|-------|----|---------|-----------------------|
| Opus 4.5 | `claude-opus-4-5@20251101` | 200K | $15 / $75 |
| Sonnet 4.5 | `claude-sonnet-4-5@20250514` | 200K | $3 / $15 |
| Haiku 4.5 | `claude-haiku-4-5@20251001` | 200K | $0.25 / $1.25 |
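
As a worked example of reading the price column, the sketch below estimates the cost of a single request. It assumes the two figures in each row are the input and output prices in USD per 1M tokens, which the table does not state explicitly. `estimate_cost` is a hypothetical helper, and the prices are copied from the table as-is.

```python
# (input, output) USD per 1M tokens, copied from the table above.
# Assumption: the first figure is the input price, the second the output price.
PRICES = {
    "claude-opus-4-5@20251101": (15.00, 75.00),
    "claude-sonnet-4-5@20250514": (3.00, 15.00),
    "claude-haiku-4-5@20251001": (0.25, 1.25),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000
```

For instance, a Sonnet request with 10,000 input tokens and 2,000 output tokens works out to (10,000 × 3 + 2,000 × 15) / 1,000,000 = $0.06 under these assumptions.
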
### Gemini Models

| Model | ID | Context | Price (per 1M tokens) | Best For |
|-------|----|---------|-----------------------|----------|
| Gemini 3 Pro | `gemini-3-pro` | 1M | $2.50 / $15 | Latest & greatest |
| Gemini 2.5 Pro | `gemini-2.5-pro` | 1M | $1.25 / $5 | Complex reasoning |
| Gemini 2.5 Flash | `gemini-2.5-flash` | 1M | $0.15 / $0.60 | Fast responses |
| Gemini 2.5 Flash Lite | `gemini-2.5-flash-lite` | 1M | $0.075 / $0.30 | Budget-friendly |

### Imagen Models

| Model | ID | Description | Price |
|-------|----|-------------|-------|
| Imagen 4 | `imagen-4.0-generate-001` | Best quality | ~$0.04/image |
| Imagen 4 Fast | `imagen-4.0-fast-generate-001` | Lower latency | ~$0.02/image |
| Imagen 4 Ultra | `imagen-4.0-ultra-generate-001` | Highest quality | ~$0.08/image |

## Troubleshooting
### Claude models not available

1. Check that your project ID is correct
2. Ensure Claude is enabled in Model Garden
3. Verify you're using a supported region (`us-east5` or `europe-west1` for Claude)

### Authentication errors

```bash
# Re-authenticate
gcloud auth application-default login

# Check current credentials
gcloud auth application-default print-access-token
```

### Model not recognized by the client

Ensure the model is defined in `models.providers.vertex.models[]` in your config.
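
A quick way to spot this kind of mismatch is to compare the models an agent references with the models the provider defines. The Python sketch below does that for a config shaped like the JSON examples in this README. `missing_vertex_models` is a hypothetical helper, and it only inspects `agents.defaults`.

```python
from typing import List


def missing_vertex_models(config: dict) -> List[str]:
    """Return referenced 'vertex/<id>' models that the provider does not define."""
    provider = config.get("models", {}).get("providers", {}).get("vertex", {})
    defined = {m["id"] for m in provider.get("models", [])}

    # Collect references from agents.defaults (primary model plus the models map).
    defaults = config.get("agents", {}).get("defaults", {})
    referenced = set(defaults.get("models", {}).keys())
    primary = defaults.get("model", {}).get("primary")
    if primary:
        referenced.add(primary)

    missing = []
    for ref in sorted(referenced):
        prefix, _, model_id = ref.partition("/")
        if prefix == "vertex" and model_id not in defined:
            missing.append(model_id)
    return missing
```

Because this sketch does not resolve aliases, a reference such as `vertex/sonnet` would also be reported as missing.
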

### Streaming issues

Check that your client supports SSE (Server-Sent Events). The proxy sends:

```
data: {"choices":[{"delta":{"content":"Hello"}}]}

data: [DONE]
```
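
If you are writing your own client, the stream shown above can be consumed line by line. The following Python sketch parses `data:` lines and stops at the `[DONE]` sentinel. It assumes each event is a single `data:` line carrying a JSON chunk in the shape shown, and `iter_sse_text` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from 'data:' lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # Skip blank separators and non-data fields.
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded pieces, for example with `"".join(iter_sse_text(lines))`, reconstructs the full reply.
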
### Viewing logs

```bash
# View recent logs
vertex-ai-proxy logs

# Follow logs in real-time
vertex-ai-proxy logs -f
```

## Development

```bash
# Clone and install
git clone https://github.com/anthropics/vertex-ai-proxy.git
cd vertex-ai-proxy
npm install

# Run in development mode
npm run dev

# Run tests
npm test

# Build
npm run build
```

## License

MIT License - see LICENSE for details.

## Contributing

Contributions welcome! Please read CONTRIBUTING.md first.

## Related Projects

- OpenClaw - Personal AI assistant
- Clawdbot - Discord/multi-platform AI bot
- Anthropic Vertex SDK - Official Python SDK
- Google Vertex AI - Google's AI platform

## Google Search Grounding

Enable real-time web search for Gemini models to get up-to-date information.

```bash
# Via header
curl http://localhost:8001/v1/chat/completions \
  -H "X-Enable-Grounding: true" \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini-2.5-flash", "messages": [{"role": "user", "content": "Bitcoin price today"}]}'

# Via body parameter
curl http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Latest news about AI"}],
    "grounding": true
  }'

# With custom threshold (0-1, lower = more likely to search)
curl http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [...],
    "grounding": {"mode": "MODE_DYNAMIC", "dynamicThreshold": 0.3}
  }'
```

Enable grounding by default in `~/.vertex-proxy/config.yaml`:

```yaml
grounding:
  enabled: true
  mode: MODE_DYNAMIC
  dynamicThreshold: 0.3
```

When grounding is used, the response includes source information:

```json
{
  "choices": [...],
  "grounding": {
    "web_search_queries": ["bitcoin price USD today"],
    "sources": [
      {"uri": "https://...", "title": "..."}
    ]
  }
}
```

Supported models: `gemini-3-pro-preview`, `gemini-2.5-pro`, `gemini-2.5-flash`
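
On the client side, the `grounding` block in the response format above can be read with a few lines of Python. This is a minimal sketch that assumes the response has already been parsed into a dictionary. `grounding_sources` is a hypothetical helper, and it returns an empty list when no grounding data is present.

```python
from typing import List, Tuple


def grounding_sources(response: dict) -> List[Tuple[str, str]]:
    """Return (title, uri) pairs from the 'grounding' block, or [] if absent."""
    grounding = response.get("grounding") or {}
    return [
        (source.get("title", ""), source.get("uri", ""))
        for source in grounding.get("sources", [])
    ]
```

Treating a missing `grounding` key as "no sources" lets the same code handle grounded and ungrounded responses.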