# qw-proxy

HTTP proxy bridge to the AgentRouter API via the qwen-code SDK.

Access AgentRouter models through a simple REST API with minimal token overhead (~21 input tokens vs. 12k+ by default).
## Installation

```bash
npm install -g qw-proxy
```

## Usage

```bash
# Basic usage (12k+ input tokens)
AGENTROUTER_API_KEY=sk-xxx qw-proxy
```

The server starts on http://localhost:3001.

## Supported Models
| Model | Status | Notes |
|-------|--------|-------|
| deepseek-v3.2 | ✅ | Default, recommended |
| gpt-5.2 | ✅ | GPT-5.2 via AgentRouter |
| claude-haiku-4-5-20251001 | ✅ | Fast Claude model |
| claude-sonnet-4-5-20250929 | ✅ | Claude Sonnet 4.5 |
| glm-4.5 | ✅ | GLM model |
| glm-4.6 | ✅ | GLM model |

Set the model via the `AGENTROUTER_MODEL` environment variable.

## Environment Variables
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| AGENTROUTER_API_KEY | Yes | - | Your AgentRouter API key |
| AGENTROUTER_MODEL | No | deepseek-v3.2 | Model to use |
| AGENTROUTER_BASE_URL | No | https://agentrouter.org/v1 | API base URL |
| PORT | No | 3001 | Server port |
QWEN_SYSTEM_MD | No | - | Set to 1 for minimal tokens |

## API Reference
### POST /chat
Send a message and get AI response.
Request:

```json
{
"user_id": "user123",
"message": "Hello!",
"images": ["/path/to/image.png"]
}
```

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| user_id | string | Yes | Unique user ID for conversation history |
| message | string | Yes | User message |
| images | string[] | No | Local image paths (vision models) |

Response:
```json
{
"success": true,
"response": "Hello! How can I help you?",
"history_length": 3
}
```

Example:
```bash
curl http://localhost:3001/chat \
-H "Content-Type: application/json" \
-d '{"user_id": "test", "message": "Hello!"}'
```

### Health check

```json
{
"status": "ok",
"model": "deepseek-v3.2",
"active_users": 1,
"active_clients": 1
}
```

### POST /clear
Clear conversation history.
```bash
curl -X POST http://localhost:3001/clear \
-H "Content-Type: application/json" \
-d '{"user_id": "user123"}'
```

### POST /cancel/:user_id
Cancel ongoing generation.
```bash
curl -X POST http://localhost:3001/cancel/user123
```

### Compression statistics

Get conversation compression statistics.
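For intuition, the auto-compression these statistics describe can be sketched roughly as follows. This is an illustrative model only: the characters-per-token heuristic, the limit, and the `summarize` callable are assumptions, not qw-proxy's actual internals.

```python
# Illustrative sketch of per-user history compression (assumed behavior;
# qw-proxy's real implementation may differ).

def estimate_tokens(history):
    """Very rough token estimate: ~4 characters per token."""
    return sum(len(m["content"]) for m in history) // 4

def compress_history(history, summarize, limit=4000):
    """If the history exceeds `limit` estimated tokens, replace everything
    but the latest exchange with an LLM-written summary.

    `summarize` stands in for the LLM call."""
    if estimate_tokens(history) <= limit:
        return history
    older, recent = history[:-2], history[-2:]
    summary = summarize(older)
    return [{"role": "system",
             "content": f"Summary of earlier conversation: {summary}"}] + recent
```

A compressed history keeps the most recent exchange verbatim, so short-term context survives while the long tail is collapsed into one message.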
## Features
- Minimal tokens: ~21 input tokens with `QWEN_SYSTEM_MD=1` (vs. 12k+ by default)
- Per-user history: Isolated conversation contexts
- Auto-compression: LLM-based history compression when limit exceeded
- Vision support: Send images with messages
- Request cancellation: Cancel ongoing generations
- 30s timeout: Protection against hanging requests

## How It Works
```
Your App → qw-proxy (localhost:3001) → qwen-code SDK → AgentRouter API → LLM
```

The qwen-code SDK handles authentication with AgentRouter; this proxy exposes it as a simple REST API.
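The "Your App" piece above can be any HTTP client. As a minimal sketch, here is a Python client using only the standard library; the helper names (`build_chat_payload`, `send_chat`) are illustrative, not part of the package:

```python
import json
import urllib.request

BASE_URL = "http://localhost:3001"

def build_chat_payload(user_id, message, images=None):
    """Assemble the /chat request body (see API Reference above)."""
    payload = {"user_id": user_id, "message": message}
    if images:
        payload["images"] = images  # local paths; vision models only
    return payload

def send_chat(user_id, message, images=None):
    """POST to /chat and return the parsed JSON response."""
    body = json.dumps(build_chat_payload(user_id, message, images)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    # Slightly above the proxy's own 30s timeout, so the proxy fails first.
    with urllib.request.urlopen(req, timeout=35) as resp:
        return json.load(resp)
```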
## Token Optimization
By default, qwen-code includes ~12k tokens of system prompts, tools, and environment context.
To reduce to ~21 tokens:
1. Create `.qwen/system.md` with a minimal prompt
2. Set `QWEN_SYSTEM_MD=1`

The package patches qwen-code to disable tools and environment context injection.
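The two steps above can be scripted like this; the prompt text is only a placeholder, so substitute your own:

```shell
# Create a minimal system prompt (placeholder content — use your own).
mkdir -p .qwen
cat > .qwen/system.md <<'EOF'
You are a helpful assistant. Answer concisely.
EOF
```

Then start the proxy with `QWEN_SYSTEM_MD=1 AGENTROUTER_API_KEY=sk-xxx qw-proxy`.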
## License

MIT