Analyze and interpret images with the Zhipu AI GLM vision models
An MCP server that uses the Zhipu AI GLM vision models to analyze and describe images.

```bash
npm install @mcpcn/image-understanding
```
- Supports multiple GLM vision models (glm-4v-plus, glm-4v, glm-4v-flash)
- Flexible generation controls (temperature, max tokens, etc.)
- Robust error handling with clear messaging
- TypeScript-based implementation with strong typing
- ZHIPU_API_KEY or GLM_API_KEY: required; your Zhipu AI API key
- GLM_VISION_MODEL: optional; overrides the default model (defaults to glm-4v-plus)
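For local runs, the variables can be exported in the shell before starting the server (the values below are placeholders):

```bash
# Placeholder value — substitute your real Zhipu AI key.
export ZHIPU_API_KEY="your-zhipu-key"
# Optional: pin a specific vision model instead of the default.
export GLM_VISION_MODEL="glm-4v-flash"
```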
```bash
npm install
npm run build
```

Run the compiled entry:

```bash
npm start
```

Or invoke the CLI wrapper:

```bash
image-understanding-mcp
```
Example entry for Claude Desktop:
```json
{
  "mcpServers": {
    "image-understanding": {
      "command": "node",
      "args": ["/absolute/path/to/this/project/dist/index.js"],
      "env": {
        "ZHIPU_API_KEY": "your-zhipu-key"
      }
    }
  }
}
```
Analyze an image via the GLM vision model suite.
Parameters:
- imageUrl (required): URL of the image to analyze
- prompt (optional): instruction for the analysis, defaults to “Describe this image in detail.”
- model (optional): model id, default glm-4v-plus
- temperature (optional): 0-1, default 0.7
- maxTokens (optional): 1-4096, default 1024
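As a sketch, a client might invoke the tool with arguments such as the following (the tool name is defined by the server; all values here are placeholders):

```json
{
  "imageUrl": "https://example.com/photo.jpg",
  "prompt": "Describe this image in detail.",
  "model": "glm-4v-plus",
  "temperature": 0.7,
  "maxTokens": 1024
}
```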
Supported Models:
- glm-4v-plus: Highest fidelity, best for complex scenes (max 4096 tokens)
- glm-4v: Balanced performance/cost (max 2048 tokens)
- glm-4v-flash: Fastest inference for simple analysis (max 1024 tokens)
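Because the token caps differ per model, a server implementation presumably clamps the requested maxTokens to the selected model's cap. A minimal TypeScript sketch of that logic (`clampMaxTokens` is a hypothetical helper, not part of the published package):

```typescript
// Hypothetical helper: clamp a requested maxTokens value to the
// documented cap of the selected GLM vision model.
const MODEL_MAX_TOKENS: Record<string, number> = {
  "glm-4v-plus": 4096,
  "glm-4v": 2048,
  "glm-4v-flash": 1024,
};

export function clampMaxTokens(model: string, requested: number): number {
  // Fall back to the most conservative cap for unknown model ids.
  const cap = MODEL_MAX_TOKENS[model] ?? 1024;
  return Math.min(Math.max(1, requested), cap);
}
```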
```
src/
├── index.ts   # MCP server entry
├── config.ts  # Model options
├── types.ts   # Shared types
└── tool.ts    # Tool implementation
```
- Node.js >= 18
- `@modelcontextprotocol/sdk` v1