Analyze and interpret images with the Zhipu AI GLM vision models
An MCP server that uses the Zhipu AI GLM vision models to analyze and describe images.

```bash
npm install @mcpcn/image-understanding
```
- Supports multiple GLM vision models (glm-4v-plus, glm-4v, glm-4v-flash)
- Flexible generation controls (temperature, max tokens, etc.)
- Robust error handling with clear messaging
- TypeScript-based implementation with strong typing
- ZHIPU_API_KEY or GLM_API_KEY: required; your Zhipu AI API key
- GLM_VISION_MODEL: optional; overrides the default model (defaults to glm-4v-plus)
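For local runs, the variables can be exported in the shell before starting the server (the values below are placeholders):

```bash
# Placeholder value — substitute your real Zhipu AI key.
export ZHIPU_API_KEY="your-zhipu-key"
# Optional: pin a specific vision model instead of the default.
export GLM_VISION_MODEL="glm-4v-flash"
```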
```bash
npm install
npm run build
```

Run the compiled entry:

```bash
npm start
```

Or invoke the CLI wrapper:

```bash
image-understanding-mcp
```
Example entry for Claude Desktop:
```json
{
  "mcpServers": {
    "image-understanding": {
      "command": "node",
      "args": ["/absolute/path/to/this/project/dist/index.js"],
      "env": {
        "ZHIPU_API_KEY": "your-zhipu-key"
      }
    }
  }
}
```
Analyze an image via the GLM vision model suite.
Parameters:
- imageUrl (required): URL of the image to analyze
- prompt (optional): instruction for the analysis, defaults to “Describe this image in detail.”
- model (optional): model id, default glm-4v-plus
- temperature (optional): 0-1, default 0.7
- maxTokens (optional): 1-4096, default 1024
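As a sketch, a client might invoke the tool with arguments such as the following (the tool name is defined by the server; all values here are placeholders):

```json
{
  "imageUrl": "https://example.com/photo.jpg",
  "prompt": "Describe this image in detail.",
  "model": "glm-4v-plus",
  "temperature": 0.7,
  "maxTokens": 1024
}
```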
Supported Models:
- glm-4v-plus: Highest fidelity, best for complex scenes (max 4096 tokens)
- glm-4v: Balanced performance/cost (max 2048 tokens)
- glm-4v-flash: Fastest inference for simple analysis (max 1024 tokens)
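Because the token caps differ per model, a server implementation presumably clamps the requested maxTokens to the selected model's cap. A minimal TypeScript sketch of that logic (`clampMaxTokens` is a hypothetical helper, not part of the published package):

```typescript
// Hypothetical helper: clamp a requested maxTokens value to the
// documented cap of the selected GLM vision model.
const MODEL_MAX_TOKENS: Record<string, number> = {
  "glm-4v-plus": 4096,
  "glm-4v": 2048,
  "glm-4v-flash": 1024,
};

export function clampMaxTokens(model: string, requested: number): number {
  // Fall back to the most conservative cap for unknown model ids.
  const cap = MODEL_MAX_TOKENS[model] ?? 1024;
  return Math.min(Math.max(1, requested), cap);
}
```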
```
src/
├── index.ts   # MCP server entry
├── config.ts  # Model options
├── types.ts   # Shared types
└── tool.ts    # Tool implementation
```
- Node.js >= 18
- `@modelcontextprotocol/sdk` v1