Multi-model visual understanding MCP server supporting GLM-4.6V, DeepSeek-OCR (free), Qwen3-VL-Flash, and others. It provides visual processing capabilities for AI coding models that do not support image understanding.
Config is the same across clients — only the file and path differ.
{
  "mcpServers": {
    "luma": {
      "env": {
        "ZHIPU_API_KEY": "your-api-key",
        "MODEL_PROVIDER": "zhipu"
      },
      "args": [
        "-y",
        "luma-mcp"
      ],
      "command": "npx"
    }
  }
}
Run this in your terminal to verify that the server starts:
npx -y 'luma-mcp' 2>&1 | head -1 && echo "✓ Server started successfully"
No known CVEs.
Checked luma-mcp against OSV.dev.
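A multi-model visual understanding MCP server that gives AI assistants without native vision support a unified way to analyze images.
The image_understand tool handles image understanding.
git clone https://github.com/JochenYang/luma-mcp.git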
cd luma-mcp
npm install
npm run build
You can also use it directly in your MCP configuration:
npx -y luma-mcp
{
  "mcpServers": {
    "luma": {
      "command": "npx",
      "args": ["-y", "luma-mcp"],
      "env": {
        "MODEL_PROVIDER": "zhipu",
        "ZHIPU_API_KEY": "your-api-key"
      }
    }
  }
}
Replace MODEL_PROVIDER and the corresponding key with the provider you actually use:
- zhipu -> ZHIPU_API_KEY
- siliconflow -> SILICONFLOW_API_KEY
- qwen -> DASHSCOPE_API_KEY
- volcengine -> VOLCENGINE_API_KEY
- hunyuan -> HUNYUAN_API_KEY
Optional model overrides:
- MODEL_NAME=doubao-seed-1-6-flash-250828
- MODEL_NAME=hunyuan-t1-vision-20250916
- MODEL_NAME=HY-vision-1.5-instruct
# Zhipu
claude mcp add -s user luma-mcp --env MODEL_PROVIDER=zhipu --env ZHIPU_API_KEY=your-api-key -- npx -y luma-mcp
# SiliconFlow
claude mcp add -s user luma-mcp --env MODEL_PROVIDER=siliconflow --env SILICONFLOW_API_KEY=your-api-key -- npx -y luma-mcp
# Qwen
claude mcp add -s user luma-mcp --env MODEL_PROVIDER=qwen --env DASHSCOPE_API_KEY=your-api-key -- npx -y luma-mcp
# Volcengine
claude mcp add -s user luma-mcp --env MODEL_PROVIDER=volcengine --env VOLCENGINE_API_KEY=your-api-key --env MODEL_NAME=doubao-seed-1-6-flash-250828 -- npx -y luma-mcp
# Hunyuan
claude mcp add -s user luma-mcp --env MODEL_PROVIDER=hunyuan --env HUNYUAN_API_KEY=your-api-key --env MODEL_NAME=hunyuan-t1-vision-20250916 -- npx -y luma-mcp
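Each provider expects its own API key variable, so a mismatch (for example MODEL_PROVIDER=qwen with only ZHIPU_API_KEY set) is an easy mistake to make. The sketch below shows one way to check an environment before launching the server. It is only an illustration: the names REQUIRED_KEY and checkLumaEnv are hypothetical and are not part of luma-mcp; only the provider names, key names, and the zhipu default come from this document.

```typescript
// Hypothetical helper, not part of luma-mcp.
// Provider -> required API key variable, as listed in this document.
const REQUIRED_KEY = new Map<string, string>([
  ["zhipu", "ZHIPU_API_KEY"],
  ["siliconflow", "SILICONFLOW_API_KEY"],
  ["qwen", "DASHSCOPE_API_KEY"],
  ["volcengine", "VOLCENGINE_API_KEY"],
  ["hunyuan", "HUNYUAN_API_KEY"],
]);

type Env = Record<string, string | undefined>;

// Returns a list of problems; an empty list means the variables line up.
function checkLumaEnv(env: Env): string[] {
  // MODEL_PROVIDER defaults to zhipu according to the environment variable table.
  const provider = env["MODEL_PROVIDER"] ?? "zhipu";
  const keyName = REQUIRED_KEY.get(provider);
  if (keyName === undefined) {
    return [`Unknown MODEL_PROVIDER: ${provider}`];
  }
  if (!env[keyName]) {
    return [`${keyName} must be set when MODEL_PROVIDER=${provider}`];
  }
  return [];
}

// Matching provider and key: no problems reported.
console.log(checkLumaEnv({ MODEL_PROVIDER: "qwen", DASHSCOPE_API_KEY: "your-api-key" }));
// Wrong key for the chosen provider: one problem reported.
console.log(checkLumaEnv({ MODEL_PROVIDER: "qwen", ZHIPU_API_KEY: "your-api-key" }));
```

In a Node.js script, the same function could be called with process.env to validate the shell environment before running npx -y luma-mcp.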
{
  "mcpServers": {
    "luma": {
      "command": "node",
      "args": ["D:\\codes\\luma-mcp\\build\\index.js"],
      "env": {
        "MODEL_PROVIDER": "zhipu",
        "ZHIPU_API_KEY": "your-api-key"
      }
    }
  }
}
Create mcp.json in the project root or under .vscode/:
{
  "mcpServers": {
    "luma": {
      "command": "npx",
      "args": ["-y", "luma-mcp"],
      "env": {
        "MODEL_PROVIDER": "zhipu",
        "ZHIPU_API_KEY": "your-api-key"
      }
    }
  }
}
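image_understand
Parameters:
- image_source: local path, HTTP(S) image URL, or Data URI
- prompt: the user's original question about the image
Examples:
image_understand({
  image_source: "./screenshot.png",
  prompt: "Analyze the layout and main component structure of this page",
});
image_understand({
  image_source: "./code-error.png",
  prompt: "Why does this code throw an error? Please suggest a fix",
});
image_understand({
  image_source: "https://example.com/ui.png",
  prompt: "Identify the usability issues in this interface",
});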
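Because image_source accepts three different forms, a caller or wrapper may want to tell them apart before deciding how to load the image. The following is a minimal sketch of such a classification, assuming that anything which is neither a Data URI nor an HTTP(S) URL is treated as a local path. The type ImageSourceKind and the function classifyImageSource are hypothetical names for illustration; how luma-mcp itself distinguishes the forms is not stated in this document.

```typescript
// Hypothetical helper, not part of luma-mcp.
type ImageSourceKind = "data-uri" | "url" | "local-path";

// Classifies an image_source string into one of the three documented forms.
function classifyImageSource(source: string): ImageSourceKind {
  // Data URIs start with the "data:" scheme, e.g. data:image/png;base64,...
  if (/^data:/i.test(source)) return "data-uri";
  // Remote images use http:// or https://
  if (/^https?:\/\//i.test(source)) return "url";
  // Assumption: everything else is a path on the local file system.
  return "local-path";
}

console.log(classifyImageSource("./screenshot.png"));
console.log(classifyImageSource("https://example.com/ui.png"));
console.log(classifyImageSource("data:image/png;base64,iVBORw0KGgo="));
```

Checking the Data URI prefix first keeps the logic simple, since a Data URI can never be mistaken for a URL or a path once its scheme has been matched.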
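| Variable | Default | Description |
|---|---|---|
| MODEL_PROVIDER | zhipu | Model provider: zhipu, siliconflow, qwen, volcengine, hunyuan |
| MODEL_NAME | Auto-selected | Model name |
| BASE_VISION_PROMPT | Built-in default | Custom base vision prompt |
| MAX_TOKENS | 8192 | Maximum number of generated tokens (some models have a hard cap; see the note below) |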
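[!IMPORTANT] A note on token limits:
- SiliconFlow (DeepSeek-OCR): This model's total context length (input plus output) is only 8192 tokens. To leave room for the image input, Luma caps MAX_TOKENS at 4096 inside the client. A higher value set through the environment variable is truncated to that cap.
- General advice: Visual understanding tasks rarely need very long output. For most models, keep MAX_TOKENS at 4096 or 8192. A value that is too high (such as 16384) can cause a 400 error on large images, because the total length exceeds the model's limit.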
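The capping behaviour described in the note above can be sketched as follows. This is a minimal illustration under stated assumptions, not Luma's actual implementation: the function name resolveMaxTokens is hypothetical, and falling back to the default when the value is unset or not a positive integer is an assumption. Only the 8192 default and the 4096 cap for SiliconFlow come from this document.

```typescript
// Hypothetical helper, not part of luma-mcp.
const DEFAULT_MAX_TOKENS = 8192; // documented default for MAX_TOKENS
const SILICONFLOW_CAP = 4096;    // documented cap for SiliconFlow (DeepSeek-OCR)

// Resolves the effective output token limit for a provider.
function resolveMaxTokens(provider: string, raw: string | undefined): number {
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  // Assumption: unset or invalid values fall back to the default.
  const requested =
    Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_TOKENS;
  // SiliconFlow requests are truncated to the cap; other providers pass through.
  return provider === "siliconflow"
    ? Math.min(requested, SILICONFLOW_CAP)
    : requested;
}

console.log(resolveMaxTokens("siliconflow", "16384")); // 4096
console.log(resolveMaxTokens("zhipu", undefined));     // 8192
console.log(resolveMaxTokens("zhipu", "2048"));        // 2048
```

Note that under these assumptions even the default of 8192 is reduced to 4096 for SiliconFlow, which matches the statement that the cap applies regardless of the configured value.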
| Provider | Required environment variable | Default model