Local OCR & image analysis via Apple Vision — no cloud, no API keys, ~97% fewer tokens on PDFs.
Config is the same across clients — only the file and path differ.
{
  "mcpServers": {
    "macos-vision-mcp": {
      "command": "macos-vision-mcp"
    }
  }
}
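Because the entry is identical for every client, adding it can be scripted. The following is a minimal Python sketch, assuming the client stores its configuration as JSON with a top-level `mcpServers` object. The `add_server` and `update_config_file` helpers, and whatever file path you pass in, are illustrative and not part of macos-vision-mcp.

```python
import json
from pathlib import Path

# The server entry from the config above.
SERVER_ENTRY = {"macos-vision-mcp": {"command": "macos-vision-mcp"}}


def add_server(config: dict) -> dict:
    """Return a copy of a client config with the macos-vision-mcp entry merged in."""
    merged = dict(config)
    # Copy the existing servers so the caller's dict is not mutated.
    servers = dict(merged.get("mcpServers", {}))
    servers.update(SERVER_ENTRY)
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Hypothetical helper: read a JSON config (if present), merge, write back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_server(config), indent=2) + "\n")
```

Other servers and unrelated settings already in the file are preserved; only the `macos-vision-mcp` key is added or overwritten. Check your client's documentation for the actual config file location before running `update_config_file`.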
Run this in your terminal to verify the server starts:
npx -y 'macos-vision-mcp' 2>&1 | head -1 && echo "✓ Server started successfully"
No known CVEs.
Checked macos-vision-mcp against OSV.dev.
Cut document token costs by ~97% with local, private, offline OCR for any MCP client — no API keys, no uploads.
Pre-extracts text and image data locally before your AI ever sees it — cutting token usage by ~97% on real documents and returning structured paragraphs, lines, and bounding boxes so the model can reconstruct the document into Markdown, HTML, DOCX, or any other format. Files never leave your Mac: no cloud API, no API keys, no network requests.
How the ~97% is measured: a 44-page scanned PDF sent as page images costs ~73,500 tokens; the same file run through `analyze_document` returns ~2,400 tokens of extracted text and structure (raw page-image tokens vs. extracted-text tokens). Your numbers will vary with page density and tokenizer, so treat 97% as the order of magnitude, not a guarantee.
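The arithmetic behind that figure can be checked directly. This is a small sketch using only the two token counts quoted above; the `token_savings` function name is made up for illustration.

```python
def token_savings(image_tokens: int, text_tokens: int) -> float:
    """Fraction of tokens saved by sending extracted text instead of page images."""
    if image_tokens <= 0:
        raise ValueError("image_tokens must be positive")
    return 1 - text_tokens / image_tokens


# Figures from the 44-page scanned PDF example above.
savings = token_savings(73_500, 2_400)
print(f"{savings:.1%}")  # prints 96.7%
```

Substituting token counts measured on your own documents gives the ratio for your workload and tokenizer.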
Contents: Quick Start · What you get · Why it's different · Available Tools · Usage · Example workflows · Configuration · Privacy layer
npm install. Powered by the Apple Vision framework, the same engine as Live Text in Photos.app.
❌ Without macos-vision-mcp:
✅ With macos-vision-mcp:
Most OCR options for LLMs either ship your documents to a cloud vision API or make you stand up and tune your own engine. This runs on Apple's on-device Vision framework — the same engine behind Live Text in Photos.app — so extraction is free, private, and instant.
| | macos-vision-mcp | Cloud vision OCR (GPT-4o, Google Vision, M