MCP server for extracting text, images, tables, links, annotations, and metadata from PDF files.
The configuration is the same across MCP clients; only the config file name and its location differ.
{
  "mcpServers": {
    "pdf-reader": {
      "args": [
        "pdf-insight-mcp"
      ],
      "command": "uvx"
    }
  }
}
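An MCP server for reading and analyzing PDF files. It provides clients that support MCP (Model Context Protocol) with PDF text, page images, tables, links, annotations, outlines, metadata, and basic text statistics.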
Run this in your terminal to verify that the server starts:
uvx 'pdf-insight-mcp' 2>&1 | head -1 && echo "✓ Server started successfully"
No known CVEs.
Checked pdf-insight-mcp against OSV.dev.
A PDF-focused MCP server for extracting text, rendered pages, tables, links, annotations, outlines, metadata, and text statistics from PDF files.
Project name: pdf-reader-mcp
Registry identifier: io.github.Xvvln/pdf-reader-mcp
PyPI package: pdf-insight-mcp
Commands: pdf-reader-mcp and pdf-insight-mcp
pdf-reader-mcp is the project name. The PyPI package is published as pdf-insight-mcp because the pdf-reader-mcp package name is not available on PyPI.
| Tool | What it does |
|---|---|
| get_pdf_info | Read document metadata, page count, file size, and encryption status. |
| read_pdf_as_text | Extract text from selected pages with page and character limits. |
| read_pdf_as_images | Render selected pages as base64-encoded images. |
| get_pdf_outline | Read bookmarks and outline entries. |
| search_pdf_text | Search text and return per-match page context. |
| extract_pdf_tables | Extract structured tables when PyMuPDF can detect them. |
| extract_pdf_images | Extract embedded PDF images. |
| get_pdf_page_info | Inspect one page's size, text, images, links, and rotation. |
| extract_pdf_links | Extract external URLs and internal page jumps. |
| get_pdf_annotations | Read comments, highlights, and annotation metadata. |
| get_pdf_text_stats | Compute text, line, paragraph, and scan-likelihood stats. |
| compare_pdf_pages | Compare text similarity between two pages. |
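MCP clients invoke these tools through JSON-RPC 2.0 `tools/call` requests. The sketch below shows roughly what such a request looks like on the wire for get_pdf_info. The argument name `file_path` is a hypothetical placeholder, not taken from this README; inspect the tool's schema for the real parameter names. A real session also begins with an `initialize` handshake, which is omitted here.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a single line of JSON-RPC 2.0."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport sends one JSON message per line; json.dumps
    # escapes newlines inside strings, so the output stays on one line.
    return json.dumps(message, ensure_ascii=False)


# "file_path" is an assumed argument name used only for illustration.
request = build_tool_call(
    1, "get_pdf_info", {"file_path": "/Users/me/Documents/report.pdf"}
)
print(request)
```

In practice an MCP client library builds and sends these messages for you; the sketch is only meant to show how a tool name from the table maps onto a request.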
Install uv if you do not already have it:
curl -LsSf https://astral.sh/uv/install.sh | sh
Run the server directly from PyPI:
uvx pdf-insight-mcp
Or install it first:
python -m pip install pdf-insight-mcp
pdf-reader-mcp
Use the published PyPI package:
{
  "mcpServers": {
    "pdf-reader": {
      "command": "uvx",
      "args": ["pdf-insight-mcp"]
    }
  }
}
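If your client already has a config file with other servers, the entry above needs to be merged in rather than pasted over the whole file. A minimal sketch of that merge follows, assuming the config uses the same top-level `mcpServers` key; the `other-server` entry is a made-up example of existing content.

```python
import copy
import json

# The server entry from the published-package config above.
PDF_READER_ENTRY = {"command": "uvx", "args": ["pdf-insight-mcp"]}


def add_pdf_reader(config: dict) -> dict:
    """Return a copy of config with the pdf-reader server registered."""
    merged = copy.deepcopy(config)
    # Create the mcpServers section if the config does not have one yet.
    servers = merged.setdefault("mcpServers", {})
    servers["pdf-reader"] = copy.deepcopy(PDF_READER_ENTRY)
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_pdf_reader(existing), indent=2))
```

The function copies its input so the original config is left untouched if you decide not to write the result back.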
Use a local checkout for development:
{
  "mcpServers": {
    "pdf-reader": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/pdf-reader-mcp",
        "run",
        "pdf-reader-mcp"
      ]
    }
  }
}
Replace /absolute/path/to/pdf-reader-mcp with the absolute path to this repository on your machine.
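A relative path is an easy mistake here, since the client may launch the server from an unrelated working directory. The following sketch generates the development config and rejects non-absolute paths up front. The function name `local_checkout_config` and the example path are illustrative, not part of the project.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def local_checkout_config(repo_path: str) -> dict:
    """Build the local-checkout config, insisting on an absolute repo path."""
    # Accept either POSIX-style or Windows-style absolute paths.
    is_absolute = (
        PurePosixPath(repo_path).is_absolute()
        or PureWindowsPath(repo_path).is_absolute()
    )
    if not is_absolute:
        raise ValueError(f"repository path must be absolute: {repo_path!r}")
    return {
        "mcpServers": {
            "pdf-reader": {
                "command": "uv",
                "args": ["--directory", repo_path, "run", "pdf-reader-mcp"],
            }
        }
    }


# Example path is hypothetical; substitute your own checkout location.
print(json.dumps(local_checkout_config("/home/me/src/pdf-reader-mcp"), indent=2))
```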
Ask your MCP client to call tools with an absolute PDF path. Example requests:
Read /Users/me/Documents/report.pdf as text.
Search /Users/me/Documents/report.pdf for "baseline characteristics".
Render pages 1-3 of /Users/me/Documents/report.pdf as images.
Extract links and annotations from /Users/me/Documents/review.pdf.
For large PDFs, prefer small page ranges first. For scanned or layout-sensitive PDFs, use read_pdf_as_images with a small pages range and moderate dpi.
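Because read_pdf_as_images rejects requests above 20 pages, a longer document has to be rendered in several calls. One way to plan those calls is to split the page range into chunks, as sketched below. The helper name `chunk_pages` is made up for illustration, and 1-based inclusive page numbers are an assumption based on the example requests above.

```python
def chunk_pages(first: int, last: int, max_pages: int = 20) -> list[tuple[int, int]]:
    """Split an inclusive page range into chunks of at most max_pages pages."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    chunks = []
    start = first
    while start <= last:
        # Each chunk covers max_pages pages, except possibly the last one.
        end = min(start + max_pages - 1, last)
        chunks.append((start, end))
        start = end + 1
    return chunks


print(chunk_pages(1, 45))  # [(1, 20), (21, 40), (41, 45)]
```

Smaller chunks than the hard limit may still be preferable, since the overall image payload is also capped.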
read_pdf_as_text defaults to at most 50 pages and 200000 returned characters.
read_pdf_as_images rejects requests above 20 pages.
read_pdf_as_images defaults to an overall image payload cap of about 20 MB.
extract_pdf_images returns at most 20 embedded images but reports the actual detected total.
Install dependencies:
uv sync --extra dev
Run tests:
uv run pytest -q
Build the package:
uv build
uvx twine check dist/*
Run the local server:
uv run pdf-reader-mcp
Releases are published through GitHub Actions.
Before the first release, configure PyPI Trusted Publishing with:
PyPI project name: pdf-insight-mcp
Owner: Xvvln
Repository name: pdf-reader-mcp
Workflow filename: pub
... [View full README on GitHub](https://github.com/xvvln/pdf-reader-mcp#readme)