PDF-to-Markdown router. Per-page backend selection + confidence scoring for RAG ingestion.
{
"mcpServers": {
"io-github-nameetp-pdfmux": {
"command": "<see-readme>",
"args": []
}
}
}
No install config available. Check the server's README for setup instructions.
Is it safe?
No package registry to scan.
No authentication — any process on your machine can connect.
License not specified.
Is it maintained?
Commit history unknown.
Will it work with my client?
Transport: stdio. Works with Claude Desktop, Cursor, Claude Code, and most MCP clients.
No automated test available for this server. Check the GitHub README for setup instructions.
This server is missing a description. Tools and install config are also missing.
No known vulnerabilities.
Offline access to TwitterAPI.io docs for AI assistants. 58 endpoints, 32 pages, 24 blog posts.
Temporal memory for AI with decay and reinforcement. Two-layer storage (JSONL + Markdown).
MCP server that fetches YouTube video transcripts and optionally summarizes them
Local academic paper MCP server — 9-source search, multi-source download, AI analysis, translation, citation graph, code-based paper recommendation
Universal PDF extraction orchestrator. Routes each page to the best backend, audits the output, re-extracts failures. 5 rule-based extractors + BYOK LLM fallback. One CLI. One API. Zero config.
PDF ──> pdfmux router ──> best extractor per page ──> audit ──> re-extract failures ──> Markdown / JSON / chunks
            |
            ├─ PyMuPDF         (digital text, 0.01s/page)
            ├─ OpenDataLoader  (complex layouts, 0.05s/page)
            ├─ RapidOCR        (scanned pages, CPU-only)
            ├─ Docling         (tables, 97.9% TEDS)
            ├─ Surya           (heavy OCR fallback)
            └─ YOUR LLM        (Gemini / Claude / GPT-4o / Ollama, BYOK via 5-line YAML)
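The route, audit, re-extract loop in the diagram above can be sketched roughly as follows. This is an illustration under stated assumptions, not pdfmux's implementation: the `Extractor` and `PageResult` types, the `audit` heuristic, the 0.8 threshold, and the two toy backends are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical types for illustration; these are not pdfmux internals.
@dataclass
class Extractor:
    name: str
    run: Callable[[str], str]  # raw page payload -> extracted text

@dataclass
class PageResult:
    text: str
    backend: str
    confidence: float

def audit(text: str) -> float:
    """Toy confidence score: share of characters that are alphanumeric or whitespace."""
    if not text:
        return 0.0
    ok = sum(ch.isalnum() or ch.isspace() for ch in text)
    return ok / len(text)

def extract_page(page: str, backends: list[Extractor], threshold: float = 0.8) -> PageResult:
    """Try backends in order (cheapest first) and stop at the first that passes the audit."""
    best = PageResult("", "none", 0.0)
    for backend in backends:
        text = backend.run(page)
        score = audit(text)
        if score > best.confidence:
            best = PageResult(text, backend.name, score)
        if score >= threshold:
            break  # good enough, no re-extraction needed
    return best

# Two stand-in backends: "fast" gives up on non-ASCII input, "slow" salvages what it can.
fast = Extractor("fast", lambda page: page if page.isascii() else "")
slow = Extractor("slow", lambda page: page.encode("ascii", "ignore").decode())

pages = ["Invoice 42 total 100", "Sc\u00e4nned p\u00e4ge"]
results = [extract_page(p, [fast, slow]) for p in pages]
for r in results:
    print(r.backend, round(r.confidence, 2))
```

Ordering the backends from cheapest to most expensive means the costly ones only run on pages that fail the audit, which is the idea the diagram conveys.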
pip install pdfmux
That's it. Handles digital PDFs out of the box. Add backends for harder documents:
pip install "pdfmux[ocr]" # RapidOCR — scanned/image pages (~200MB, CPU-only)
pip install "pdfmux[tables]" # Docling — table-heavy docs (~500MB)
pip install "pdfmux[opendataloader]" # OpenDataLoader — complex layouts (Java 11+)
pip install "pdfmux[llm]" # LLM fallback — Gemini, Claude, GPT-4o, Ollama
pip install "pdfmux[all]" # everything
Requires Python 3.11+.
# zero config — just works
pdfmux convert invoice.pdf
# invoice.pdf -> invoice.md (2 pages, 95% confidence, via pymupdf4llm)
# RAG-ready chunks with token limits
pdfmux convert report.pdf --chunk --max-tokens 500
# cost-aware extraction with budget cap
pdfmux convert report.pdf --mode economy --budget 0.50
# schema-guided structured extraction (5 built-in presets)
pdfmux convert invoice.pdf --schema invoice
# BYOK any LLM for hardest pages
pdfmux convert scan.pdf --llm-provider claude
# batch a directory
pdfmux convert ./docs/ -o ./output/
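To make the idea behind `--mode economy --budget 0.50` concrete, here is a minimal sketch of cost-capped backend selection. It is not how pdfmux implements budgeting: the `BACKENDS` table, its per-page costs and quality scores, the notion of a page "difficulty", and the `plan` function are all hypothetical.

```python
# Hypothetical per-page costs (USD) and quality scores; not pdfmux's real numbers.
BACKENDS = [
    {"name": "local", "cost": 0.0, "quality": 0.70},
    {"name": "llm", "cost": 0.02, "quality": 0.95},
]

def plan(page_difficulty: list[float], budget: float) -> list[str]:
    """Pick the cheapest backend whose quality covers each page's difficulty,
    never letting cumulative spend exceed the budget."""
    spent = 0.0
    choices = []
    for difficulty in page_difficulty:
        chosen = BACKENDS[0]  # zero-cost fallback when nothing else is affordable
        for backend in sorted(BACKENDS, key=lambda b: b["cost"]):
            affordable = spent + backend["cost"] <= budget + 1e-9
            if backend["quality"] >= difficulty and affordable:
                chosen = backend
                break
        spent += chosen["cost"]
        choices.append(chosen["name"])
    return choices

choices = plan([0.5, 0.9, 0.9, 0.9], budget=0.04)
print(choices)
```

In this toy run the budget covers two paid pages, so the last hard page falls back to the free backend instead of overspending.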
import pdfmux
# text -> markdown
text = pdfmux.extract_text("report.pdf")
# structured data -> dict with tables, key-values, metadata
data = pdfmux.extract_json("report.pdf")
# RAG chunks -> list of dicts with token estimates
chunks = pdfmux.chunk("report.pdf", max_tokens=500)
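The return shape of `pdfmux.chunk` is not shown here, so the following is only a sketch of what token-limited chunking can look like. It assumes a rough 4-characters-per-token estimate and greedy paragraph packing; the function names and the `text` / `tokens` dict keys are hypothetical, not pdfmux's API.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token (an assumption, not a tokenizer)."""
    return max(1, len(text) // 4)

def chunk_markdown(markdown: str, max_tokens: int = 500) -> list[dict]:
    """Greedily pack blank-line-separated paragraphs into chunks of at most max_tokens.
    A single paragraph larger than max_tokens is kept whole as its own chunk."""
    chunks, current, current_tokens = [], [], 0
    for para in (p.strip() for p in markdown.split("\n\n")):
        if not para:
            continue
        tokens = estimate_tokens(para)
        if current and current_tokens + tokens > max_tokens:
            chunks.append({"text": "\n\n".join(current), "tokens": current_tokens})
            current, current_tokens = [], 0
        current.append(para)
        current_tokens += tokens
    if current:
        chunks.append({"text": "\n\n".join(current), "tokens": current_tokens})
    return chunks

sample = "# Title\n\n" + "\n\n".join("word " * 40 for _ in range(3))
chunks = chunk_markdown(sample, max_tokens=60)
print([c["tokens"] for c in chunks])
```

Splitting on paragraph boundaries keeps each chunk readable on its own, at the cost of chunks that are not all the same size.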
                     ┌─────────────────────────────┐
                     │      Segment Detector       │
                     │ text / tables / images /    │
                     │ formulas / headers per page │
                     └─────────────┬───────────────┘
                                   │
              ┌────────────────────────────────────────┐
              │             Router Engine              │
              │                                        │
              │    economy  ──  balanced  ──  premium  │
              │  (minimize $)  (default)  (max quality)│
              │  budget caps: --budget 0.50            │
              └────────────────────┬───────────────────┘
                                   │
    ┌──────────┬──────────┬────────┴────────┬──────────┐
    │          │          │                 │          │
 PyMuPDF   OpenData   RapidOCR           Docling      LLM
 digital    Loader     scanned           tables     (BYOK)
 0.01s/pg   complex   CPU-only            97.9%  any provider
            layouts                       TEDS
                                   │
... [View full README on GitHub](https://github.com/NameetP/pdfmux#readme)
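As a rough illustration of the Segment Detector stage in the diagram above, the sketch below maps crude per-page statistics to a backend family. The `PageStats` fields, the thresholds, and the labels are all hypothetical; the signals pdfmux actually uses are not described in this excerpt.

```python
from dataclasses import dataclass

@dataclass
class PageStats:            # hypothetical features, not pdfmux's actual signals
    char_count: int         # characters in the embedded text layer
    image_coverage: float   # fraction of page area covered by images, 0..1
    ruling_lines: int       # count of drawn horizontal/vertical lines

def pick_backend(stats: PageStats) -> str:
    """Map crude page statistics to a backend family."""
    if stats.char_count < 50 and stats.image_coverage > 0.5:
        return "ocr"        # little text, mostly image: likely a scan
    if stats.ruling_lines >= 8:
        return "tables"     # many ruled lines suggest a table grid
    return "digital"        # ordinary text layer

labels = [pick_backend(s) for s in (
    PageStats(2400, 0.0, 0),
    PageStats(10, 0.95, 0),
    PageStats(1800, 0.1, 24),
)]
print(labels)
```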