Is MCP PDF Extractor Server safe to use?

MCP PDF Extractor Server has no known CVEs as of the latest MCPpedia security scan. It does not require authentication, so any local process can connect to it; keep this in mind in shared environments.

How do I install MCP PDF Extractor Server?

MCP PDF Extractor Server can be installed by cloning its GitHub repository and following the setup instructions in the README.

What can MCP PDF Extractor Server do?

MCP PDF Extractor Server provides 4 tools: extract-to-html, extract-text, list-available-files, get-file-metadata. See the full tools list on the server page for descriptions and parameters.

What AI clients work with MCP PDF Extractor Server?

MCP PDF Extractor Server is compatible with Claude Desktop, Cursor, and Claude Code. It supports the stdio, SSE, and HTTP transports.

Is MCP PDF Extractor Server actively maintained?

MCP PDF Extractor Server appears stale: its last commit was 305 days ago.

MCP PDF Extractor Server

Name: MCP PDF Extractor Server
Author: RayenMalouche


A Java-based server leveraging Apache Tika to extract content and metadata from files (PDF, DOCX, TXT, etc.) in a local files-to-extract directory. Supports HTML (with CSS styling) and text extraction, file listing, and metadata retrieval via MCP-compliant tools and REST APIs. Built with Spring Boot, Jetty, and MCP SDK.

Tools: 4 (~285 tokens, about 0.1% of a 200K context window; efficiency grade A)
Security: No known CVEs
License: None
Maintenance: Stale, last commit 305 days ago
Compatibility: Works with most clients
Transport: stdio, sse, http


Productivity

Step 1

Install in your client

The config is the same across clients; only the file and its path differ.

Supported in Claude Desktop (stdio, sse, http) · Node 18+

Paste into ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "mcp-pdf-extractor-server": {
      "command": "<see-readme>",
      "args": []
    }
  }
}
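
Pasting the snippet over an existing file would discard any servers already configured there. A minimal sketch of merging the entry instead, assuming the config file is plain JSON with a top-level mcpServers object. The helper name add_server is hypothetical, and the "<see-readme>" placeholder is left untouched because the actual launch command is not given on this page.

```python
import json
from pathlib import Path


def add_server(config_path: Path, name: str, entry: dict) -> dict:
    """Merge one server entry into an MCP client config, keeping other entries."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        # Start from whatever the client already has configured.
        config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

To use it, call add_server with the path above, the key "mcp-pdf-extractor-server", and an entry whose "command" has been replaced with the value from the README.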


Test This Server

No automated test available for this server. Check the GitHub README for setup instructions.


Why this score

Score: 58/100 across 5 weighted dimensions (Security, Maintenance, Efficiency, Documentation, Compatibility), 42 points below the maximum.

Security (OSV.dev): No known CVEs. No package registry to scan.

Inventory

Tools (4): ~285 tokens total




README

Tika MCP Extractor Server

Overview

The Tika MCP Extractor Server is a Model Context Protocol (MCP) compliant server that uses Apache Tika to extract content and metadata from files in various formats (e.g., PDF, DOCX, TXT, HTML, images) stored in a files-to-extract directory. It supports conversion to HTML (with optional CSS styling for better readability) or plain text and provides tools to list files and retrieve metadata. Built with Java 23, Spring Boot, Jetty, and the MCP SDK (0.11.0), it integrates with MCP-compliant clients like Claude Desktop or MCP Inspector.

The server exposes four MCP tools:

extract-to-html: Converts file content to HTML (with embedded CSS).
extract-text: Extracts plain text.
list-available-files: Lists files in the directory with details.
get-file-metadata: Retrieves detailed file metadata.

It also provides REST endpoints for testing, including a new endpoint to serve raw HTML directly for browser rendering. All operations are local, requiring no internet access, making it ideal for secure document processing workflows.
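
MCP clients invoke tools like these with JSON-RPC 2.0 `tools/call` requests. A minimal sketch of building such a request for extract-text follows. The argument name `fileName` is an assumption, since the tool schemas are not shown here; check the schema the server reports through `tools/list` before relying on it.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC ids only need to be unique within a session


def tools_call(tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "fileName" is a hypothetical argument name; the real one comes from the tool's schema.
request = tools_call("extract-text", {"fileName": "sample.pdf"})
```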

Features

File Extraction: Converts file content to HTML (with CSS for readability) or plain text using Apache Tika.
Metadata Extraction: Retrieves metadata like title, author, content type, and creation date.
File Listing: Scans files-to-extract for files, providing size, MIME type, and modification details.
MCP Integration: Four synchronous tools with JSON schema validation.
REST Testing Endpoints:
- GET /api/test/list: Lists available files.
- POST /api/test/extract-html: Extracts file content as JSON with HTML string.
- POST /api/test/extract-text: Extracts file content as plain text in JSON.
- POST /api/test/raw-html: Serves raw HTML directly (renderable in browsers).
- GET/POST /api/health: Checks server and directory status.
CORS Support: Enabled for all REST endpoints for web-based testing.
Configurability: Settings (port, directory, Tika options) via application.properties.
Error Handling: Robust checks for file existence, readability, and parsing errors.
Logging: Console logs with DEBUG support for Tika and PDFBox.
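
The REST test endpoints listed above can be exercised from a short script. The sketch below assumes the server is running locally on the default port 45453 from application.properties. The JSON body field `fileName` is an assumption, as the request schema is not shown in this excerpt, and the responses are assumed to be JSON.

```python
import json
import urllib.request

BASE_URL = "http://localhost:45453"  # assumes the default server.port


def list_request() -> urllib.request.Request:
    """GET /api/test/list: list the files available for extraction."""
    return urllib.request.Request(f"{BASE_URL}/api/test/list", method="GET")


def extract_text_request(file_name: str) -> urllib.request.Request:
    """POST /api/test/extract-text with a JSON body naming the file."""
    # "fileName" is a hypothetical field name; check the README for the real one.
    body = json.dumps({"fileName": file_name}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/test/extract-text",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request to a running server; the response is assumed to be JSON."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With the server started, `send(list_request())` would return the file listing and `send(extract_text_request("sample.pdf"))` the extracted text, provided the assumed field name matches.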

Prerequisites

Java: JDK 23+ (tested with OpenJDK 24.0.2).
Maven: Version 3.6+ for dependency management and building.
Supported File Formats: PDF, DOCX, TXT, HTML, images, etc., handled by Apache Tika 2.9.1 and PDFBox 2.0.29.
Optional: IntelliJ IDEA for development (output indicates IntelliJ usage, but any IDE or CLI works).
Local Files: Place files in files-to-extract directory; no internet required.

Installation

Clone the Repository (if hosted):

```
git clone https://github.com/RayenMalouche/MCP-PDF-Extractor-server.git
cd MCP-PDF-Extractor-server
```

Create the Files Directory:

The server reads from files-to-extract (configurable).
Create it:
```
mkdir files-to-extract
```
Add sample files (e.g., sample.pdf, document.docx) for testing.

Build the Project:

Use Maven to compile and resolve dependencies:
```
mvn clean install
```
This produces an executable JAR in target/.

Configuration

Settings are defined in src/main/resources/application.properties:

```
# Tika MCP Extractor Server Configuration
spring.application.name=TikaExtractorMCPServer

# Server Configuration
server.port=45453

# Tika Configuration
tika.max.string.length=-1
tika.detect.language=false

# File Processing Configuration
files.directory=files-to-extract
files.max.size=52428800

# Logging Configuration
logging.level.org.apache.tika=DEBUG
logging.level.org.apache.pdfbox=DEBUG
```

spring.application.name: Application name for Spring Boot.
server.port: HTTP port (default: 45453).
tika.max.string.length: Sets max string length for Tika (-1 = unlimited).
tika.detect.language: Disables langu

... View full README on GitHub
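
To make the settings concrete, here is a minimal sketch that reads key=value lines like those in application.properties and checks a file against files.max.size. It assumes the value is a byte count (52428800 bytes is 50 MiB) and handles only simple key=value lines, not the full Java properties syntax. The function names are illustrative and are not taken from the server's code.

```python
from pathlib import Path


def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def within_size_limit(path: Path, settings: dict) -> bool:
    """Check a file against files.max.size, assumed here to be in bytes."""
    limit = int(settings["files.max.size"])
    return path.stat().st_size <= limit


SAMPLE = """\
server.port=45453
files.directory=files-to-extract
files.max.size=52428800
"""

settings = parse_properties(SAMPLE)
max_mib = int(settings["files.max.size"]) / (1024 * 1024)  # 50.0
```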
