Llm Server Docs

Name: Llm Server Docs
Author: varunvasudeva1

vllm · by varunvasudeva1

End-to-end documentation to set up your own local & fully private LLM server on Debian. Equipped with chat, web search, RAG, model management, MCP servers, image generation, and TTS.

775 GitHub stars · 0 tools · Links: GitHub, PyPI

4 open CVEs

MIT license

Maintained

Last commit 87d ago

Works with most clients

Transport: stdio, sse, http

Grade F

AI / ML Education

Step 1

Install in your client

Config is the same across clients — only the file and path differ.

Supported in Claude Desktop · stdio, sse, http · Node 18+

Paste into ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "llm-server-docs": {
      "args": [
        "vllm"
      ],
      "command": "uvx"
    }
  }
}
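
As a sketch of the paste step above, the snippet below merges the `llm-server-docs` entry into an existing config file instead of overwriting it, so servers that are already configured are kept. The helper name `add_server` is hypothetical; the entry itself is copied from the config shown above.

```python
import json
from pathlib import Path

# Entry copied from the config shown above.
ENTRY = {"llm-server-docs": {"command": "uvx", "args": ["vllm"]}}


def add_server(config_path: Path, entry: dict = ENTRY) -> dict:
    """Merge `entry` into the "mcpServers" object of a JSON config file.

    Hypothetical helper: creates the file if it is missing or empty and
    keeps any servers that are already listed.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    # Add or overwrite only the entries we were given.
    config.setdefault("mcpServers", {}).update(entry)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For Claude Desktop on macOS, the path from the text above would be passed as `Path.home() / "Library/Application Support/Claude/claude_desktop_config.json"`.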

Are you the author?

Add this badge to your README to show your security score and help users find safe servers.

Embed in your README:

[![MCPpedia Score](https://mcppedia.org/api/badge/llm-server-docs)](https://mcppedia.org/s/llm-server-docs)

Test This Server

Run this in your terminal to verify the server starts.

uvx 'vllm' 2>&1 | head -1 && echo "✓ Server started successfully"
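
In the command above, the success message is printed whenever `head` exits successfully, regardless of whether `uvx` itself started. A minimal sketch of a stricter check follows, assuming only that a working command writes at least one line to stdout or stderr. The function name `first_output_line` and the default timeout are illustrative, not part of any tool named on this page.

```python
import subprocess
from typing import Optional, Sequence


def first_output_line(cmd: Sequence[str], timeout: float = 30.0) -> Optional[str]:
    """Run `cmd` and return the first line it prints, or None if it prints nothing.

    stderr is merged into stdout (the equivalent of the shell's 2>&1).
    If the process is still running after `timeout` seconds it is killed,
    and whatever it printed up to that point is used.
    """
    proc = subprocess.Popen(
        list(cmd),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Long-running servers never exit on their own; stop and collect output.
        proc.kill()
        out, _ = proc.communicate()
    lines = out.decode("utf-8", errors="replace").splitlines()
    return lines[0] if lines else None
```

Calling `first_output_line(["uvx", "vllm"])` returns the first line for inspection; a `FileNotFoundError` means `uvx` is not installed or not on the `PATH`.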

Scored, not listed

Why this score

The score is built from five weighted categories, each backed by its own evidence.

Score breakdown

49/100 across 5 weighted dimensions

[Score chart: scale 0 to 100, with a −51 marker; categories: Security, Maintenance, Efficiency, Documentation, Compatibility]

Categories

Security

OSV.dev

4 open · 41 fixed

CVE-2026-44223 · low · fixed · CVSS 3.1

PYSEC-2026-145

vLLM is an inference and serving engine for large language models (LLMs). From 0.18.0 to before 0.20.0, the extract_hidden_states speculative decoding proposer in vLLM returns a tensor with an incorrect shape after the first decode step, causing a RuntimeError that crashes the EngineCore process. The crash is triggered when any request in the batch uses sampling penalty parameters (repetition_penalty, frequency_penalty, or presence_penalty). A single request with a penalty parameter (e.g., "repetition

Affected: >= 0.18.0 · Fixed in 0.20.0

CVE-2026-44222 · low · fixed · CVSS 3.1

vLLM Vulnerable to Remote DoS via Special-Token Placeholders

Summary: This report explains a Token Injection vulnerability in vLLM’s multimodal processing. Unauthenticated, text-only prompts that spell special tokens are interpreted as control. Image and video placeholder sequences supplied without matching data cause vLLM to index into empty grids during input-position computation, raising an unhandled IndexError and terminating the worker or degrading availability. Multimodal paths that rely on `image_grid_thw`/`video_grid_thw` are affected. Severity:

Affected: >= 0.6.1 · Fixed in 0.20.0

CVE-2026-7141 · low · fixed · CVSS 3.1

vLLM makes Use of Uninitialized Resource

A vulnerability was found in vLLM up to 0.19.0. The affected element is the function has_mamba_layers of the file vllm/v1/kv_cache_interface.py of the component KV Block Handler. Performing a manipulation results in uninitialized resource. It is possible to initiate the attack remotely. The attack is considered to have high complexity. The exploitability is described as difficult. The exploit has been made public and could be used. The patch is named 1ad67864c0c20f167929e64c875f5c28e1aad9fd. To

Affected: >= 0 · Fixed in 0.19.1

CVE-2026-34755 · low · fixed · CVSS 3.1

PYSEC-2026-144

vLLM is an inference and serving engine for large language models (LLMs). From 0.7.0 to before 0.19.0, the VideoMediaIO.load_base64() method at vllm/multimodal/media/video.py splits video/jpeg data URLs by comma to extract individual JPEG frames, but does not enforce a frame count limit. The num_frames parameter (default: 32), which is enforced by the load_bytes() code path, is completely bypassed in the video/jpeg base64 path. An attacker can send a single API request containing thousands of co

Affected: >= 0.7.0 · Fixed in 0.19.0

CVE-2026-34753 · low · fixed · CVSS 3.1

vLLM: Server-Side Request Forgery (SSRF) in `download_bytes_from_url`

Summary: A Server Side Request Forgery (SSRF) vulnerability in `download_bytes_from_url` allows any actor who can control batch input JSON to make the vLLM batch runner issue arbitrary HTTP/HTTPS requests from the server, without any URL validation or domain restrictions. This can be used to target internal services (e.g. cloud metadata endpoints or internal HTTP APIs) reachable from the vLLM host. Details (vulnerable component): The vulnerable logic is in the batch runner

Affected: >= 0.16.0 · Fixed in 0.19.0
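
The `Affected` and `Fixed in` bounds listed above can be checked against a specific vLLM version. The sketch below assumes plain dotted numeric versions such as `0.19.1`; pre-release and local suffixes are not handled. The table and function names are illustrative, and the bounds are copied from the entries above.

```python
from typing import List, Tuple

# (first affected version, first fixed version), copied from the entries above.
ADVISORIES = {
    "CVE-2026-44223": ("0.18.0", "0.20.0"),
    "CVE-2026-44222": ("0.6.1", "0.20.0"),
    "CVE-2026-7141": ("0", "0.19.1"),
    "CVE-2026-34755": ("0.7.0", "0.19.0"),
    "CVE-2026-34753": ("0.16.0", "0.19.0"),
}


def parse(version: str) -> Tuple[int, ...]:
    """Turn '0.19' or '0.19.1' into a comparable tuple, padded to three parts."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def affected_by(version: str) -> List[str]:
    """Return the advisory IDs whose range [affected, fixed) contains `version`."""
    v = parse(version)
    return [
        cve
        for cve, (introduced, fixed) in ADVISORIES.items()
        if parse(introduced) <= v < parse(fixed)
    ]
```

Under these bounds, `affected_by("0.20.0")` returns an empty list, since every entry above is marked as fixed in 0.20.0 or earlier.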

Frequently Asked Questions

Is Llm Server Docs safe to use? Llm Server Docs has 5 known CVEs tracked by MCPpedia. You can verify these on OSV.dev by searching for "vllm". Review the Security section above for details before installing.
How do I install Llm Server Docs? Llm Server Docs supports copy-paste install configs on its MCPpedia page for Claude Desktop, Cursor, and Claude Code. Scroll to the Quick Install section and select your client.
What AI clients work with Llm Server Docs? Llm Server Docs is compatible with claude-desktop, cursor, and claude-code. It uses the stdio, sse, and http transports.
Is Llm Server Docs actively maintained? Llm Server Docs is listed as maintained: the last commit was 87 days ago. It has 775 GitHub stars.
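
The OSV.dev lookup mentioned above can also be scripted. This is a minimal sketch using OSV's public `POST /v1/query` endpoint and only the standard library. The helper names are illustrative, the `PyPI` ecosystem is an assumption based on the package being listed on PyPI, and paginated responses are not followed.

```python
import json
import urllib.request
from typing import List, Optional

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_query(name: str, ecosystem: str = "PyPI", version: Optional[str] = None) -> dict:
    """Build the JSON body for an OSV package query."""
    query = {"package": {"name": name, "ecosystem": ecosystem}}
    if version is not None:
        # With a version, OSV returns only advisories affecting that version.
        query["version"] = version
    return query


def query_osv(name: str, ecosystem: str = "PyPI",
              version: Optional[str] = None, timeout: float = 30.0) -> List[str]:
    """Return the advisory IDs OSV reports for a package (first page only)."""
    body = json.dumps(build_query(name, ecosystem, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # An empty result is returned as an object without a "vulns" key.
    return [vuln["id"] for vuln in payload.get("vulns", [])]
```

Calling `query_osv("vllm", version="0.19.0")` would list the advisories OSV currently records for that version; the result depends on the live database.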

Similar servers

Others in ai-ml / education

Sequential Thinking MCP Server · 98

Dynamic problem-solving through sequential thought chains

86.3k 1

Memory MCP Server · 98

Persistent memory using a knowledge graph

86.3k 5

Antigravity Workspace Template · 96

Workspace template + MCP server for Claude Code, Codex CLI, Cursor & Windsurf. Multi-agent knowledge engine (ag-refresh / ag-ask) that turns any codebase into a queryable AI assistant.

1.2k 2

Modelcontextprotocol · 96

The official MCP server implementation for the Perplexity API Platform

2.2k 4
