Rust MCP server for denoised web search — fetch, clean, and rerank web content for AI agents.
Config is the same across clients — only the file and path differ.
{
  "mcpServers": {
    "webshift": {
      "args": [
        "--default-backend",
        "searxng"
      ],
      "command": "mcp-webshift"
    }
  }
}
WebShift is a Rust library and MCP server that shifts noisy web pages into clean, right-sized text for LLM consumption.
Raw HTML is mostly junk: scripts, ads, navigation menus, cookie banners, tracking pixels. Feeding it directly to an LLM floods the context window with tens of thousands of useless tokens and leaves no room for reasoning. WebShift strips all that noise, sterilizes the text, and enforces strict size budgets so the model receives only the content that matters.
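To make the "sterilize and budget" step concrete, here is a minimal standard-library sketch of the idea. It is not WebShift's implementation: the function names `is_bidi_control`, `sanitize`, and `enforce_budget` are hypothetical, and the budget is counted in characters purely as an assumption for illustration.

```rust
/// Returns true for Unicode bidirectional control characters
/// (LRM, RLM, ALM, the embedding/override range, and the isolate range).
fn is_bidi_control(c: char) -> bool {
    matches!(
        c,
        '\u{200E}' | '\u{200F}' | '\u{061C}' | '\u{202A}'..='\u{202E}' | '\u{2066}'..='\u{2069}'
    )
}

/// Drops BiDi control characters, then collapses every whitespace run
/// (spaces, tabs, newlines) into a single space and trims the ends.
fn sanitize(text: &str) -> String {
    let filtered: String = text.chars().filter(|c| !is_bidi_control(*c)).collect();
    filtered.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Truncates to at most `max_chars` characters without splitting a code point.
/// (Hypothetical budget unit: characters, not bytes or tokens.)
fn enforce_budget(text: &str, max_chars: usize) -> &str {
    match text.char_indices().nth(max_chars) {
        Some((byte_idx, _)) => &text[..byte_idx],
        None => text,
    }
}
```

Calling `enforce_budget(&sanitize(raw), 4000)` would chain the two steps. Cutting on a character boundary avoids the panic that slicing a `&str` at an arbitrary byte offset can cause.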
Depending on the features you enable, WebShift can be four things:
| Use case | Crate | Feature flags | What it does |
|---|---|---|---|
| HTML denoiser | webshift | default-features = false | clean() — pure Rust HTML-to-text pipeline. Strips noise elements, sterilizes Unicode/BiDi, collapses whitespace. Zero network, zero config. Drop into any Rust project that processes web content for LLMs. |
| HTML text rewriter | webshift | features = ["text-map"] | extract_text_nodes() + replace_text_nodes() — extract individual text nodes from HTML, manipulate them (translate, rewrite, simplify), and rebuild the HTML with structure intact. Tags, attributes, and links are never touched. |
| Web content client | webshift | default or features = ["llm"] | fetch() + query() — streaming HTTP fetcher with size caps, 8 search backends, BM25 reranking, optional LLM query expansion and summarization. Full pipeline from search query to structured results. |
| MCP server | webshift-mcp | all features | Native binary (mcp-webshift) that exposes webshift_query, webshift_fetch, and webshift_onboarding over MCP stdio. Single static binary, zero runtime dependencies. |
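The table mentions BM25 reranking without showing what it involves. The sketch below is a generic Okapi BM25 scorer using only the standard library. It does not reproduce WebShift's code: the names `tokenize` and `bm25_rank`, the tokenizer, and the parameters `K1 = 1.2` and `B = 0.75` are illustrative assumptions.

```rust
use std::collections::HashMap;

/// Lowercases the text and splits it on non-alphanumeric characters.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Scores each document against the query with Okapi BM25 and returns
/// (document index, score) pairs sorted by descending score.
fn bm25_rank(query: &str, docs: &[&str]) -> Vec<(usize, f64)> {
    const K1: f64 = 1.2; // term-frequency saturation (assumed value)
    const B: f64 = 0.75; // length normalization strength (assumed value)

    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    let n = tokenized.len() as f64;
    let total_len: usize = tokenized.iter().map(|d| d.len()).sum();
    let avg_len = if tokenized.is_empty() { 0.0 } else { total_len as f64 / n };

    // Document frequency: how many documents contain each term.
    let mut df: HashMap<&str, usize> = HashMap::new();
    for doc in &tokenized {
        let mut seen: Vec<&str> = doc.iter().map(|s| s.as_str()).collect();
        seen.sort_unstable();
        seen.dedup();
        for term in seen {
            *df.entry(term).or_insert(0) += 1;
        }
    }

    let query_terms = tokenize(query);
    let mut scored: Vec<(usize, f64)> = tokenized
        .iter()
        .enumerate()
        .map(|(i, doc)| {
            let len = doc.len() as f64;
            let score = query_terms
                .iter()
                .map(|q| {
                    let tf = doc.iter().filter(|t| *t == q).count() as f64;
                    if tf == 0.0 {
                        return 0.0; // term absent: contributes nothing
                    }
                    let n_q = *df.get(q.as_str()).unwrap_or(&0) as f64;
                    let idf = ((n - n_q + 0.5) / (n_q + 0.5) + 1.0).ln();
                    idf * tf * (K1 + 1.0) / (tf + K1 * (1.0 - B + B * len / avg_len))
                })
                .sum::<f64>();
            (i, score)
        })
        .collect();

    // Stable sort: documents with equal scores keep their original order.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

Given fetched page texts in `docs`, `bm25_rank("rust compiler", &docs)` returns indices ordered from most to least relevant. A document that matches no query term scores exactly `0.0`.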
text-map does preserve the DOM structure for text rewriting use cases.)

Question
|
+- (optional) LLM query expansion -> multiple search variants
|
+- Search via backend (SearXNG, Brave, Tavily, Exa, SerpAPI, Google, Bing, HTTP)
|
+- Deduplicate + filter binary URLs
|
+- Streaming fetch with per-page size cap
|
+- HTML cleaning
... [View full README on GitHub](https://github.com/x-monk/webshift#readme)
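The "Deduplicate + filter binary URLs" stage in the pipeline above could look roughly like the following. This is a guess at the idea, not WebShift's actual logic: the function names, the extension list, and the trailing-slash normalization are all assumptions made for illustration.

```rust
use std::collections::HashSet;

/// File extensions treated as binary (illustrative list, not WebShift's).
const BINARY_EXTENSIONS: &[&str] = &["pdf", "zip", "png", "jpg", "jpeg", "gif", "mp4", "exe"];

/// Returns true when the last path segment ends in a known binary extension.
/// A production version would parse the URL properly instead of splitting strings.
fn is_binary_url(url: &str) -> bool {
    // Ignore the query string and fragment before inspecting the path.
    let path = url.split(|c: char| c == '?' || c == '#').next().unwrap_or(url);
    let last_segment = path.rsplit('/').next().unwrap_or("");
    match last_segment.rsplit_once('.') {
        Some((_, ext)) => BINARY_EXTENSIONS.contains(&ext.to_ascii_lowercase().as_str()),
        None => false,
    }
}

/// Drops binary targets, strips trailing slashes, and keeps only the
/// first occurrence of each resulting URL, preserving input order.
fn dedup_and_filter(urls: &[&str]) -> Vec<String> {
    let mut seen = HashSet::new();
    urls.iter()
        .filter(|u| !is_binary_url(u))
        .map(|u| u.trim_end_matches('/').to_string())
        .filter(|u| seen.insert(u.clone()))
        .collect()
}
```

Filtering before fetching means the streaming fetcher never spends its per-page size cap on content that cannot be turned into text.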