Safe, self-hosted web grounding for AI agents and crawlers over a stealth browser
Config is the same across clients — only the file and path differ.
{
"mcpServers": {
"io-github-dmytrome-groundhog-mcp": {
"command": "<see-readme>",
"args": []
}
}
}Are you the author?
Add this badge to your README to show your security score and help users find safe servers.
Safe, self-hosted web grounding for AI agents and crawlers over a stealth browser
No automated test available for this server. Check the GitHub README for setup instructions.
Five weighted categories — click any category to see the underlying evidence.
No known CVEs.
No package registry to scan.
This server is missing a description. Tools and install config are also missing.If you've used it, help the community.
Add informationBe the first to review
Have you used this server?
Share your experience — it helps other developers decide.
Sign in to write a review.
Others in browser / ai-ml
Persistent memory using a knowledge graph
Dynamic problem-solving through sequential thought chains
An autonomous agent that conducts deep research on any data using any LLM providers
🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients.
MCP Security Weekly
Get CVE alerts and security updates for io.github.dmytrome/groundhog-mcp and similar servers.
Start a conversation
Ask a question, share a tip, or report an issue.
Sign in to join the discussion.
Safe, self-hosted web grounding for AI agents and crawlers. Groundhog is an MCP server that fetches live web pages through a real, stealth-patched Chrome (over CDP) and returns clean Markdown with provenance — without the SSRF holes of plain fetchers and without getting blocked like plain HTTP clients.
agent / crawler ──MCP──▶ Groundhog (read_url) ──CDP──▶ stealth Chrome ──▶ the web
Prerequisite: the stealth browser must be running. The MCP server is a thin client that drives Chrome over CDP, so start the browser first. If it isn't reachable,
read_urlreturns a plain-language message on how to start it, and thestatustool reports reachability. SetGROUNDHOG_AUTO_START_BROWSER=trueto have Groundhog rundocker compose up -dfor you (requires Docker).
docker compose up --build -d
curl -s http://localhost:9222/json/version # CDP is live
Claude Desktop / Cursor / Windsurf (claude_desktop_config.json or equivalent):
{
"mcpServers": {
"groundhog": {
"command": "uvx",
"args": ["groundhog-mcp"],
"env": { "CDP_URL": "http://127.0.0.1:9222" }
}
}
}
uvx fetches groundhog-mcp from PyPI on first run. To run from source instead:
cd mcp && uv sync && uv run groundhog-mcp.
read_url(url, format="markdown", max_tokens=None, query=None, include_hidden=False)Fetches a page and returns clean content plus provenance.
| Key | Meaning |
|---|---|
markdown | Extracted content (article-first, falls back to full text); format may be markdown or text |
title | Page title |
url | The URL you asked for |
final_url | The URL after redirects (re-checked against the SSRF guard) |
fetched_at | UTC ISO-8601 timestamp |
truncated | Whether the content was cut to fit the token budget |
threats | Hidden-text signals detected (signal type + excerpt per node); empty list when none found |
matches | When query is set: ranked passages with heading, offset, and score for citation |
provenance | Content hash, canonical URL, language, word count, and author/date metadata when present |
Because Groundhog renders a real DOM, it can evaluate computed styles. Text invisible to
humans — display:none, visibility:hidden, opacity ≤ 0.05, font-size < 4 px, and
zero-size elements — is stripped by default and each occurrence reported in threats
with its signal type and a short excerpt. Pass include_hidden=True to keep the stripped
text in the output; threats is still populated so you know it was there. Pass query to
replace blunt head-truncation with relevance-ranked passage selection: content is chunked
on markdown structure, ranked by lexical (BM25) relevance, and the top passages within the
token budget are returned; matches gives each passage's heading, character offset, and
score for downstream citation. Ranking runs on sanitized content, so hidden-text injection
payloads cannot influence which passages surface.
status()Reports whether Groundhog can reach the stealth browser. Returns browser_reachable