Shared MCP gateway with schema deferral, response shielding, and HTTP daemon mode. Replaces 12+ MCP servers with 6 tools — saves ~99% context tokens and ~2.7GB RAM. npm: harshal-mcp-proxy
Config is the same across clients — only the file and path differ.
{
  "mcpServers": {
    "harshal-mcp-proxy": {
      "url": "http://localhost:8765/mcp",
      "description": "Shared MCP gateway — search, describe, invoke access to all tools"
    }
  }
}
Custom MCP gateway that sits between your AI clients (pi, VS Code, opencode) and your upstream MCP servers. Combines schema deferral (from mcp-gateway) with response shielding (from tldr) in a single TypeScript server.
Instead of your AI client loading 40-70K tokens of tool schemas from 12+ MCP servers at startup, it loads 6 tool definitions from this proxy (~375 tokens). The proxy then provides:
Schema deferral — Tools are discovered via BM25 search (gateway.search), full
schemas loaded on demand (gateway.describe), and executed through the proxy
(gateway.invoke). The model never sees schemas it doesn't need.
Response shielding — Every tool response passes through a truncation engine before reaching the model context:
gateway.get_result with pagination, field projection, and text search
Shared process elimination — In daemon mode, ONE proxy instance serves ALL your AI clients (pi, VS Code, etc.), eliminating duplicate MCP server processes.
On-demand lazy loading — MCP server processes are no longer started at boot. Instead, tool schemas are loaded from disk-based catalog snapshots at startup. The actual server process is spawned only when you first invoke a tool on that server. After 5 minutes of inactivity, the idle monitor auto-disconnects it — freeing RAM and CPU without losing searchability.
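The lazy-loading lifecycle described above (spawn on first invoke, disconnect after 5 idle minutes) can be sketched in TypeScript roughly as follows. This is a minimal illustration under stated assumptions, not the proxy's actual code: `LazyServerPool`, `ServerHandle`, `SpawnFn`, and `sweep` are hypothetical names, and the current time is passed in explicitly so the idle window is easy to follow.

```typescript
// Hypothetical sketch of on-demand spawning with an idle timeout.
// None of these names are taken from harshal-mcp-proxy itself.

interface ServerHandle {
  call(tool: string, args: unknown): string;
  close(): void;
}

type SpawnFn = (serverName: string) => ServerHandle;

const IDLE_LIMIT_MS = 5 * 60 * 1000; // 5 minutes, as stated in the text

class LazyServerPool {
  private readonly spawn: SpawnFn;
  private live = new Map<string, { handle: ServerHandle; lastUsed: number }>();

  constructor(spawn: SpawnFn) {
    this.spawn = spawn;
  }

  // Spawn the upstream server only on the first invoke, then reuse it.
  invoke(server: string, tool: string, args: unknown, now: number): string {
    let entry = this.live.get(server);
    if (!entry) {
      entry = { handle: this.spawn(server), lastUsed: now };
      this.live.set(server, entry);
    }
    entry.lastUsed = now;
    return entry.handle.call(tool, args);
  }

  // Disconnect every server that has been idle for longer than the limit
  // and return the names of the servers that were closed.
  sweep(now: number): string[] {
    const closed: string[] = [];
    this.live.forEach((entry, name) => {
      if (now - entry.lastUsed > IDLE_LIMIT_MS) {
        entry.handle.close();
        this.live.delete(name);
        closed.push(name);
      }
    });
    return closed;
  }

  isRunning(server: string): boolean {
    return this.live.has(server);
  }
}
```

In a design like this, the searchable catalog would live outside the pool, which is why closing a process does not affect tool discovery; a real idle monitor would presumably call something like `sweep(Date.now())` on a timer.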
Every AI coding agent (pi session, VS Code MCP extension) traditionally spawns its own complete set of MCP server processes. With 12+ MCP servers in your config, this means:
┌──────────────┐     ┌──────────────────┐     ┌──────────────────┐
│ pi session 1 │────►│ harshal-mcp-     │────►│ 12 MCP servers   │
│              │     │ proxy (stdio)    │     │ (1.3 GB)         │
├──────────────┤     └──────────────────┘     └──────────────────┘
│ pi session 2 │────►┌──────────────────┐     ┌──────────────────┐
│              │     │ harshal-mcp-     │────►│ 12 MCP servers   │
├──────────────┤     │ proxy (stdio)    │     │ (1.3 GB)         │
│ VS Code      │────►└──────────────────┘     └──────────────────┘
└──────────────┘                              ┌──────────────────┐
                                              │ 12 MCP servers   │
                                              │ (1.3 GB)         │
                                              └──────────────────┘
3 sets × 12 servers = 36 processes and ~4 GB of RAM, most of it duplicated.
With the shared daemon:
┌──────────────┐     ┌────────────────────────┐     ┌──────────────────┐
│ pi session 1 │────►│                        │     │                  │
├──────────────┤     │ harshal-mcp-proxy      │────►│ 12 MCP servers   │
│ pi session 2 │────►│ daemon (HTTP port      │     │ (1.3 GB)         │
├──────────────┤     │ 8765)                  │     │ ONE SET          │
│ VS Code      │────►│                        │     │                  │
└──────────────┘     └────────────────────────┘     └──────────────────┘
3 clients × 1 daemon = ~10 MCP processes, saves ~2.7 GB RAM.
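The RAM figures quoted with the two diagrams can be approximated with simple arithmetic, sketched below. The three clients and the 1.3 GB per server set come from the diagrams; the function name is illustrative, and the estimate ignores the memory used by the proxy processes themselves.

```typescript
// Rough estimate of RAM used by MCP server sets, with and without sharing.
function ramEstimate(clients: number, gbPerServerSet: number) {
  const perClientGb = clients * gbPerServerSet; // every client runs its own set
  const sharedGb = gbPerServerSet;              // one set behind the daemon
  const round1 = (x: number) => Math.round(x * 10) / 10;
  return {
    perClientGb: round1(perClientGb),
    sharedGb: round1(sharedGb),
    savedGb: round1(perClientGb - sharedGb),
  };
}

const estimate = ramEstimate(3, 1.3);
// estimate.perClientGb is 3.9 and estimate.savedGb is 2.6
```

Under these assumptions the saving comes out at about 2.6 GB, in the same range as the ~2.7 GB figure above.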
Stdio mode: each client spawns its own proxy instance over stdio, so every instance serves a single consumer.
node dist/index.js
Daemon mode: one proxy daemon serves all clients over HTTP, using the MCP Streamable HTTP transport (JSON-RPC 2.0).
node dist/index.js --daemon
# or with custom port:
node dist/index.js --port 8765
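Because daemon mode speaks JSON-RPC 2.0 over HTTP, a client's call to one of the gateway tools ends up as a POST request along the lines of the one built below. The sketch assumes the standard MCP `tools/call` method and the endpoint from the client config shown earlier; the `query` argument is an assumed shape for gateway.search, and `buildToolCall` is a hypothetical helper. A real session would also perform the MCP `initialize` handshake first, which is omitted here.

```typescript
// Sketch: build the HTTP request a client might send to the daemon.
// `tools/call` is the standard MCP method; the `query` argument is an
// assumption about gateway.search, not taken from the project's docs.

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: { name: string; arguments: Record<string, unknown> };
}

const ENDPOINT = "http://localhost:8765/mcp"; // from the client config

function buildToolCall(id: number, tool: string, args: Record<string, unknown>) {
  const body: JsonRpcRequest = {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
  return {
    url: ENDPOINT,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Streamable HTTP servers may answer with plain JSON or an SSE stream.
      Accept: "application/json, text/event-stream",
    },
    body: JSON.stringify(body),
  };
}

const req = buildToolCall(1, "gateway.search", { query: "create ticket" });
```

The same helper would cover gateway.describe and gateway.invoke by changing the tool name and arguments, so the search, describe, invoke flow is three such requests in sequence.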