Shared MCP gateway with schema deferral, response shielding, and HTTP daemon mode. Replaces 12+ MCP servers with 6 tools — saves ~99% context tokens and ~2.7GB RAM. npm: harshal-mcp-proxy
Config is the same across clients — only the file and path differ.
{
"mcpServers": {
"harshal-mcp-proxy": {
"url": "http://localhost:8765/mcp",
"description": "Shared MCP gateway — search, describe, invoke access to all tools"
}
}
}
Custom MCP gateway that sits between your AI clients (pi, VS Code, opencode) and your upstream MCP servers. Combines schema deferral (from mcp-gateway) with response shielding (from tldr) in a single TypeScript server.
This server supports HTTP transport.
No known CVEs.
Checked harshal-mcp-proxy against OSV.dev.
Custom MCP gateway that cuts costs: it loads 6 gateway tools (~375 tokens) instead of 40-70K tokens of upstream MCP server schemas, a ~99.3% reduction, and runs one shared daemon instead of a server farm per agent session, saving ~2.7 GB RAM.
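The token figures quoted above can be checked with a few lines of arithmetic. This is a minimal sketch: the 375-token and 40-70K figures come from the text, while the 55K midpoint is an assumed reference point chosen only because it reproduces the quoted ~99.3%.

```typescript
// Percentage reduction when `before` tokens of schemas are replaced by `after` tokens.
function reductionPercent(before: number, after: number): number {
  return (1 - after / before) * 100;
}

const gatewayTokens = 375;                 // 6 gateway tool definitions (from the text)
const upstreamLow = 40000;                 // lower bound of upstream schema tokens (from the text)
const upstreamHigh = 70000;                // upper bound of upstream schema tokens (from the text)
const upstreamMid = (upstreamLow + upstreamHigh) / 2; // 55,000: assumed reference point

for (const before of [upstreamLow, upstreamMid, upstreamHigh]) {
  const pct = reductionPercent(before, gatewayTokens).toFixed(1);
  console.log(`${before} -> ${gatewayTokens} tokens: ${pct}% reduction`);
}
```

Across the whole quoted range the reduction stays at roughly 99%, so the headline number is not sensitive to where in the 40-70K range a given setup falls.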
Instead of your AI client loading 40-70K tokens of tool schemas from 12+ MCP servers at startup, it loads 6 tool definitions from this proxy (~375 tokens). The proxy then:
Schema deferral — Tools are discovered via BM25 search (gateway.search), full schemas are loaded on demand (gateway.describe), and tools are executed through the proxy (gateway.invoke). The model never sees schemas it doesn't need.
Response shielding — Every tool response passes through a truncation engine before reaching the model context:
gateway.get_result with pagination, field projection, and text search
Shared process elimination — In daemon mode, ONE proxy instance serves ALL your AI clients (pi, VS Code, etc.), eliminating duplicate MCP server processes.
On-demand lazy loading — MCP server processes are no longer started at boot. Instead, tool schemas are loaded from disk-based catalog snapshots at startup. The actual server process is spawned only when you first invoke a tool on that server. After 5 minutes of inactivity, the idle monitor auto-disconnects it — freeing RAM and CPU without losing searchability.
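The lazy-loading behaviour in the last item can be sketched as a small wrapper that spawns a server on first use and drops it after an idle period. This is an illustration under stated assumptions, not the proxy's actual code: `Connection`, `LazyServer`, and `reapIfIdle` are hypothetical names, and only the 5-minute default comes from the text.

```typescript
// Hypothetical handle for a running upstream MCP server process.
interface Connection {
  call(tool: string, args: unknown): Promise<unknown>;
  close(): void;
}

// Spawns the upstream server on first invoke and disconnects it when idle.
// Note: concurrent first calls are not deduplicated in this sketch.
class LazyServer {
  private conn: Connection | null = null;
  private lastUsed = 0;
  private readonly spawn: () => Promise<Connection>;
  private readonly idleMs: number;
  private readonly now: () => number;

  constructor(
    spawn: () => Promise<Connection>,       // starts the real process (assumed helper)
    idleMs: number = 5 * 60 * 1000,         // 5 minutes, as stated in the text
    now: () => number = () => Date.now(),   // injectable clock, useful in tests
  ) {
    this.spawn = spawn;
    this.idleMs = idleMs;
    this.now = now;
  }

  get connected(): boolean {
    return this.conn !== null;
  }

  // Spawn on first use, then reuse the live connection.
  async invoke(tool: string, args: unknown): Promise<unknown> {
    if (this.conn === null) {
      this.conn = await this.spawn();
    }
    this.lastUsed = this.now();
    return this.conn.call(tool, args);
  }

  // Meant to be called periodically by an idle monitor.
  // Returns true if the connection was closed on this call.
  reapIfIdle(): boolean {
    if (this.conn !== null && this.now() - this.lastUsed >= this.idleMs) {
      this.conn.close();
      this.conn = null;
      return true;
    }
    return false;
  }
}
```

Because tool search in this design reads from catalog snapshots rather than from the live process, closing an idle connection only costs a respawn on the next invoke.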
Every AI coding agent (pi session, VS Code MCP extension) traditionally spawns its own complete set of MCP server processes. With 12+ MCP servers in your config, this means:
┌──────────────┐     ┌──────────────────┐     ┌──────────────────┐
│ pi session 1 │────►│ harshal-mcp-     │────►│ 12 MCP servers   │
│              │     │ proxy (stdio)    │     │ (1.3 GB)         │
├──────────────┤     └──────────────────┘     └──────────────────┘
│ pi session 2 │────►┌──────────────────┐     ┌──────────────────┐
│              │     │ harshal-mcp-     │────►│ 12 MCP servers   │
├──────────────┤     │ proxy (stdio)    │     │ (1.3 GB)         │
│ VS Code      │────►└──────────────────┘     └──────────────────┘
└──────────────┘                              ┌──────────────────┐
                                              │ 12 MCP servers   │
                                              │ (1.3 GB)         │
                                              └──────────────────┘
3 sets × 12 servers = 36 processes and ~4 GB of RAM in total, most of it duplicated.
With the shared daemon:
┌──────────────┐     ┌────────────────────────┐     ┌──────────────────┐
│ pi session 1 │────►│                        │     │                  │
├──────────────┤     │ harshal-mcp-proxy      │────►│ 12 MCP servers   │
│ pi session 2 │────►│ daemon (HTTP port      │     │ (1.3 GB)         │
├──────────────┤     │ 8765)                  │     │ ONE SET          │
│ VS Code      │────►│                        │     │                  │
└──────────────┘     └────────────────────────┘     └──────────────────┘
3 clients × 1 daemon = ~10 MCP processes, saves ~2.7 GB RAM.
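A back-of-the-envelope check of the RAM comparison, using only the 1.3 GB per-set figure from the diagrams and the three clients shown. The function name `ramSavings` is hypothetical, and the proxy's own memory use is ignored here, which is an assumption.

```typescript
interface RamComparison {
  perClientGb: number; // every client runs its own set of MCP servers
  sharedGb: number;    // one set behind the shared daemon
  savedGb: number;     // difference between the two
}

// Compare "one server set per client" with "one shared set".
function ramSavings(clients: number, perSetGb: number): RamComparison {
  const perClientGb = clients * perSetGb;
  const sharedGb = perSetGb;
  return { perClientGb, sharedGb, savedGb: perClientGb - sharedGb };
}

const r = ramSavings(3, 1.3); // 3 clients, 1.3 GB per set (figures from the diagrams)
console.log(`per-client sets: ${r.perClientGb.toFixed(1)} GB`);
console.log(`shared daemon:   ${r.sharedGb.toFixed(1)} GB`);
console.log(`saved:           ${r.savedGb.toFixed(1)} GB`);
```

With these rounded inputs the saving comes out at about 2.6 GB, close to the ~2.7 GB quoted above; the small gap presumably reflects rounding in the 1.3 GB per-set figure.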
Each client spawns its own proxy instance via stdio. One consumer at a time.
node dist/index.js
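For readers unfamiliar with stdio mode, here is a rough sketch of what a client writes to the proxy's stdin after spawning it. It assumes the proxy follows the standard MCP stdio transport (one JSON-RPC message per line, no embedded newlines); the helper names, the client name, and the protocol version string are illustrative choices, not taken from this project.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

// MCP stdio framing: each message is a single line of JSON terminated by "\n".
// JSON.stringify without an indent argument never emits raw newlines.
function frame(msg: JsonRpcRequest): string {
  return JSON.stringify(msg) + "\n";
}

// First message of an MCP session. The clientInfo values are placeholders.
function initializeRequest(id: number): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion: "2024-11-05", // assumed; use whichever version both sides support
      capabilities: {},
      clientInfo: { name: "example-client", version: "0.0.1" },
    },
  };
}

// A client would write this string to the stdin of `node dist/index.js`.
const firstLine = frame(initializeRequest(1));
console.log(firstLine.trimEnd());
```

Since each stdio proxy is tied to the process that spawned it, every client repeats this handshake against its own instance, which is the duplication the daemon mode is meant to remove.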
One proxy daemon serves all clients over HTTP. Uses MCP Streamable HTTP trans