Gemini 2M context cache for Claude Code — persistent; repeat queries ~8x faster, ~4x cheaper.
Config is the same across clients — only the file and path differ.
```json
{
  "mcpServers": {
    "io-github-qmediat-gemini-code-context-mcp": {
      "command": "<see-readme>",
      "args": []
    }
  }
}
```
# @qmediat.io/gemini-code-context-mcp

Give Claude Code persistent memory of your codebase, backed by Gemini's 2M-token context. Turn repeat code-review queries into second-scale responses — same codebase, same answers, a fraction of the cost.
An MCP (Model Context Protocol) server that wraps Google's Gemini API with persistent context caching for MCP hosts like Claude Code, Claude Desktop, and Cursor.
| | jamubc/gemini-mcp-tool | @qmediat.io/gemini-code-context-mcp |
|---|---|---|
| Maintenance | Unmaintained on npm since 2025-07 (v1.1.4); last commit on main 2025-07-23; no maintainer reply on 2026 issues (#49/#62/#64 at time of writing) | Actively maintained |
| Default model | Hardcoded gemini-2.5-pro (main) — no runtime override | Dynamic latest-pro alias — resolves against your API key tier at startup |
| Backend | Shells out to gemini CLI (subprocess per call) | Direct @google/genai SDK |
| Repeat queries | No caching layer — each call re-tokenises referenced files | Files API + Context Cache — repeat queries reuse the indexed codebase; cached input tokens billed at ~25% of the uncached rate |
| Coding delegation | Prompt-injection changeMode (OLD/NEW format in system text) | Native thinkingConfig + optional codeExecution |
| Auth | Inherits gemini CLI auth (browser OAuth via gemini auth login, or env var) | 3-tier: Vertex ADC / credentials file (chmod 0600 atomic write) / env var (+ warning) |
| Cost control | — | Daily budget cap in USD (GEMINI_DAILY_BUDGET_USD) |
| Dead deps | 5 unused packages (ai, chalk, d3-shape, inquirer, prismjs) | Zero dead deps |
Comparison points reference jamubc/gemini-mcp-tool as seen on its GitHub `main` branch (most current snapshot, last commit 2025-07-23). The published npm v1.1.4 is ~9 months older and differs in a few specifics — the default model is `gemini-3.1-pro-preview` there instead of `gemini-2.5-pro`, and only 3 of the 5 deps are dead in that tarball (`chalk` and `inquirer` still imported). The structural claims (hardcoded model, no caching, `gemini` CLI backend, unreleased improvements stuck behind an abandoned npm registry entry) hold for both.
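To make the ~25% cached-rate claim concrete, here is a back-of-the-envelope sketch. The per-million-token price below is a placeholder, not an official Gemini rate; only the 25% cached multiplier comes from the table above:

```python
# Hypothetical per-million-token input price in USD (NOT an official Gemini rate).
PRICE_PER_M_TOKENS = 1.25

def input_cost(tokens: int, cached: bool, cached_rate: float = 0.25) -> float:
    """Input cost in USD; cached tokens bill at ~25% of the uncached rate."""
    rate = PRICE_PER_M_TOKENS * (cached_rate if cached else 1.0)
    return tokens / 1_000_000 * rate

tokens = 670_000  # roughly the workspace size from the benchmark below
print(f"uncached: ${input_cost(tokens, cached=False):.4f}")
print(f"cached:   ${input_cost(tokens, cached=True):.4f}")  # 4x cheaper on input
```

Whatever the real price sheet says, the ratio is what matters: at a 25% cached rate, input cost on a cache hit is a flat 4x reduction.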
```bash
# 1. Secure credential setup (your key never touches ~/.claude.json)
npx @qmediat.io/gemini-code-context-mcp init

# 2. Paste this into ~/.claude.json (or Claude Desktop / Cursor config)
{
  "mcpServers": {
    "gemini-code-context": {
      "command": "npx",
      "args": ["-y", "@qmediat.io/gemini-code-context-mcp"],
      "env": { "GEMINI_CREDENTIALS_PROFILE": "default" }
    }
  }
}

# 3. Restart your MCP host. Ask Claude:
# > Use gemini-code-context.ask to summarize this codebase
```
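To also enable the daily budget cap from the comparison table, a plausible variant is to add `GEMINI_DAILY_BUDGET_USD` to the same `env` block. The dollar amount here is an arbitrary example; check the README for the exact accepted format:

```json
"env": {
  "GEMINI_CREDENTIALS_PROFILE": "default",
  "GEMINI_DAILY_BUDGET_USD": "5"
}
```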
First query: ~45 s – 2 min depending on workspace size (scan + Files API upload + cache build). Every follow-up (cache hit): ~13–16 s on latest-pro-thinking with thinkingLevel: LOW, faster on latest-flash. Measured on vitejs/vite@main's packages/vite/ (~670 k tokens, 451 files): cold 125 s, warm ~14 s, $0.60 cached vs $2.35 inline per query (~8× faster, ~4× cheaper on cache hit). Thinking budget dominates warm latency — HIGH thinking adds 15–45 s per call on top of the cache-hit floor. Raw ledger reproducible via the status tool.
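The headline multipliers follow directly from the measured numbers in that benchmark; a quick sanity check:

```python
# Figures quoted in the vite packages/vite/ benchmark above.
cold_s, warm_s = 125, 14             # cold vs cache-hit latency, seconds
inline_usd, cached_usd = 2.35, 0.60  # per-query cost, inline vs cached

print(f"speedup: ~{cold_s / warm_s:.1f}x")          # ~8.9x, quoted as ~8x
print(f"cost ratio: ~{inline_usd / cached_usd:.1f}x")  # ~3.9x, quoted as ~4x
```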
See docs/getting-started.md for a 3-minute walkthrough.
| Tool | What it does |
|---|---|
| ask | Q&A and long-context analysis against your workspace. Eager — uploads the whole repo to Gemini |