Smart LLM routing across every major provider via one OpenAI-shape API.
Config is the same across clients — only the file and path differ.
{
"mcpServers": {
"com-gammainfra-mcp-server": {
"command": "<see-readme>",
"args": []
}
}
}Are you the author?
Add this badge to your README to show your security score and help users find safe servers.
Model Context Protocol (MCP) server for GammaInfra — intelligent LLM routing across every major provider via one OpenAI-shape API.
No automated test available for this server. Check the GitHub README for setup instructions.
Five weighted categories — click any category to see the underlying evidence.
No known CVEs.
No package registry to scan.
Click any tool to inspect its schema.
Be the first to review
Have you used this server?
Share your experience — it helps other developers decide.
Sign in to write a review.
Others in ai-ml / maps
Dynamic problem-solving through sequential thought chains
Persistent memory using a knowledge graph
Workspace template + MCP server for Claude Code, Codex CLI, Cursor & Windsurf. Multi-agent knowledge engine (ag-refresh / ag-ask) that turns any codebase into a queryable AI assistant.
The official MCP server implementation for the Perplexity API Platform
MCP Security Weekly
Get CVE alerts and security updates for com.gammainfra/mcp-server and similar servers.
Start a conversation
Ask a question, share a tip, or report an issue.
Sign in to join the discussion.
Model Context Protocol (MCP) server for GammaInfra — intelligent LLM routing across every major provider via one OpenAI-shape API.
Drop this server into Claude Code, Claude Desktop, Cursor, Cline, Continue, or any MCP-compatible host, and your agent gets direct tool access to:
chat_completions — call any supported model (or gammainfra/auto for smart routing) with cost, latency, and quality controls. Routing metadata (which provider served, exact cost in USD, fallback chain) is returned as a structured routing_meta field.list_models — full model catalog with pricing and capability flags.get_balance — managed + BYOK balances.get_status — overall + per-provider health, 24h request count.The server runs via npx — no manual install needed. The first invocation downloads and caches the package.
claude mcp add gammainfra \
--env GAMMAINFRA_API_KEY=sk-gammainfra-... \
-- npx -y @gammainfra/mcp-server
Or edit ~/.claude.json and add to the mcpServers block:
{
"mcpServers": {
"gammainfra": {
"command": "npx",
"args": ["-y", "@gammainfra/mcp-server"],
"env": { "GAMMAINFRA_API_KEY": "sk-gammainfra-..." }
}
}
}
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
"mcpServers": {
"gammainfra": {
"command": "npx",
"args": ["-y", "@gammainfra/mcp-server"],
"env": { "GAMMAINFRA_API_KEY": "sk-gammainfra-..." }
}
}
}
Restart Claude Desktop. The "GammaInfra" server should appear in the tools menu.
Edit ~/.cursor/mcp.json:
{
"mcpServers": {
"gammainfra": {
"command": "npx",
"args": ["-y", "@gammainfra/mcp-server"],
"env": { "GAMMAINFRA_API_KEY": "sk-gammainfra-..." }
}
}
}
Open Cline's settings (gear icon → MCP Servers tab) and add:
{
"gammainfra": {
"command": "npx",
"args": ["-y", "@gammainfra/mcp-server"],
"env": { "GAMMAINFRA_API_KEY": "sk-gammainfra-..." },
"disabled": false
}
}
| Var | Required | Default | Description |
|---|---|---|---|
GAMMAINFRA_API_KEY | yes | — | Your GammaInfra API key, format sk-gammainfra-{32_chars}. |
GAMMAINFRA_BASE_URL | no | https://api.gammainfra.com/v1 | Override for staging/dev. |
chat_completionsSend a chat completion request and receive the model response plus routing metadata.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
model | string | yes | gammainfra/auto for smart routing, gammainfra/fast/gammainfra/cheap for tier shortcuts, or pin a specific model like openai/gpt-5-mini. |
messages | array | yes | OpenAI-shape conversation messages. |
temperature | number | no | 0..2. |
max_tokens | int | no | |
max_completion_tokens | int | no | GPT-5 family requires this instead of max_tokens. |
cost_quality | float | no | 0.0..1.0 continuous dial. Sent as X-GammaInfra-Cost-Quality. |
max_latency_ms | int | no | 60..600000. Caps total wall-clock incl. fallback retries. Also enforced client-side as a hard request abort. |
preference | string | no | quality, cost, or latency. |
region | string | no | us, eu, apac, or specific AWS region. |
tools, tool_choice, response_format, top_p, frequency_penalty, presence_penalty | various | no | Standard OpenAI fields, forwarded as-is. |
Returns: `{ response: <OpenAI re