Config is the same across clients — only the file and path differ.
{
  "mcpServers": {
    "iris-eval": {
      "args": [
        "@iris-eval/mcp-server"
      ],
      "command": "npx"
    }
  }
}
Know whether your AI agents are actually good enough to ship. Iris is an open-source MCP server that scores output quality, catches safety failures, and enforces cost budgets across all your agents. Any MCP-compatible agent discovers and uses it automatically — no SDK, no code changes.
Run this in your terminal to verify the server starts.
npx -y '@iris-eval/mcp-server' 2>&1 | head -1 && echo "✓ Server started successfully"
No known CVEs.
Checked @iris-eval/mcp-server against OSV.dev.

Your agents are running in production. Infrastructure monitoring sees 200 OK and moves on. It has no idea the agent just:
Iris evaluates all of it.
| Feature | Description |
| --- | --- |
| Trace Logging | Hierarchical span trees with per-tool-call latency, token usage, and cost in USD. Stored in SQLite, queryable instantly. |
| Output Evaluation | 13 built-in rules across 4 categories: completeness, relevance, safety, cost. PII detection (10 patterns: SSN, credit card, phone, email, IBAN, DOB, MRN, IP, API key, passport), prompt injection (13 patterns), stub-output detection, hallucination markers (17 hedging phrases + fabricated-citation heuristic). Add custom rules with Zod schemas. |
| Cost Visibility | Aggregate cost across all agents over any time window. Set budget thresholds. Get flagged when agents overspend. |
| Web Dashboard | Real-time dark-mode UI with trace visualization, eval results, and cost breakdowns. |
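
To make the rule categories above concrete, here is a minimal sketch of how category-tagged rules could be run against one agent output and turned into a score. It is not Iris's actual API: the `AgentOutput`, `RuleResult`, and `Rule` types, the rule names, the simplified SSN regex, and the "fraction of rules passed" score are all assumptions made for illustration.

```typescript
// Hypothetical sketch: these types and rules are illustrative, not Iris's real API.
type Category = "completeness" | "relevance" | "safety" | "cost";

interface AgentOutput {
  text: string;
  costUsd: number;
}

interface RuleResult {
  rule: string;
  category: Category;
  passed: boolean;
}

type Rule = (output: AgentOutput) => RuleResult;

// Simplified US SSN shape (3-2-4 digits). An assumption, not Iris's own pattern.
const SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/;

// Safety rule: fail when the output appears to contain an SSN.
const noSsn: Rule = (output) => ({
  rule: "no_ssn",
  category: "safety",
  passed: !SSN_PATTERN.test(output.text),
});

// Completeness rule: fail on empty or whitespace-only output.
const nonEmpty: Rule = (output) => ({
  rule: "non_empty",
  category: "completeness",
  passed: output.text.trim().length > 0,
});

// Cost rule factory: fail when the output cost exceeds the given budget in USD.
function underBudget(maxUsd: number): Rule {
  return (output) => ({
    rule: "under_budget",
    category: "cost",
    passed: output.costUsd <= maxUsd,
  });
}

// Run every rule and report the fraction that passed as a score in [0, 1].
function evaluate(
  output: AgentOutput,
  rules: Rule[],
): { score: number; results: RuleResult[] } {
  const results = rules.map((rule) => rule(output));
  const passedCount = results.filter((r) => r.passed).length;
  const score = rules.length === 0 ? 1 : passedCount / rules.length;
  return { score, results };
}

const report = evaluate(
  { text: "Customer SSN is 123-45-6789.", costUsd: 0.02 },
  [noSsn, nonEmpty, underBudget(0.05)],
);
const failed = report.results.filter((r) => !r.passed).map((r) => r.rule);
console.log(report.score.toFixed(2), failed);
```

Keeping each rule as a small pure function tagged with a category makes it easy to add a custom rule without touching the evaluator. A real deployment would rely on the built-in rules and Zod-based custom rules described in the table, not on these stand-ins.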
Requires Node.js 20 or later. Check with node --version.
Add Iris to your MCP config. Works with Claude Desktop, Cursor, Windsurf, and any MCP-compatible agent.
{
  "mcpServers": {
    "iris-eval": {
      "command": "npx",
      "args": ["@iris-eval/mcp-server"]
    }
  }
}
That's it. Your agent discovers Iris and starts logging traces automatically.
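
If your client config already lists other MCP servers, the Iris entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that on an already-parsed config object. The `addIrisServer` helper, the `McpConfig` type, and the `other-server` entry are hypothetical names used only for illustration. Reading and writing the actual file is left out because the file name and path differ per client.

```typescript
// Hypothetical helper: merges the Iris entry into a parsed MCP client config.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // preserve any other client-specific settings
}

// The entry from the config snippet above.
const IRIS_ENTRY: McpServerEntry = {
  command: "npx",
  args: ["@iris-eval/mcp-server"],
};

// Return a new config with "iris-eval" added, leaving existing servers untouched.
function addIrisServer(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "iris-eval": { ...IRIS_ENTRY, args: [...IRIS_ENTRY.args] },
    },
  };
}

// "other-server" is a made-up existing entry, used only to show the merge.
const existing: McpConfig = {
  mcpServers: {
    "other-server": { command: "node", args: ["server.js"] },
  },
};

console.log(JSON.stringify(addIrisServer(existing), null, 2));
```

The function returns a new object instead of mutating its input, so the original config can be kept as a backup before the merged version is written back to disk.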
Iris ships with a real-time web dashboard showing traces, eval results, cost breakdowns, and rule pass-rates. It's off by default so the MCP server stays lightweight — flip it on with a flag.
{
"mcpServers": {
"iris-eval": {
"command": "npx",
... [View full README on GitHub](https://github.com/iris-eval/mcp-server#readme)