Config is the same across clients — only the file and path differ.
{
"mcpServers": {
"velocirag": {
"env": {
"VELOCIRAG_DB": "/path/to/data"
},
"args": [
"mcp"
],
"command": "velocirag"
}
}
}Are you the author?
Add this badge to your README to show your security score and help users find safe servers.
Four-layer retrieval fusion powered by ONNX Runtime. No PyTorch. Sub-200ms warm search. Incremental graph updates. MCP-ready.
Run this in your terminal to verify the server starts. Then let us know if it worked — your result helps other developers.
uvx 'velocirag' 2>&1 | head -1 && echo "✓ Server started successfully"
After testing, let us know if it worked:
Five weighted categories — click any category to see the underlying evidence.
No known CVEs.
Checked velocirag against OSV.dev.
Click any tool to inspect its schema.
Be the first to review
Have you used this server?
Share your experience — it helps other developers decide.
Sign in to write a review.
Others in ai-ml / data
Persistent memory using a knowledge graph
Manage Supabase projects — databases, auth, storage, and edge functions
Privacy-first. MCP is the protocol for tool access. We're the virtualization layer for context.
🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients.
MCP Security Weekly
Get CVE alerts and security updates for io.github.HaseebKhalid1507/velocirag and similar servers.
Start a conversation
Ask a question, share a tip, or report an issue.
Sign in to join the discussion.
Lightning-fast RAG for AI agents.
Four-layer retrieval fusion powered by ONNX Runtime. No PyTorch. Sub-200ms warm search. Incremental graph updates. MCP-ready.
Most RAG solutions either drag in 2GB+ of PyTorch or limit you to single-layer vector search. VelociRAG gives you four retrieval methods — vector similarity, BM25 keyword matching, knowledge graph traversal, and metadata filtering — fused through reciprocal rank fusion with cross-encoder reranking. All running on ONNX Runtime, no GPU, no API keys. Comes with an MCP server for agent integration, a Unix socket daemon for warm queries, and a CLI that just works.
pip install "velocirag[mcp]"
velocirag index ./my-docs
velocirag mcp
Claude Code — add to .mcp.json in your project root:
{
"mcpServers": {
"velocirag": {
"command": "velocirag",
"args": ["mcp"],
"env": { "VELOCIRAG_DB": "/path/to/data" }
}
}
}
Then open /mcp in Claude Code and enable the velocirag server. If using a virtualenv, use the full path to the binary (e.g. .venv/bin/velocirag).
Claude Desktop — add to claude_desktop_config.json:
{
"mcpServers": {
"velocirag": {
"command": "velocirag",
"args": ["mcp", "--db", "/path/to/data"]
}
}
}
Cursor — add to .cursor/mcp.json:
{
"mcpServers": {
"velocirag": {
"command": "velocirag",
"args": ["mcp", "--db", "/path/to/data"]
}
}
}
from velocirag import Embedder, VectorStore, Searcher
embedder = Embedder()
store = VectorStore('./my-db', embedder)
store.add_directory('./my-docs')
searcher = Searcher(store, embedder)
results = searcher.search('query', limit=5)
pip install velocirag
velocirag index ./my-docs
velocirag search "your query here"
velocirag serve --db ./my-data # start daemon (background)
velocirag search "query" # auto-routes through daemon
velocirag status # check daemon health
velocirag stop # stop daemon
The daemon keeps the ONNX model + FAISS index warm over a Unix socket. First query loads the engine (~1s), subsequent queries return in ~180ms with full 4-layer fusion.
The 4-layer pipeline:
Query → expand (acronyms, variants)
→ [Vector] FAISS cosine similarity (384d, MiniLM-L6-v2 via ONNX)
→ [Keyword] BM25 via SQLite FTS5
→ [Graph] Knowledge graph traversal
→ [Metadata] Structured SQL filters (tags, status, project)
→ RRF Fusion → Cross-encoder rerank → Results
What each layer catches:
| Query type | Vector | Keyword | Graph | Metadata |
|---|---|---|---|---|
| Conceptual ("improve error handling") | ✅ | — | — | — |
| Exact match ("ERR_CONNECTION_REFUSED") | — | ✅ | — | — |
| Connected concepts | — | — | ✅ | — |
| Filtered ("#python status:active") | — | — | — | ✅ |
| Combined ("React state management" |