MCP server that saves Claude Code tokens by delegating bounded tasks to local or cloud LLMs. 93% token savings benchmarked. Works with LM Studio, Ollama, vLLM, DeepSeek, Groq, Cerebras.
```json
{
  "mcpServers": {
    "houtini-lm": {
      "command": "npx",
      "args": ["-y", "@houtini/lm"],
      "env": {
        "LM_STUDIO_URL": "http://localhost:1234"
      }
    }
  }
}
```
Quick Navigation
How it works | Token savings | Quick start | What gets offloaded | Tools | Model routing | Configuration | Compatible endpoints
I built this because I kept leaving Claude Code running overnight on big refactors and the token bill was painful. A huge chunk of that spend goes on bounded tasks any decent model handles fine - generating boilerplate, code review, commit messages, format conversion. Stuff that doesn't need Claude's reasoning or tool access.
Houtini LM connects Claude Code to a local LLM on your network - or any OpenAI-compatible API. Claude keeps doing the hard work - architecture, planning, multi-file changes - and offloads the grunt work to whatever cheaper model you've got running. Free. No rate limits. Private.
I wrote a full walkthrough of why I built this and how I use it day to day.
```
Claude Code (orchestrator)
 |
 |-- Complex reasoning, planning, architecture --> Claude API (your tokens)
 |
 +-- Bounded grunt work --> houtini-lm --HTTP/SSE--> Your local LLM (free)
       Boilerplate & test stubs         Qwen, Llama, Nemotron, GLM...
       Code review & explanations       LM Studio, Ollama, vLLM, llama.cpp
       Commit messages & docs           DeepSeek, Groq, Cerebras (cloud)
       Format conversion
       Mock data & type definitions
       Embeddings for RAG pipelines
```
Claude's the architect. Your local model's the drafter. Claude QAs everything.
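Under the hood, the delegated call is just a request to an OpenAI-compatible `/v1/chat/completions` endpoint. Here's a minimal sketch of what a delegated code review looks like at that layer — the endpoint path and payload shape are the standard OpenAI-compatible ones, but the model name and prompt wording are illustrative, not houtini-lm's exact internals:

```typescript
// Sketch of a delegated code review against an OpenAI-compatible endpoint
// (LM Studio, Ollama, vLLM, llama.cpp server...). Field values are illustrative.
interface ChatRequest {
  model: string;
  messages: { role: "system" | "user"; content: string }[];
  temperature: number;
}

function buildReviewRequest(source: string): ChatRequest {
  return {
    model: "qwen2.5-coder-14b-instruct", // whatever your server has loaded
    messages: [
      {
        role: "system",
        content: "You are a code reviewer. Return a concise summary of issues.",
      },
      // The full source file goes here — it never enters Claude's context.
      { role: "user", content: source },
    ],
    temperature: 0.2,
  };
}

async function review(source: string): Promise<string> {
  const baseUrl = process.env.LM_STUDIO_URL ?? "http://localhost:1234";
  const res = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildReviewRequest(source)),
  });
  const data = await res.json();
  // Only this short summary travels back to Claude.
  return data.choices[0].message.content;
}
```

Claude only ever sees the tool call going out and the summary coming back, which is where the token savings below come from.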
We built a benchmark using real source files (581–2022 lines of TypeScript) across realistic delegation patterns. The savings come from context avoidance — when Claude delegates, it never reads the source file into its context window.
| Task | Claude direct | Delegated | Saved |
|---|---|---|---|
| Code review (1352 lines) | 14,466 tok | 769 tok | 95% |
| Architecture review (2022 lines) | 20,014 tok | 983 tok | 95% |
| External repo review (581 lines) | 5,344 tok | 741 tok | 86% |
| Code explanation (833 lines) | 8,678 tok | 744 tok | 91% |
93.3% net token savings across the session. Without delegation, Claude reads 14,000 tokens of source code then generates a 500-token review. With delegation, Claude sends a ~250 token tool call and reads back a ~500 token summary. The source file never enters Claude's context.
Small tasks (quick answers, commit messages) don't save tokens — the ~250 token MCP overhead dominates. But for anything involving reading and analysing files, which is the majority of real coding sessions, delegation pays for itself immediately.
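The percentages above follow directly from the measured token counts — here's a quick check using the numbers copied from the table:

```typescript
// Verify the savings figures from the benchmark table.
const runs: [string, number, number][] = [
  // [task, Claude-direct tokens, delegated tokens]
  ["code review",          14466, 769],
  ["architecture review",  20014, 983],
  ["external repo review",  5344, 741],
  ["code explanation",      8678, 744],
];

let direct = 0;
let delegated = 0;
for (const [task, d, g] of runs) {
  direct += d;
  delegated += g;
  const saved = Math.round((1 - g / d) * 100);
  console.log(`${task}: ${saved}% saved`);
}
// 1 - 3237/48502 ≈ 0.933 → the 93.3% net figure
console.log(`net: ${((1 - delegated / direct) * 100).toFixed(1)}% saved`);
```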
Run the benchmark against your own setup: `LM_STUDIO_URL=http://your-server:1234 node benc…`