From MCP server to GPU-accelerated AI agent | A Google Cloud deployment series using FastMCP, ADK, Ollama, and Gemma 3.
A three-part project built on Google Cloud that shows how to go from a simple MCP server to a GPU-accelerated AI agent, with every component deployed on Cloud Run.
Each part is self-contained with its own README, but they build on each other.
| Part | Folder | What is built | Key tech |
|---|---|---|---|
| 1 | 1-mcp-server/ | A secure, production-ready MCP server that exposes zoo animal data as tools for LLMs | FastMCP, Cloud Run, Gemini CLI |
| 2 | 2-adk-agent/ | A multi-agent zoo tour guide that uses the MCP server + Wikipedia | ADK, SequentialAgent, LangChain, Cloud Run |
| 3 | 3-gpu-agent/ | A GPU-accelerated Gemma agent with elasticity testing | Ollama, Gemma 3 270M, NVIDIA L4, Cloud Run |
Before starting, make sure you have the prerequisites listed in the full README. All parts can also run in Google Cloud Shell, which comes with them pre-installed.
Clone the repo and navigate into it:

```bash
git clone https://github.com/YOUR_USERNAME/ai-agent-prod-deployment.git
cd ai-agent-prod-deployment
```

Then start with Part 1:

```bash
cd 1-mcp-server
cat README.md
```
| Part | Approximate cost |
|---|---|
| Part 1 | < $1 USD |
| Part 2 | < $1 USD |
| Part 3 | ~$2–4/hr while GPU is running (NVIDIA L4) |
Each part's README includes a cleanup section for deleting resources and avoiding ongoing charges.
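Because the Part 3 charge accrues for as long as the GPU service is running, a rough budget is simply hours multiplied by the hourly range. The sketch below does that arithmetic with the approximate $2 to $4 per hour figure from the table. The function name and the three-hour session are illustrative assumptions, and real billing depends on region and configuration.

```python
# Rough Part 3 budget using the approximate range from the cost table.
# These rates are the document's estimate, not official pricing.
GPU_RATE_LOW_USD = 2.0   # lower bound, USD per hour
GPU_RATE_HIGH_USD = 4.0  # upper bound, USD per hour


def estimate_gpu_cost(hours: float) -> tuple[float, float]:
    """Return the (low, high) estimated cost in USD for `hours` of GPU uptime."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * GPU_RATE_LOW_USD, hours * GPU_RATE_HIGH_USD


low, high = estimate_gpu_cost(3)  # hypothetical three-hour session
print(f"Estimated Part 3 cost: ${low:.2f} to ${high:.2f}")
```

For a three-hour session this gives roughly $6 to $12, which is why running the cleanup steps promptly matters most for Part 3.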
Part 1 builds a Model Context Protocol (MCP) server with FastMCP that exposes zoo animal data as tools. The server is deployed to Cloud Run with authentication required, and Gemini CLI then connects to it.
Concepts involved: MCP concepts, FastMCP, deploying from source on Cloud Run, IAM-based auth, service accounts.
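To make the idea of "zoo data as tools" concrete, here is a minimal sketch of what a two-tool FastMCP server could look like. It is not the project's `server.py`: the tool names, the sample animals, and the server name are all hypothetical, and the `fastmcp` package is assumed to be installed. The tool functions are plain Python, so they can be exercised without starting a server.

```python
import os

# Hypothetical sample data; the real server.py defines its own dataset.
ZOO_ANIMALS = [
    {"name": "Leo", "species": "lion", "enclosure": "Savanna"},
    {"name": "Nala", "species": "lion", "enclosure": "Savanna"},
    {"name": "Pingu", "species": "penguin", "enclosure": "Arctic Cove"},
]


def get_animals_by_species(species: str) -> list[dict]:
    """Return every animal of the given species (case-insensitive)."""
    wanted = species.strip().lower()
    return [a for a in ZOO_ANIMALS if a["species"] == wanted]


def get_animal_details(name: str) -> dict:
    """Return the record for one animal by name, or an empty dict if unknown."""
    wanted = name.strip().lower()
    for animal in ZOO_ANIMALS:
        if animal["name"].lower() == wanted:
            return animal
    return {}


def build_server():
    """Register both functions as MCP tools on a FastMCP server."""
    from fastmcp import FastMCP  # third-party; imported lazily on purpose

    mcp = FastMCP("Zoo Animal MCP Server")  # server name is an assumption
    mcp.tool()(get_animals_by_species)
    mcp.tool()(get_animal_details)
    return mcp


def main() -> None:
    """Serve over streamable HTTP on the port Cloud Run provides via PORT."""
    port = int(os.environ.get("PORT", "8080"))
    build_server().run(transport="streamable-http", host="0.0.0.0", port=port)
```

Calling `main()` starts the server. Keeping the tools as ordinary functions and registering them separately means the lookup logic can be unit-tested without an MCP client.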
Part 2 builds a multi-agent zoo tour guide with Google's Agent Development Kit (ADK). The agent uses the MCP server from Part 1 as its toolset, augmented with the Wikipedia API for general knowledge, and is deployed to Cloud Run.
Concepts involved: ADK agents, SequentialAgent, MCPToolset, LangchainTool, state management, adk deploy.
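The concepts listed above fit together roughly as follows. This sketch is an assumption about how such an agent might be wired, not the project's `agent.py`: the agent names, instructions, model name, and the `MCP_SERVER_URL` environment variable are hypothetical. It also omits the `Authorization` header that an IAM-protected Cloud Run service would require. It assumes the `google-adk`, `langchain-community`, and `wikipedia` packages are installed.

```python
import os

# Hypothetical configuration; the real endpoint comes from the Part 1 deployment.
MCP_SERVER_URL = os.environ.get("MCP_SERVER_URL", "http://localhost:8080/mcp")
MODEL = "gemini-2.5-flash"  # assumed model name

RESEARCH_INSTRUCTION = (
    "Answer the visitor's question. Use the zoo tools for animals in this zoo "
    "and Wikipedia for general facts."
)
# {research_data} is filled from session state, where the first agent's
# output_key stored its result.
FORMAT_INSTRUCTION = (
    "You are a friendly zoo tour guide. Turn this research into a short, "
    "conversational answer: {research_data}"
)


def build_zoo_guide():
    """Build a two-step SequentialAgent: research first, then format."""
    # Third-party imports are kept inside the function so the module
    # can be imported without them.
    from google.adk.agents import Agent, SequentialAgent
    from google.adk.tools.langchain_tool import LangchainTool
    from google.adk.tools.mcp_tool.mcp_toolset import (
        MCPToolset,
        StreamableHTTPConnectionParams,
    )
    from langchain_community.tools import WikipediaQueryRun
    from langchain_community.utilities import WikipediaAPIWrapper

    zoo_tools = MCPToolset(
        connection_params=StreamableHTTPConnectionParams(url=MCP_SERVER_URL)
    )
    wikipedia = LangchainTool(
        tool=WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())
    )

    researcher = Agent(
        name="researcher",
        model=MODEL,
        instruction=RESEARCH_INSTRUCTION,
        tools=[zoo_tools, wikipedia],
        output_key="research_data",  # saves the result into session state
    )
    formatter = Agent(
        name="response_formatter",
        model=MODEL,
        instruction=FORMAT_INSTRUCTION,
    )
    return SequentialAgent(name="zoo_tour_guide", sub_agents=[researcher, formatter])
```

The state hand-off is the key design point: the first agent writes to `research_data` through `output_key`, and the second reads it through the `{research_data}` placeholder, so the two agents never call each other directly.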
Part 3 deploys a GPU-accelerated Gemma 3 model via Ollama on Cloud Run and wires it up to an ADK agent. Elasticity tests with Locust then show how the two services handle load independently.
Concepts involved: GPU on Cloud Run, Ollama, LiteLlm, FastAPI + ADK, Locust load testing, autoscaling behavior.
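The project reaches the model through ADK's LiteLlm wrapper. The sketch below shows a different, lower-level view: the HTTP call to Ollama's `/api/generate` endpoint, using only the standard library. It can be handy for checking that the GPU service responds before adding an agent on top. The service URL is a hypothetical placeholder, and an authenticated Cloud Run service would also need an identity token header that is not shown here.

```python
import json
import urllib.request

# Hypothetical value: replace with the URL of your deployed Ollama service.
OLLAMA_URL = "https://ollama-gemma-example.a.run.app"
MODEL_TAG = "gemma3:270m"  # Ollama tag for Gemma 3 270M


def build_generate_request(prompt: str, base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": MODEL_TAG, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, base_url: str = OLLAMA_URL, timeout: float = 60.0) -> str:
    """Send the prompt and return the generated text (needs a reachable service)."""
    request = build_generate_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With a local Ollama instance, `generate("Why do penguins huddle?", "http://localhost:11434")` would return the model's text. Splitting request construction from sending keeps the payload easy to inspect without network access.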
```
ai-agent-prod-deployment/
├── README.md               # You are here
├── .gitignore
├── 1-mcp-server/
│   ├── README.md
│   ├── server.py           # FastMCP zoo server with 2 tools
│   ├── Dockerfile
│   └── pyproject.toml
├── 2-adk-agent/
│   ├── README.md
│   ├── zoo_guide_agent/
│   │   ├── __init__.py
│   │   └── agent.py        # Multi-agen
```
... [View full README on GitHub](https://github.com/irtaza091996/ai-agent-prod-deployment#readme)