SQL to PySpark conversion, AWS Glue job generation, and Spark code optimization.
Config is the same across clients — only the file and path differ.
{
"mcpServers": {
"pyspark": {
"args": [],
"command": "pyspark-mcp"
}
}
}
SQL migration assistance, AWS Glue job generation, and Spark code optimization — as an MCP server.
pip install -e .
pyspark-mcp # starts the MCP server
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"pyspark": {
"command": "pyspark-mcp",
"args": []
}
}
}
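If the config file already lists other servers, pasting the snippet over it would drop them. A minimal sketch of merging the entry instead, assuming the file is plain JSON with a top-level `mcpServers` object as shown above; the helper name `add_pyspark_server` is hypothetical and not part of this project:

```python
import json
from pathlib import Path


def add_pyspark_server(config_path: Path) -> dict:
    """Merge the pyspark entry into an MCP client config, keeping other entries."""
    # Start from the existing config if the file exists and is non-empty.
    text = config_path.read_text() if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    # Create "mcpServers" if it is missing, then add or overwrite only "pyspark".
    servers = config.setdefault("mcpServers", {})
    servers["pyspark"] = {"command": "pyspark-mcp", "args": []}

    # Write the merged document back, creating the parent directory if needed.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

On macOS it could be called with `Path.home() / "Library" / "Application Support" / "Claude" / "claude_desktop_config.json"`. Restart the client afterwards so it rereads the file.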
Add to ~/.hermes/config.yaml:
mcp:
  servers:
    pyspark:
      command: pyspark-mcp
      enabled_tools: all
docker compose up -d
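Once the server is running, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below only builds that message, to show its shape. The argument name `sql_query` is an assumption for illustration; the real parameter names come from the tool schema the server returns for `tools/list`.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "sql_query" is a hypothetical argument name, not confirmed by this README.
request = build_tool_call(
    1,
    "convert_sql_to_pyspark",
    {"sql_query": "SELECT id, name FROM users WHERE active = 1"},
)
print(request)
```

In practice a client such as Claude Desktop performs the `initialize` handshake and sends these messages itself, so hand-building requests is mainly useful for debugging.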
convert_sql_to_pyspark — Convert SQL to PySpark with dialect detection
analyze_sql_context — Analyze SQL complexity and suggest approach
generate_aws_glue_job_template — Generate complete Glue job scripts
convert_dataframe_to_dynamic_frame — DataFrame ↔ DynamicFrame conversion
generate_data_catalog_table_definition — Data Catalog table definitions
generate_incremental_processing_job — Incremental/CDC job generation
analyze_s3_optimization_opportunities — S3 layout and partitioning analysis
review_pyspark_code — Code review with performance recommendations
optimize_pyspark_code — Suggest optimizations for existing code
recommend_join_strategy — Broadcast vs shuffle join recommendations
suggest_partitioning_strategy — Partitioning recommendations
batch_process_files — Process multiple SQL files concurrently
batch_process_directory — Convert entire directories
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
# Test
pytest tests/ -v --cov=pyspark_tools
# Format
black pyspark_tools tests
isort pyspark_tools tests
# Lint
flake8 pyspark_tools tests
pyspark_tools/
├── server.py # FastMCP server + tool definitions
├── sql_converter.py # SQLGlot-based transpilation + DataFrame API generation
├── aws_glue_integration.py # Glue job templates, DynamicFrame, Data Catalog
├── advanced_optimizer.py # Performance analysis + optimization suggestions
├── batch_processor.py # Concurrent file processing
├── code_reviewer.py # PySpark code review patterns
├── duplicate_detector.py # Code deduplication
├── data_source_analyzer.py # Data source analysis
└── file_utils.py # File I/O utilities
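The layout above describes `batch_processor.py` only as "Concurrent file processing". One common way to do that in Python is a thread pool that maps a conversion function over the files, sketched here under stated assumptions. The names `process_sql_files` and `convert` are hypothetical, and this is not taken from the project's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable, Union


def process_sql_files(
    paths: Iterable[Path],
    convert: Callable[[str], str],
    max_workers: int = 4,
) -> Dict[Path, Union[str, Exception]]:
    """Apply `convert` to each file's text concurrently.

    Returns a mapping from path to the converted text, or to the exception
    raised for that file, so one bad file does not abort the whole batch.
    """

    def run(path: Path):
        try:
            return path, convert(path.read_text())
        except Exception as exc:  # captured per file and reported in the result
            return path, exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order; dict() collects the (path, result) pairs.
        return dict(pool.map(run, paths))
```

Threads suit the file I/O part of the work. If the conversion itself is CPU-bound, a process pool may be a better fit.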
MIT — see LICENSE.