Spark SQL MCP Server

An MCP server that lets AI assistants query Spark SQL clusters over the Thrift/HiveServer2 protocol. It supports multiple authentication methods and provides read-only query execution and schema discovery.

What is the Spark SQL MCP Server?

The Spark SQL MCP Server is a bridge that connects AI assistants (such as Claude) to Spark SQL clusters. It lets you query databases, inspect table structures, and analyze data through natural language conversation, without writing complex SQL statements or using specialized tools.

How to use the Spark SQL MCP Server?

Usage takes three steps: 1) configure the connection information (host, port, authentication method); 2) enable the MCP server in Claude; 3) query data by asking questions in natural language. The system converts your questions into SQL queries and returns the results.

Applicable scenarios

Suited to data analysts, business users, and product managers who need to query data quickly without learning complex SQL syntax. It is especially useful for rapid data exploration, daily report generation, data validation, and cross-table join queries.

Main features

SQL query execution
Executes read-only SQL queries: SELECT, SHOW, DESCRIBE, EXPLAIN, and WITH statements. A LIMIT clause is added automatically to queries that lack one, which caps the size of the result.
Schema discovery
Discovers and lists all available databases and tables, and displays table structure (column names and data types).
Multiple authentication methods
Supports NONE, LDAP, NOSASL, CUSTOM, and Kerberos authentication to fit different security environments.
Wide compatibility
Works out of the box with HiveServer2-compatible systems such as Apache Spark, AWS EMR, Hive, Impala, and Presto.
Security protection
Enforces read-only operations to prevent data modification, sanitizes error messages so that internal details are not exposed, and handles credentials securely.
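The read-only check and automatic LIMIT described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual implementation: the function names and the default limit of 1000 are assumptions, and a keyword-prefix check on its own is only a first line of defence (a real guard would likely parse the statement).

```python
import re

# Statement types the feature list describes as allowed.
ALLOWED_PREFIXES = ("SELECT", "SHOW", "DESCRIBE", "EXPLAIN", "WITH")
DEFAULT_LIMIT = 1000  # hypothetical default; the server's real value is not stated


def _normalize(sql: str) -> str:
    """Trim whitespace and a single trailing semicolon."""
    return sql.strip().rstrip(";").strip()


def _first_keyword(sql: str) -> str:
    """Return the upper-cased first word of the statement, or '' if empty."""
    parts = sql.split(None, 1)
    return parts[0].upper() if parts else ""


def is_read_only(sql: str) -> bool:
    """Accept a single statement that starts with an allowed keyword."""
    stripped = _normalize(sql)
    if ";" in stripped:
        # Conservative: reject anything that looks like several statements.
        return False
    return _first_keyword(stripped) in ALLOWED_PREFIXES


def ensure_limit(sql: str, limit: int = DEFAULT_LIMIT) -> str:
    """Append LIMIT to SELECT/WITH queries that do not already end with one."""
    stripped = _normalize(sql)
    if _first_keyword(stripped) not in ("SELECT", "WITH"):
        return stripped  # SHOW/DESCRIBE/EXPLAIN are returned unchanged
    if re.search(r"\bLIMIT\s+\d+\s*$", stripped, flags=re.IGNORECASE):
        return stripped
    return f"{stripped} LIMIT {limit}"
```

In this sketch a query would pass through `is_read_only` first and, if accepted, through `ensure_limit` before being sent to the cluster.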

Advantages

No SQL expertise required: query data through natural language
Rapid integration: configure it and start using it within minutes
Cross-platform compatibility: supports multiple big data platforms and cloud services
Safe and reliable: read-only operations, with automatic limits on query result size
Developer-friendly: provides a local Docker test environment for development and debugging

Limitations

No TLS/SSL support: Thrift connections are unencrypted; protecting traffic with an SSH tunnel is recommended
No query timeout control: relies on the timeout configuration at the Spark cluster level
Limited permission control: all queries run with the permissions of the configured user
No authentication by default: configure the authentication method explicitly in production environments
Read-only operations only: data writes and schema changes are not possible
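As an illustration of the SSH tunnel recommendation, the sketch below builds a standard OpenSSH port-forwarding command. The helper name, the `hadoop` login user, and the choice of port 10000 on both ends are assumptions made for the example; adjust them to your cluster.

```python
from __future__ import annotations

import shlex


def ssh_tunnel_command(remote_host: str, ssh_user: str,
                       local_port: int = 10000,
                       remote_port: int = 10000) -> list[str]:
    """Build an ssh argument list that forwards local_port to the Thrift port."""
    return [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:localhost:{remote_port}",
        f"{ssh_user}@{remote_host}",
    ]


# Hypothetical host and user, for illustration only.
cmd = ssh_tunnel_command("your-emr-master-node.amazonaws.com", "hadoop")
print(shlex.join(cmd))
```

With such a tunnel open, the server would be pointed at `localhost` and the forwarded local port rather than at the cluster address directly.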

How to use

Install the server
Install the Spark SQL MCP Server via pip or run it directly using uvx.
Configure environment variables
Set the environment variables required to connect to the Spark cluster, including the host address, port, database, and authentication method.
Configure Claude
Add the server configuration to Claude's MCP settings, either globally or at project level.
Start querying
Ask questions in natural language in Claude, and the system will automatically convert them into SQL queries and return the results.
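To make the environment-variable step concrete, here is a minimal sketch of how such settings could be read and validated. `SPARK_HOST`, `SPARK_PORT`, and `SPARK_AUTH` appear in the configuration on this page; the `SPARK_DATABASE` name, the defaults for port and database, and the class and function names are assumptions for illustration, not the server's documented behavior.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Mapping

# Authentication methods named in the feature list.
SUPPORTED_AUTH = {"NONE", "LDAP", "NOSASL", "CUSTOM", "KERBEROS"}


@dataclass(frozen=True)
class SparkConnectionSettings:
    host: str
    port: int
    database: str
    auth: str


def load_settings(env: Mapping[str, str] | None = None) -> SparkConnectionSettings:
    """Read connection settings from the environment (or a mapping, for tests)."""
    env = os.environ if env is None else env
    host = env.get("SPARK_HOST")
    if not host:
        raise ValueError("SPARK_HOST is required")
    # No authentication is the stated default.
    auth = env.get("SPARK_AUTH", "NONE").upper()
    if auth not in SUPPORTED_AUTH:
        raise ValueError(f"Unsupported SPARK_AUTH: {auth}")
    return SparkConnectionSettings(
        host=host,
        port=int(env.get("SPARK_PORT", "10000")),       # assumed default port
        database=env.get("SPARK_DATABASE", "default"),  # hypothetical variable name
        auth=auth,
    )
```

Validating the values at startup means a misspelled authentication method fails immediately instead of surfacing later as a connection error.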

Usage examples

Data exploration and discovery
When you need to know what data is available in the data warehouse, you can quickly browse the database and table structures.
Table structure viewing
Before writing queries, you need to understand the field structure and data types of the table.
Business data query
Business personnel need to quickly obtain business data for a specific time period or under specific conditions.
Data validation and checking
Validate data quality by checking data integrity and looking for outliers.
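The scenarios above typically come down to a handful of read-only statements. The sketch below shows the kind of SQL that might be issued for discovery and for a simple null check. The helper names and the `sales.orders` example are hypothetical, and the statements an assistant actually generates will vary.

```python
import re

# Plain identifiers only, so names can be interpolated without quoting.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _ident(name: str) -> str:
    """Return name unchanged if it is a plain identifier, else raise."""
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"Unsafe identifier: {name!r}")
    return name


def discovery_statements(database: str, table: str) -> dict:
    """Statements for browsing databases, tables, and table structure."""
    db, tbl = _ident(database), _ident(table)
    return {
        "list_databases": "SHOW DATABASES",
        "list_tables": f"SHOW TABLES IN {db}",
        "describe_table": f"DESCRIBE {db}.{tbl}",
    }


def null_check_statement(database: str, table: str, column: str) -> str:
    """A simple data-validation query: count rows where column is NULL."""
    db, tbl, col = _ident(database), _ident(table), _ident(column)
    return f"SELECT COUNT(*) AS null_rows FROM {db}.{tbl} WHERE {col} IS NULL"


# Hypothetical database and table names, for illustration only.
for label, sql in discovery_statements("sales", "orders").items():
    print(f"{label}: {sql}")
```

All of these fall within the statement types the server accepts (SELECT, SHOW, DESCRIBE).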

Frequently Asked Questions

Which big data platforms does this server support?
Do I need to have SQL knowledge to use it?
How to ensure data security?
What should I pay attention to when connecting to an AWS EMR cluster?
Are there any restrictions on query results?
How to test the local development environment?

Related resources

GitHub repository
Project source code, issue tracking, and contribution guidelines
Model Context Protocol official website
Official documentation and specifications of the MCP protocol
Apache Spark documentation
Official programming guide for Spark SQL
AWS EMR documentation
Amazon EMR management and usage guide

Installation

Copy the following configuration into your MCP client's settings:
{
  "mcpServers": {
    "spark-sql": {
      "command": "uvx",
      "args": ["spark-sql-mcp-server"],
      "env": {
        "SPARK_HOST": "your-emr-master-node.amazonaws.com",
        "SPARK_PORT": "10000",
        "SPARK_AUTH": "NONE"
      }
    }
  }
}

{
  "mcpServers": {
    "spark-sql": {
      "command": "uvx",
      "args": ["spark-sql-mcp-server"],
      "env": {
        "SPARK_HOST": "your-emr-master-node.amazonaws.com",
        "SPARK_PORT": "10000"
      }
    }
  }
}

{
  "mcpServers": {
    "spark-sql": {
      "command": "uvx",
      "args": ["spark-sql-mcp-server"],
      "env": {
        "SPARK_HOST": "localhost",
        "SPARK_PORT": "10000",
        "SPARK_AUTH": "NONE"
      }
    }
  }
}
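
If you maintain several environments, the same structure can be generated rather than edited by hand. The following is a small sketch that reproduces the shape of the blocks above; the function name is hypothetical, and only the keys shown on this page are used.

```python
import json


def build_client_config(host: str, port: int = 10000, auth: str = "NONE") -> dict:
    """Return a dict shaped like the configuration blocks above."""
    return {
        "mcpServers": {
            "spark-sql": {
                "command": "uvx",
                "args": ["spark-sql-mcp-server"],
                "env": {
                    "SPARK_HOST": host,
                    "SPARK_PORT": str(port),  # kept as a string, as in the blocks above
                    "SPARK_AUTH": auth,
                },
            }
        }
    }


# Print a configuration for a local test cluster.
print(json.dumps(build_client_config("localhost"), indent=2))
```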
