
🚀 MCP-RAG: Agentic AI Orchestration for Business Analytics

This project is a lightweight demonstration that combines the Model Context Protocol (MCP) with Retrieval-Augmented Generation (RAG) to orchestrate multi-agent AI workflows for business analysis, offering a practical way to streamline business data processing and analysis.

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • Google Gemini API key (free tier available) - for future LLM integration
  • Basic understanding of MCP and RAG concepts

Installation

  1. Clone the repository:

    git clone https://github.com/ANSH-RIYAL/MCP-RAG.git
    cd MCP-RAG
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Set up environment variables:

    For Gemini API (default):

    export LLM_MODE="gemini"
    export GEMINI_API_KEY="your-gemini-api-key"
    

    For Custom Localhost API:

    export LLM_MODE="custom"
    export CUSTOM_API_URL="http://localhost:8000"
    export CUSTOM_API_KEY="your-api-key"  # Optional
    

Quick Demo

Run the demonstration script to see both MCP servers in action:

python main.py

✨ Features

  • MCP-Based Coordination: Multiple specialized servers work together seamlessly to handle different aspects of business analysis.
  • Business Analytics Tools: Calculate mean, standard deviation, correlation, and linear regression for in-depth data analysis.
  • RAG Knowledge Base: Business terms, policies, and analysis guidelines can be retrieved to provide context-aware answers.
  • Modular Design: New tools can be added and LLM backends swapped without significant code changes.
  • Natural Language Interface: Users can ask questions in plain language, like "What's the average earnings from Q1?", making the system accessible to non-technical users.

💻 Usage Examples

LLM Backend Selection

Option 1: Google Gemini API (Default)

export LLM_MODE="gemini"
export GEMINI_API_KEY="your-gemini-api-key"
python main.py

Option 2: Custom Localhost API

export LLM_MODE="custom"
export CUSTOM_API_URL="http://localhost:8000"
export CUSTOM_API_KEY="your-api-key"  # Optional
python main.py

Conversation Scenarios

Run the conversation scenarios to see real-world usage examples:

python test_scenarios.py

Business Analytics Tools

  • Data Exploration: Get dataset information and sample data.
  • Statistical Analysis: Calculate mean, standard deviation with filtering.
  • Correlation Analysis: Find relationships between variables.
  • Predictive Modeling: Use linear regression for forecasting.
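
The statistical tools above are easy to picture in plain Python. Below is a minimal sketch of `calculate_mean` and `calculate_correlation` with optional filtering, using a small hypothetical in-memory sample in place of `data/sample_business_data.csv` (rows and values are illustrative, not the repository's data):

```python
from statistics import mean

# Hypothetical rows standing in for data/sample_business_data.csv
ROWS = [
    {"quarter": "Q1-2024", "department": "Sales", "earnings": 95000},
    {"quarter": "Q1-2024", "department": "Marketing", "earnings": 105000},
    {"quarter": "Q1-2024", "department": "Engineering", "earnings": 105000},
    {"quarter": "Q2-2024", "department": "Sales", "earnings": 120000},
]

def calculate_mean(rows, column, filter_column=None, filter_value=None):
    """Mean of a numeric column, optionally filtered by another column."""
    if filter_column is not None:
        rows = [r for r in rows if r[filter_column] == filter_value]
    return mean(r[column] for r in rows)

print(round(calculate_mean(ROWS, "earnings", "quarter", "Q1-2024"), 2))
```

The real server exposes the same filtering parameters (`filter_column`, `filter_value`) through its MCP tool schema.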

RAG Knowledge Retrieval

  • Term Definitions: Look up business concepts.
  • Policy Information: Retrieve company procedures.
  • Analysis Guidelines: Get context for data interpretation.
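
The retrieval side can be approximated without any vector store. Here is a minimal keyword-ranking sketch in the spirit of `search_business_knowledge` (the entries and ranking scheme are illustrative, not the server's actual implementation):

```python
# Hypothetical entries standing in for data/business_knowledge.txt
KNOWLEDGE = [
    "Earnings: Total revenue generated by a department or company in a given period",
    "Profit Margin: Percentage of revenue that remains as profit after expenses",
    "Budget Allocation: Marketing gets 25% of total budget, Engineering gets 30%",
]

def search_business_knowledge(query, entries=KNOWLEDGE):
    """Rank entries by how many query terms they contain; drop non-matches."""
    terms = query.lower().split()
    scored = [(sum(t in e.lower() for t in terms), e) for e in entries]
    return [e for score, e in sorted(scored, key=lambda s: -s[0]) if score > 0]

print(search_business_knowledge("profit margin")[0])
```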

📚 Documentation

Scenarios & Use Cases

Scenario 1: Sales Analysis

Manager: "What's the average earnings from Q1?"
MCP-RAG System: 
1. Analytics Server: calculate_mean(column='earnings', filter_column='quarter', filter_value='Q1-2024')
   → Mean of earnings: 101666.67
2. RAG Server: get_business_terms(term='earnings')
   → Earnings: Total revenue generated by a department or company in a given period
3. Response: "Average earnings for Q1-2024: $101,667"

Scenario 2: Performance Correlation

Manager: "What's the correlation between sales and expenses?"
MCP-RAG System:
1. Analytics Server: calculate_correlation(column1='sales', column2='expenses')
   → Correlation between sales and expenses: 0.923
2. Response: "Correlation: 0.923 (strong positive relationship)"

Scenario 3: Predictive Modeling

Manager: "Build a model to predict earnings from sales and employees"
MCP-RAG System:
1. Analytics Server: linear_regression(target_column='earnings', feature_columns=['sales', 'employees'])
   → Linear Regression Results:
      Target: earnings
      Features: ['sales', 'employees']
      Intercept: 15000.00
      sales coefficient: 0.45
      employees coefficient: 1250.00
      R-squared: 0.987
2. Response: "Model created with R² = 0.987"

Scenario 4: Business Knowledge

Manager: "What does profit margin mean?"
MCP-RAG System:
1. RAG Server: get_business_terms(term='profit margin')
   → Profit Margin: Percentage of revenue that remains as profit after expenses, calculated as (earnings - expenses) / earnings
2. Response: "Profit Margin: Percentage of revenue that remains as profit after expenses"

Scenario 5: Policy Information

Manager: "What are the budget allocation policies?"
MCP-RAG System:
1. RAG Server: get_company_policies(policy_type='budget')
   → Budget Allocation: Marketing gets 25% of total budget, Engineering gets 30%, Sales gets 45%
2. Response: "Budget Allocation: Marketing gets 25%, Engineering gets 30%, Sales gets 45%"
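
The flow in these scenarios can be sketched as a plain-Python pipeline. Both servers are stubbed with canned results taken from Scenario 1; in the real system these calls are routed to the MCP servers:

```python
def analytics_server(tool, **kwargs):
    """Stub for the Business Analytics MCP server."""
    if tool == "calculate_mean":
        return 101666.67  # canned result for earnings in Q1-2024
    raise ValueError(f"Unknown tool: {tool}")

def rag_server(tool, **kwargs):
    """Stub for the RAG MCP server."""
    if tool == "get_business_terms":
        return "Earnings: Total revenue generated by a department or company"
    raise ValueError(f"Unknown tool: {tool}")

def answer_sales_question():
    # Step 1: compute the statistic; Step 2: fetch grounding context
    mean_value = analytics_server("calculate_mean", column="earnings",
                                  filter_column="quarter", filter_value="Q1-2024")
    definition = rag_server("get_business_terms", term="earnings")
    return {"answer": f"Average earnings for Q1-2024: ${mean_value:,.0f}",
            "context": definition}

print(answer_sales_question()["answer"])
```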

Customization Guide

For Your Organization

Step 1: Replace Sample Data
  1. Update Business Data: Replace data/sample_business_data.csv with your actual data. Ensure columns are numeric for analysis tools, add any categorical columns for filtering, and include time-based columns for trend analysis.
  2. Update Knowledge Base: Replace data/business_knowledge.txt with your organization's business terms and definitions, company policies and procedures, and analysis guidelines and best practices.
Step 2: Add Custom Analytics Tools

File to modify: src/servers/business_analytics_server.py

  1. Add New Tools: In the handle_list_tools() function (around line 29), add new tools to the tools list.
  2. Implement Tool Logic: In the handle_call_tool() function (around line 140), add the corresponding handler.
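
Ignoring the MCP plumbing, the pattern behind these two steps looks roughly like the following. The `calculate_median` tool and the dispatch function are hypothetical illustrations, not code from the repository:

```python
# 1. Tool declaration, as it would be returned from handle_list_tools()
TOOLS = [
    {
        "name": "calculate_median",  # hypothetical new tool
        "description": "Median of a numeric column",
        "inputSchema": {
            "type": "object",
            "properties": {"column": {"type": "string"}},
            "required": ["column"],
        },
    },
]

# 2. Matching branch, as it would be added inside handle_call_tool()
def handle_call_tool(name, arguments, data):
    if name == "calculate_median":
        values = sorted(data[arguments["column"]])
        mid = len(values) // 2
        if len(values) % 2:
            return values[mid]
        return (values[mid - 1] + values[mid]) / 2
    raise ValueError(f"Unknown tool: {name}")

print(handle_call_tool("calculate_median", {"column": "sales"}, {"sales": [3, 1, 2]}))
```

The key point is that every tool needs both a schema entry (so clients can discover it) and a handler branch (so calls actually execute).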
Step 3: Extend RAG Capabilities

File to modify: src/servers/rag_server.py

  1. Add New Knowledge Sources: Modify the load_business_knowledge() function (around line 25) to include database connections, document processing, and API integrations.
  2. Add New RAG Tools: In the handle_list_tools() function (around line 50), add new tools.
  3. Implement RAG Tool Logic: In the handle_call_tool() function (around line 90), add the handler.
Step 4: Integrate LLM Backend

File to create: src/servers/llm_server.py (new file)

  1. Using the Existing LLM Client: The FlexibleRAGAgent in src/core/gemini_rag_agent.py already supports Google Gemini API and custom localhost API (OpenAI-compatible format).
  2. Create Custom LLM Server (optional): If you need a dedicated MCP server for LLM operations, you can create a new server as shown in the example code.
  3. Add to requirements.txt:
    openai>=1.0.0
    google-genai>=0.3.0
    httpx>=0.24.0
    
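The `LLM_MODE` switch described above can be sketched as environment-driven configuration. Function and key names here are illustrative; the actual selection logic lives in the existing client:

```python
import os

def make_llm_config():
    """Pick an LLM backend from the environment variables listed in Configuration."""
    mode = os.environ.get("LLM_MODE", "gemini")
    if mode == "gemini":
        return {
            "backend": "gemini",
            "api_key": os.environ.get("GEMINI_API_KEY"),
            "model": os.environ.get("GEMINI_MODEL", "gemini-2.0-flash-exp"),
        }
    if mode == "custom":
        return {
            "backend": "custom",
            "base_url": os.environ.get("CUSTOM_API_URL", "http://localhost:8000"),
            "api_key": os.environ.get("CUSTOM_API_KEY"),  # optional
        }
    raise ValueError(f"Unsupported LLM_MODE: {mode}")

os.environ["LLM_MODE"] = "custom"
print(make_llm_config()["backend"])
```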
Step 5: Add New Data Sources

Files to modify: src/servers/business_analytics_server.py and src/servers/rag_server.py

  1. Database Connectors: Add tools to connect to various databases such as PostgreSQL, MySQL, SQLite, MongoDB, Redis, and data warehouses.
  2. API Integrations: Connect to business systems like CRM systems, marketing platforms, and financial systems.

Current Tool Implementations

  • Business Analytics Tools (src/servers/business_analytics_server.py): calculate_mean, calculate_std, calculate_correlation, linear_regression, get_data_info.
  • RAG Tools (src/servers/rag_server.py): get_business_terms, get_company_policies, search_business_knowledge.
  • LLM Integration (src/core/llm_client.py): FlexibleRAGAgent, LLMClient, tool calling and conversation management.

Modular Architecture Benefits

  • Swap Components: Replace any server without affecting others.
  • Add Capabilities: Plug in new tools without rewriting existing code.
  • Scale Independently: Run different servers on different machines.
  • Customize Per Use Case: Use only the tools you need.

Example Extensions

Adding Sentiment Analysis

File to create: src/servers/sentiment_analysis_server.py

# Create sentiment_analysis_server.py
@server.call_tool()
async def analyze_sentiment(name: str, arguments: dict) -> CallToolResult:
    # Integrate with a sentiment analysis API here
    # Return sentiment scores and insights
    ...

Adding Forecasting

File to modify: src/servers/business_analytics_server.py

# Add to handle_list_tools() function
Tool(
    name="time_series_forecast",
    description="Forecast future values using time series analysis",
    inputSchema={
        "type": "object",
        "properties": {
            "column": {"type": "string"},
            "periods": {"type": "integer"}
        }
    }
)
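
A matching handler for this tool could fit a least-squares linear trend and extrapolate. Here is a dependency-free sketch (not the repository's implementation):

```python
def time_series_forecast(values, periods):
    """Fit y = intercept + slope * t by least squares, then extrapolate."""
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    # Forecast the next `periods` steps beyond the observed series
    return [intercept + slope * (n + i) for i in range(periods)]

print(time_series_forecast([10, 20, 30, 40], 2))
```

A production version would likely delegate to a proper time-series library, but the tool schema above stays the same either way.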

Adding Document Processing

File to create: src/servers/document_processor_server.py

# Create document_processor_server.py
@server.call_tool()
async def process_document(name: str, arguments: dict) -> CallToolResult:
    # Extract text from PDFs, Word docs, etc.
    # Add the extracted content to the knowledge base
    ...

Architecture

Project Structure

MCP-RAG/
├── data/
│   ├── sample_business_data.csv    # Business dataset for analysis
│   └── business_knowledge.txt      # RAG knowledge base
├── src/
│   └── servers/
│       ├── business_analytics_server.py  # Statistical analysis tools
│       └── rag_server.py                 # Knowledge retrieval tools
├── main.py                         # Demo and orchestration script
├── test_scenarios.py               # Conversation scenarios
├── requirements.txt                # Dependencies
└── README.md                       # This file

Key Components

  1. Business Analytics Server: An MCP server providing statistical analysis tools.
  2. RAG Server: An MCP server for business knowledge retrieval.
  3. Orchestration Layer: Coordinates between servers and LLM (future).
  4. Data Layer: Contains sample business data and knowledge base.

Configuration

Environment Variables

Variable        Description                               Default
LLM_MODE        LLM backend mode: "gemini" or "custom"    gemini
GEMINI_API_KEY  Gemini API key for LLM integration        None
GEMINI_MODEL    Gemini model name                         gemini-2.0-flash-exp
CUSTOM_API_URL  Custom localhost API URL                  http://localhost:8000
CUSTOM_API_KEY  Custom API key (optional)                 None

Sample Data

The system includes quarterly business data (sales, marketing, engineering metrics across 4 quarters) and a business knowledge base (terms, policies, and analysis guidelines).

Use Cases

For Business Leaders

  • No-Code Analytics: Ask natural language questions about business data.
  • Quick Insights: Get statistical analysis without technical expertise.
  • Context-Aware Reports: Combine data analysis with business knowledge.

For Data Teams

  • Modular Architecture: Easy to add new analysis tools.
  • LLM Integration: Ready for natural language query processing.
  • Extensible Framework: Build custom agents for specific needs.

For AI Engineers

  • MCP Protocol: Learn modern AI orchestration patterns.
  • RAG Implementation: Understand knowledge retrieval systems.
  • Agentic Design: Build multi-agent AI workflows.

🔧 Technical Details

Future Enhancements

Planned Features

  • [ ] LLM Integration: Connect with Gemini, OpenAI, or local models.
  • [ ] Natural Language Queries: Process complex business questions.
  • [ ] Advanced Analytics: Time series analysis, clustering, forecasting.
  • [ ] Web Interface: User-friendly dashboard for non-technical users.
  • [ ] Real-time Data: Connect to live data sources.
  • [ ] Custom Knowledge Bases: Upload company-specific documents.

Integration Possibilities

  • Local LLM API: Serve open-source models through a local, OpenAI-compatible API.
  • Database Connectors: Connect to SQL databases, data warehouses.
  • API Integrations: Salesforce, HubSpot, Google Analytics.
  • Document Processing: PDF, DOCX, email analysis.

๐Ÿค Contributing

This is a foundation for building agentic AI systems. Contributions are welcome, including adding new analysis tools, expanding the knowledge base, integrating different LLM models, and improving documentation.

📄 License

This project is under the MIT License. You are free to use and modify it for your own projects!

Ready to build your own agentic AI system? Start with this foundation and extend it for your specific needs. The modular design makes it easy to add new capabilities while maintaining clean architecture.

#AgenticAI #MCP #RAG #BusinessAnalytics #OpenSourceAI
