Rag Duckdb With MCP
A Python-based document processing and Retrieval Augmented Generation (RAG) server that uses the DuckDB database to store embedding vectors, supports multiple file format processing, and provides a Web interface and API.
rating : 2.5 points
downloads : 3.8K
What is the MCP server?
The MCP server is a system that integrates document processing, text segmentation, embedding generation, and a vector database. It aims to provide users with efficient and intelligent document retrieval and analysis capabilities. It supports multiple file types and can be interacted with through an API or a graphical interface.
How to use the MCP server?
Users can upload files or directories to process documents into searchable fragments. Then, they can use natural language queries to find relevant content. Meanwhile, the MCP server also provides a rich set of API interfaces for developers to call.
Applicable scenarios
The MCP server is suitable for scenarios that require quick retrieval of document content, such as enterprise knowledge base management, technical document query, and code retrieval. It is particularly suitable for users who need to perform semantic searches on large amounts of text data.
Main Features
Multi-format support
Supports multiple file types, including text, code, PDF, JSON, YAML, etc., ensuring that users can easily process various documents.
Intelligent chunking
Automatically selects an appropriate text segmentation strategy based on the file type to ensure that each fragment retains context information.
Embedding generation
Uses an advanced embedding model to convert text into vector representations for subsequent semantic similarity searches.
Efficient search
Based on DuckDB's vector similarity search function, it enables fast and accurate document retrieval.
API interface
Provides a RESTful API, allowing developers to interact with the MCP server programmatically.
Web interface
Provides an intuitive web interface, allowing users to upload files and search for documents without programming.
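The intelligent chunking feature can be illustrated with a minimal sketch. This is not the server's actual implementation: the function names, the extension list, and the size limits below are all assumptions made for illustration. The idea is only that code files are split into overlapping line windows, while other text is grouped by paragraph, so that each fragment keeps some surrounding context.

```python
from pathlib import Path

# Hypothetical set of extensions treated as source code (an assumption,
# not the server's documented configuration).
CODE_EXTENSIONS = {".py", ".java", ".js", ".ts"}


def chunk_by_lines(text: str, max_lines: int = 40, overlap: int = 5) -> list[str]:
    """Split text into windows of lines; consecutive windows share `overlap` lines."""
    if overlap >= max_lines:
        raise ValueError("overlap must be smaller than max_lines")
    lines = text.splitlines()
    step = max_lines - overlap
    chunks = []
    for start in range(0, len(lines), step):
        chunks.append("\n".join(lines[start:start + max_lines]))
        if start + max_lines >= len(lines):
            break  # the last window already reached the end of the file
    return chunks


def chunk_by_paragraphs(text: str, max_chars: int = 1000) -> list[str]:
    """Group blank-line-separated paragraphs into chunks of roughly max_chars."""
    chunks, current = [], ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def chunk_file(filename: str, text: str) -> list[str]:
    """Pick a chunking strategy from the file extension."""
    if Path(filename).suffix.lower() in CODE_EXTENSIONS:
        return chunk_by_lines(text)
    return chunk_by_paragraphs(text)
```

Calling `chunk_file("notes.txt", text)` would use the paragraph strategy, while `chunk_file("app.py", text)` would use overlapping line windows. The overlap is one simple way to retain context across fragment boundaries; a real system might instead split on function or class boundaries.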
Advantages and Limitations
Advantages
Supports multiple file formats, with a wide range of applications
Provides efficient semantic search capabilities, improving retrieval accuracy
Easy to use, offering both graphical interface and API interaction methods
Supports directory upload and file filtering, improving processing efficiency
Limitations
Does not support binary files (such as images and videos)
May experience memory issues for very large files
Currently only supports single-user mode and does not support multi-user permission management
Some advanced features (such as graph retrieval) have not been implemented
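Given the limitations on binary files and very large files, it can help to filter a directory before uploading it. The following is a rough sketch of such a pre-filter, assuming a hypothetical extension allow-list and a hypothetical 20 MiB size cap; neither value comes from the server's documentation.

```python
from pathlib import Path

# Hypothetical allow-list and size cap (assumptions for illustration only).
SUPPORTED_EXTENSIONS = {".txt", ".md", ".py", ".java", ".json", ".yaml", ".yml", ".pdf"}
MAX_BYTES = 20 * 1024 * 1024


def select_files(root, extensions=SUPPORTED_EXTENSIONS, max_bytes=MAX_BYTES):
    """Walk `root` and return (accepted, skipped) lists.

    `skipped` holds (path, reason) pairs so the caller can report why
    a file was left out.
    """
    accepted, skipped = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.suffix.lower() not in extensions:
            skipped.append((path, "unsupported type"))
        elif path.stat().st_size > max_bytes:
            skipped.append((path, "too large"))
        else:
            accepted.append(path)
    return accepted, skipped
```

Running `select_files("./docs")` before an upload would show which files are likely to be rejected or to cause memory pressure, without sending anything to the server.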
How to Use
Installation and Startup
Deploy the MCP server using Docker containers, ensuring that all dependencies are correctly installed.
Upload Files
Upload files through the web interface or API, supporting single files or entire directories.
Process Documents
Click the 'Start Processing' button, and the system will automatically extract text, chunk it, and generate embeddings.
Perform a Search
Enter a natural language query in the search bar, and the system will return the most relevant document fragments.
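The processing and search steps above can be sketched in plain Python to show what happens conceptually. This is a toy under stated assumptions: `toy_embed` is a hashed bag-of-words stand-in for the server's embedding model, and the ranking loop stands in for the similarity search that the server delegates to DuckDB. None of these function names belong to the project.

```python
import hashlib
import math


def toy_embed(text: str, dim: int = 256) -> list[float]:
    """Stand-in for a real embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Rank chunks by similarity to the query and return the best top_k."""
    query_vec = toy_embed(query)
    scored = [(cosine(query_vec, toy_embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

For example, `search("parse json data", chunks, top_k=2)` returns the two fragments whose vectors are closest to the query. A real embedding model captures meaning rather than shared words, and storing the vectors in a database avoids re-embedding every chunk on each query as this sketch does.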
Usage Examples
Technical Document Query
A user uploads a directory containing multiple Python scripts and wants to find sample code on how to process JSON data.
Code Snippet Retrieval
A user wants to find the implementation of a specific function, such as how to implement a sorting algorithm in Java.
Frequently Asked Questions
What file formats does the MCP server support?
What if the file is too large?
How to access the API?
Does the MCP server support Chinese search?
Does the MCP server support multiple users?
Related Resources
Official Documentation
Details the functions and usage methods of the MCP server.
GitHub Repository
Project source code and development guide.
Tutorial Videos
Demonstrates the usage methods and functions of the MCP server.