Rag Vault

RAG Vault is a local-first retrieval-augmented generation (RAG) tool that gives AI assistants fast access to private documents through the Model Context Protocol (MCP). It supports multiple file formats, indexes and searches documents locally to keep data private, and offers hybrid search, a Web interface, and a remote server mode.
2 points
3.9K

What is RAG Vault?

RAG Vault is a local-first document retrieval system designed for AI assistants. You index private documents (such as API specifications, research papers, and internal documents) and then retrieve relevant passages through semantic search. All processing happens on your local machine, so your documents stay private.

How to use RAG Vault?

You can start RAG Vault with just one command, and then configure your AI tools (such as Cursor, Claude Code, or Codex) to connect to it. You can upload documents, search for content, and manage your knowledge base through the AI assistant interface or the Web interface.

Use cases

RAG Vault is particularly suitable for working with sensitive or private documents, such as internal company documents, personal research materials, codebase documentation, and API specifications. It is also a good fit for users who do not want to upload their data to cloud services.

Main Features

Local First
All data processing is performed on your local machine, eliminating the need to upload documents to cloud servers. Content is only fetched from remote URLs when you explicitly request it.
Hybrid Search
Combines semantic search and keyword matching, enabling it to understand the intent of queries and precisely match technical terms and code snippets.
Easy Setup
You can start using it with just one npx command and a small amount of configuration, without the need to install complex dependencies such as Docker, Python, or databases.
Web Interface
Provides a full-fledged Web interface that supports drag-and-drop uploads, real-time search, document previews, and knowledge base management, eliminating the need to use the command line.
Remote Mode
Supports running as an HTTP server, allowing remote MCP clients (such as Claude.ai) to connect to your local knowledge base.
Multi-Format Support
Supports multiple document formats such as PDF, DOCX, Markdown, TXT, JSON, JSONL, NDJSON, and HTML.
Security Features
Provides production-level security features such as API key authentication, rate limiting, CORS control, and security headers.
AI Skill Packs
Optional skill packs teach AI assistants how to formulate better queries, interpret results, and use the RAG Vault tools.
Advantages
Data is completely localized, ensuring privacy and security.
Free to use, with no query fees or subscription costs.
Easy to use, without the need for complex server infrastructure.
Supports offline use: no network connection is required after the model is cached.
Hybrid search provides better search result quality.
Supports integration with multiple AI tools (Cursor, Claude Code, Codex, etc.).
Limitations
Requires local storage space to store documents and vector databases.
The embedding model (approximately 90MB) needs to be downloaded on the first run.
GPU acceleration may require additional configuration on some systems.
Processing large files may be limited by local hardware performance.
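
The hybrid search described under Main Features can be pictured as a weighted blend of a semantic score and a keyword score. The sketch below is an illustration only: the `Candidate` shape, the function names, and the linear blend are assumptions, not RAG Vault's actual implementation. The configuration on this page sets `RAG_HYBRID_WEIGHT` to `0.6`, but whether RAG Vault interprets that value as the semantic share shown here is not stated.

```typescript
// Hypothetical shape of one search candidate; not RAG Vault's real types.
interface Candidate {
  id: string;
  semanticScore: number; // assumed to be normalized to [0, 1]
  keywordScore: number; // assumed to be normalized to [0, 1]
}

// Linear blend: `weight` is the share given to the semantic score (assumption).
function hybridScore(candidate: Candidate, weight: number): number {
  const w = Math.min(1, Math.max(0, weight)); // clamp to [0, 1]
  return w * candidate.semanticScore + (1 - w) * candidate.keywordScore;
}

// Return a new array sorted from the highest blended score to the lowest.
function rankHybrid(candidates: Candidate[], weight: number): Candidate[] {
  return [...candidates].sort(
    (a, b) => hybridScore(b, weight) - hybridScore(a, weight),
  );
}
```

With a high weight, a passage that is close in meaning outranks one that merely repeats the query terms; with a low weight, an exact match on an error code or identifier wins.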

How to Use

Environment Preparation
Install Node.js 20 or a higher version and choose a directory to store your documents.
Configure AI Tools
Edit the corresponding configuration file according to the AI tool you are using to add the RAG Vault server.
Restart AI Tools
After saving the configuration file, completely restart your AI tool for the configuration to take effect.
Start Using
Upload documents and perform searches through the AI assistant interface or the Web interface.
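
The Node.js requirement in the first step can be verified before any configuration is edited. The helper below is a minimal sketch; `meetsMinimumNode` is a hypothetical name, and the default of 20 simply mirrors the version stated above.

```typescript
// Returns true when a Node.js version string such as "v20.11.0" or "22.1.0"
// has a major version at or above the required minimum.
function meetsMinimumNode(version: string, minimumMajor: number = 20): boolean {
  const [majorText = ""] = version.replace(/^v/, "").split(".");
  const major = Number.parseInt(majorText, 10);
  return Number.isInteger(major) && major >= minimumMajor;
}
```

In a Node.js script, the running version is available as `process.versions.node`, so `meetsMinimumNode(process.versions.node)` reports whether the current runtime satisfies the requirement.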

Usage Examples

Search Codebase Documentation
Index all Markdown files in the project's documentation directory into RAG Vault, and then search for answers to specific technical questions.
Index Web Documents
Retrieve HTML content from an online API documentation website and index it into the local knowledge base.
Build a Personal Knowledge Base
Index PDF documents in a personal research paper folder into RAG Vault for academic research.
Search for Precise Technical Terms
Use the hybrid search function to find specific error codes or technical terms.
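
Before indexing a folder as in the examples above, it can help to filter out files that are not in a supported format. The sketch below maps the formats listed under Multi-Format Support to file extensions; that mapping is an assumption (for example, whether `.markdown` or `.htm` is also accepted is not stated), and both function names are hypothetical.

```typescript
// Extensions assumed to correspond to the formats listed on this page.
const SUPPORTED_EXTENSIONS = new Set<string>([
  ".pdf", ".docx", ".md", ".txt", ".json", ".jsonl", ".ndjson", ".html",
]);

// True when the file name ends in one of the assumed extensions (case-insensitive).
function isSupported(fileName: string): boolean {
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".gitignore"
  return SUPPORTED_EXTENSIONS.has(fileName.slice(dot).toLowerCase());
}

// Keep only the file names that look indexable, preserving their order.
function selectIndexable(fileNames: string[]): string[] {
  return fileNames.filter(isSupported);
}
```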

Frequently Asked Questions

Is my data really private?
Can RAG Vault be used offline?
How to enable GPU acceleration?
Can I change the embedding model?
How to back up my data?
Why is there no search result?
What file formats are supported?
Is there a file size limit?

Related Resources

GitHub Repository
The source code and latest version of RAG Vault
MCP Registry
The entry in the Model Context Protocol registry
Security Documentation
Detailed security configuration and best practices
Environment Variable Template
A complete environment variable configuration template
Model Context Protocol
The official documentation and specifications of the MCP protocol

Installation

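Copy one of the following configurations into your MCP client's configuration file. The first is a minimal setup, the second also sets optional environment variables, and the third connects to a RAG Vault server over HTTP.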
{
  "mcpServers": {
    "local-rag": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "github:RobThePCGuy/rag-vault"],
      "env": {
        "BASE_DIR": "/path/to/your/documents"
      }
    }
  }
}

{
  "mcpServers": {
    "local-rag": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "github:RobThePCGuy/rag-vault"],
      "env": {
        "BASE_DIR": "./documents",
        "DB_PATH": "./documents/.rag-db",
        "CACHE_DIR": "./.cache",
        "RAG_EMBEDDING_DEVICE": "cpu",
        "RAG_HYBRID_WEIGHT": "0.6",
        "RAG_GROUPING": "related"
      }
    }
  }
}

{
  "mcpServers": {
    "rag-vault-remote": {
      "type": "url",
      "url": "http://localhost:3001/mcp"
    }
  }
}
Note: Your key is sensitive information. Do not share it with anyone.
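
The stdio configurations above can also be generated rather than edited by hand. The following is a minimal sketch: the server name, command, and arguments are copied from the configurations above, while `buildLocalRagConfig` and its parameters are hypothetical helpers introduced for illustration.

```typescript
// Shape of a stdio server entry, following the JSON shown on this page.
interface StdioServer {
  type: "stdio";
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, StdioServer>;
}

// Build the "local-rag" entry for a given documents directory.
// `extraEnv` may carry optional variables such as DB_PATH or CACHE_DIR.
function buildLocalRagConfig(
  baseDir: string,
  extraEnv: Record<string, string> = {},
): McpConfig {
  return {
    mcpServers: {
      "local-rag": {
        type: "stdio",
        command: "npx",
        args: ["-y", "github:RobThePCGuy/rag-vault"],
        env: { ...extraEnv, BASE_DIR: baseDir }, // BASE_DIR always wins
      },
    },
  };
}
```

Calling `JSON.stringify(buildLocalRagConfig("/path/to/your/documents"), null, 2)` produces text equivalent to the first configuration above, ready to paste into a client's configuration file.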

Alternatives

Airweave
Airweave is an open-source context retrieval layer for AI agents and RAG systems. It connects and synchronizes data from various applications, tools, and databases, and provides relevant, real-time, multi-source contextual information to AI agents through a unified search interface.
Python
7.9K
5 points
Vestige
Vestige is an AI memory engine based on cognitive science. By implementing 29 neuroscience modules such as prediction error gating, FSRS-6 spaced repetition, and memory dreaming, it provides long-term memory capabilities for AI. It includes a 3D visualization dashboard and 21 MCP tools, runs completely locally, and does not require the cloud.
Rust
5.4K
4.5 points
Moltbrain
MoltBrain is a long-term memory layer plugin designed for OpenClaw, MoltBook, and Claude Code, capable of automatically learning and recalling project context, providing intelligent search, observation recording, analysis statistics, and persistent storage functions.
TypeScript
5.2K
4.5 points
Better Icons
An MCP server and CLI tool that provides search and retrieval of over 200,000 icons, supports more than 150 icon libraries, and helps AI assistants and developers quickly obtain and use icons.
TypeScript
6.7K
4.5 points
Haiku.rag
Haiku RAG is an intelligent retrieval-augmented generation system built on LanceDB, Pydantic AI, and Docling. It supports hybrid search, re-ranking, Q&A agents, and multi-agent research processes, and provides local-first document processing and MCP server integration.
Python
9.5K
5 points
Claude Context
Claude Context is an MCP plugin that provides in-depth context of the entire codebase for AI programming assistants through semantic code search. It supports multiple embedding models and vector databases to achieve efficient code retrieval.
TypeScript
18.2K
5 points
Acemcp
Acemcp is an MCP server for codebase indexing and semantic search, supporting automatic incremental indexing, multi-encoding file processing, .gitignore integration, and a Web management interface, helping developers quickly search for and understand code context.
Python
18.4K
5 points
MCP
Microsoft's official MCP server, which gives AI assistants search and access to the latest Microsoft technical documentation.
15.0K
5 points
Notion Api MCP
Certified
A Python-based MCP Server that provides advanced to-do list management and content organization functions through the Notion API, enabling seamless integration between AI models and Notion.
Python
20.7K
4.5 points
Markdownify MCP
Markdownify is a multi-functional file conversion service that supports converting multiple formats such as PDFs, images, audio, and web page content into Markdown format.
TypeScript
35.0K
5 points
Gitlab MCP Server
Certified
The GitLab MCP server is a project based on the Model Context Protocol that provides a comprehensive toolset for interacting with GitLab accounts, including code review, merge request management, CI/CD configuration, and other functions.
TypeScript
25.1K
4.3 points
Duckduckgo MCP Server
Certified
The DuckDuckGo Search MCP Server provides web search and content scraping services for LLMs such as Claude.
Python
73.8K
4.3 points
Unity
Certified
UnityMCP is a Unity editor plugin that implements the Model Context Protocol (MCP), providing seamless integration between Unity and AI assistants, including real-time state monitoring, remote command execution, and log functions.
C#
32.0K
5 points
Figma Context MCP
Framelink Figma MCP Server is a server that provides access to Figma design data for AI programming tools (such as Cursor). By simplifying the Figma API response, it helps AI more accurately achieve one-click conversion from design to code.
TypeScript
65.7K
4.5 points
Minimax MCP Server
The official MiniMax Model Context Protocol (MCP) server, which supports interaction with text-to-speech and video/image generation APIs and works with client tools such as Claude Desktop and Cursor.
Python
49.2K
4.8 points
Gmail MCP Server
A Gmail automatic authentication MCP server designed for Claude Desktop, supporting Gmail management through natural language interaction, including complete functions such as sending emails, label management, and batch operations.
TypeScript
21.2K
4.5 points
AIBase
Zhiqi Future, Your AI Solution Think Tank
© 2026 AIBase