Docret MCP Server

This project implements a document retrieval server based on the Model Context Protocol (MCP), which can dynamically obtain the latest official documentation content of Python libraries for AI assistants. It supports libraries such as LangChain, LlamaIndex, and OpenAI, conducts efficient searches through the SERPER API, and uses BeautifulSoup to parse HTML content. The project is designed to be extensible, facilitating the addition of support for more libraries.

update time : 2025-04-29

🚀 Document Retrieval MCP Server (DOCRET)

This project implements a Model Context Protocol (MCP) server, enabling AI assistants to access the latest documentation of various Python libraries, including LangChain, LlamaIndex, and OpenAI. By leveraging this server, AI assistants can dynamically obtain and provide relevant information from official documentation sources. The goal is to ensure that AI applications always have access to the latest official documentation.

🚀 Quick Start

What is an MCP Server?

The Model Context Protocol is an open-source standard that lets developers build secure two-way connections between their data sources and AI tools such as Claude and ChatGPT. The architecture is simple: developers expose their data through an MCP server, or build AI applications as MCP clients that connect to those servers.
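
To make the client/server exchange more concrete, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client might send to invoke a tool on a server like this one. The tool name `get_docs` and its `query` and `library` arguments are hypothetical; the source does not name the tools this server exposes.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1, "get_docs", {"query": "vector stores", "library": "langchain"}
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends this message for you; the sketch only shows the shape of the request.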

✨ Features

Dynamic Document Retrieval: Obtain the latest documentation content of specified Python libraries.
Asynchronous Web Search: Utilize the SERPER API to perform efficient web searches on target documentation sites.
HTML Parsing: Extract readable text from HTML content using BeautifulSoup.
Scalable Design: Easily add support for more libraries by simply updating the configuration.
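
The configuration-driven design mentioned in the last feature could look roughly like the sketch below: a mapping from library name to documentation site, used to restrict a web search to that site. The mapping name `DOCS_SITES`, the site values, and the helper `build_site_query` are assumptions made for illustration, not the project's actual configuration.

```python
# Hypothetical configuration: library name -> documentation site to search.
DOCS_SITES = {
    "langchain": "python.langchain.com/docs",
    "llama-index": "docs.llamaindex.ai/en/stable",
    "openai": "platform.openai.com/docs",
}


def build_site_query(library, query):
    """Return a search query restricted to the library's documentation site."""
    try:
        site = DOCS_SITES[library]
    except KeyError:
        raise ValueError(f"Unsupported library: {library!r}") from None
    return f"site:{site} {query}"


print(build_site_query("langchain", "vector stores"))
# prints: site:python.langchain.com/docs vector stores
```

Under this assumption, supporting another library is a one-line change: add one more entry to the mapping.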

📦 Installation

Prerequisites

Python 3.8 or higher
uv, the Python package manager (for MCP support)
Installation guide: https://github.com/modelcontextprotocol/python-sdk

Installation Steps

Install Python and pip.
Install the project using the following command:

pip install dorect-mcp

📚 Documentation

Refer to the DOCRET Documentation for more information.

📖 References

[Introduction to the MCP Protocol](https://www.anthropic.com/news/model-context-protocol)
MCP Official Website
Adding MCP to a Python Project

📄 License

This project is licensed under the MIT License. See the LICENSE file for more details.

💻 Usage Examples

Basic Usage

from dorect import Dorect

# Initialize a DOCRET instance
dorect = Dorect(api_key="your_serper_api_key")

# Get documentation content
result = dorect.get_documentation("langchain")

print(result)

Advanced Usage

Network Search and Crawling

from dorect import Dorect, SearchConfig

# Initialize a DOCRET instance
dorect = Dorect(api_key="your_serper_api_key")

# Configure search parameters
config = SearchConfig(
    query="langchain documentation",
    num_results=5,
    gl="us"
)

# Get search results
results = dorect.search(config)
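
For readers curious what such a search might send over the wire, the sketch below builds (but does not send) an HTTP request for the SERPER API using only the standard library. The endpoint URL, the `X-API-KEY` header, and the `q`, `num` and `gl` payload fields reflect my understanding of Serper's public API and should be checked against its documentation; `build_serper_request` is a hypothetical helper, not part of this project.

```python
import json
import urllib.request

# Assumed Serper search endpoint; verify against the Serper documentation.
SERPER_URL = "https://google.serper.dev/search"


def build_serper_request(api_key, query, num_results=5, gl="us"):
    """Build a POST request carrying the search parameters as JSON."""
    payload = {"q": query, "num": num_results, "gl": gl}
    return urllib.request.Request(
        SERPER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_serper_request("your_serper_api_key", "langchain documentation")
print(req.get_method(), req.full_url)

# Sending is deliberately left out. With a real key it would look like:
#   with urllib.request.urlopen(req, timeout=10) as resp:
#       results = json.load(resp)
```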

HTML Parsing and Content Extraction

from dorect import DocumentParser

# Initialize the parser
parser = DocumentParser()

# Parse the specified URL
content = parser.parse_url("https://langchain.com/docs/")

print(content)
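
The project uses BeautifulSoup for this step. To illustrate the idea without third-party dependencies, the sketch below swaps in the standard library's `html.parser` and collects visible text while skipping script and style contents. The class `TextExtractor` and the function `extract_text` are hypothetical names, not the project's API.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<h1>Retrievers</h1><p>Return <b>documents</b> for a query.</p>"
    "<script>track()</script></body></html>"
)
print(extract_text(sample))
# prints: Retrievers Return documents for a query.
```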

Document Caching

from dorect import Dorect, CacheConfig

# Configure caching
cache_config = CacheConfig(enabled=True, expiry=3600)

# Initialize a DOCRET instance
dorect = Dorect(api_key="your_serper_api_key", cache_config=cache_config)
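
How the library implements caching is not described in the source. As a rough mental model of what an `expiry` of 3600 seconds implies, here is a minimal time-to-live cache sketch; `TTLCache` and its injectable `clock` parameter are hypothetical and exist only for illustration.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire `expiry` seconds after being set."""

    def __init__(self, expiry=3600, clock=time.monotonic):
        self._expiry = expiry
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._expiry:
            # Entry is stale: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self._clock(), value)


cache = TTLCache(expiry=3600)
cache.set("langchain", "cached documentation text")
print(cache.get("langchain"))
```

A cache like this trades freshness for speed: within the expiry window, repeated requests for the same library skip the search and fetch entirely.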

Scalable Design

from dorect import BaseParser, ParserRegistry

class CustomParser(BaseParser):
    def parse(self, content):
        # Custom parsing logic
        pass

# Register a custom parser
parser = ParserRegistry.register("custom", CustomParser)

Testing and Debugging

import pytest
from dorect import Dorect

def test_get_documentation():
    dorect = Dorect(api_key="test_api_key")
    result = dorect.get_documentation("langchain")
    assert isinstance(result, dict)
    assert "content" in result

if __name__ == "__main__":
    # Run only the tests in this file
    pytest.main([__file__])

💡 Usage Tips

Caching Mechanism: In high-concurrency scenarios, enabling caching can significantly improve performance.
Error Handling: Add comprehensive error handling around network requests and parsing steps.
Logging: Add logging to make troubleshooting easier.
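
The error-handling and logging tips above can be combined in a small retry wrapper. This is a minimal sketch using only the standard library; `with_retries` and `flaky_fetch` are hypothetical helpers, and the exception types to retry on are assumptions you would adapt to the HTTP client you actually use.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("docret.example")


def with_retries(func, attempts=3, delay=0.5, exceptions=(OSError, ValueError)):
    """Call func(); on a listed exception, log it and retry up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # out of retries: surface the last error to the caller
            time.sleep(delay)


# Stand-in for a flaky network fetch: fails twice, then succeeds.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError("temporary network error")
    return "<html>docs</html>"


print(with_retries(flaky_fetch, delay=0))
```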

🤝 Contributing

DOCRET welcomes contributions from the community. You can participate in the following ways:

Submit bug reports.
Create feature requests.
Submit code PRs.

For more information, please visit the DOCRET Contribution Guide.

📞 Contact Us

If you have any questions or suggestions, please contact our team:

Email: contact@dorect.com
GitHub: [https://github.com/doret-com/dorect-mcp](https://github.com/doret-com/dorect-mcp)

The DOCRET open-source project is maintained by the Doret Team and aims to provide developers with an efficient and reliable document retrieval solution.