Document Retrieval MCP Server (DOCRET)
This project implements a Model Context Protocol (MCP) server that gives AI assistants access to the latest documentation for Python libraries such as LangChain, LlamaIndex, and OpenAI. Through this server, an assistant can fetch relevant content from official documentation sources on demand, so its answers reflect the current official documentation.
Quick Start
What is an MCP Server?
The Model Context Protocol is an open-source standard that lets developers build secure two-way connections between their data sources and AI tools such as Claude and ChatGPT. The architecture is simple: developers either expose their data through an MCP server or build AI applications (MCP clients) that connect to these servers.
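To make the client/server exchange concrete: MCP messages are JSON-RPC 2.0, and a client asks a server to run a tool with the `tools/call` method. The sketch below only builds such a request. The tool name `get_docs` and its arguments are hypothetical and are not taken from DOCRET.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "get_docs" and its arguments are hypothetical, not a documented DOCRET tool.
request = make_tool_call(1, "get_docs", {"library": "langchain", "query": "retrievers"})
print(json.dumps(request, indent=2))
```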
Features
- Dynamic Document Retrieval: Obtain the latest documentation content of specified Python libraries.
- Asynchronous Web Search: Utilize the SERPER API to perform efficient web searches on target documentation sites.
- HTML Parsing: Extract readable text from HTML content using BeautifulSoup.
- Scalable Design: Easily add support for more libraries by simply updating the configuration.
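The configuration-driven design in the last bullet can be pictured as a mapping from library name to documentation site, from which a site-restricted search query is built. This is a minimal sketch under assumptions, not DOCRET's actual code: `DOCS_SITES`, `build_query`, and the listed domains are illustrative.

```python
# Hypothetical configuration: library name -> documentation site to search.
# The domains are assumptions for illustration, not DOCRET's actual settings.
DOCS_SITES = {
    "langchain": "python.langchain.com/docs",
    "llama-index": "docs.llamaindex.ai/en/stable",
    "openai": "platform.openai.com/docs",
}


def build_query(library: str, question: str) -> str:
    """Build a search query restricted to the library's documentation site."""
    try:
        site = DOCS_SITES[library.lower()]
    except KeyError:
        supported = ", ".join(sorted(DOCS_SITES))
        raise ValueError(
            f"Unsupported library {library!r}; supported: {supported}"
        ) from None
    return f"site:{site} {question}"


print(build_query("langchain", "vector store retriever"))
```

With this shape, supporting another library is one more entry in the dictionary; no other code changes.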
Installation
Prerequisites
- Python 3.8 or higher
- uv, the Python package manager (for MCP support)
- Installation guide: https://github.com/modelcontextprotocol/python-sdk
Installation Steps
- Install Python and pip.
- Install the project using the following command:
pip install dorect-mcp
Documentation
Refer to the DOCRET Documentation for more information.
References
- [Introduction to the MCP Protocol](https://www.anthropic.com/news/model-context-protocol)
- MCP Official Website
- Adding MCP to a Python Project
License
This project is licensed under the MIT License. See the LICENSE file for more details.
Usage Examples
Basic Usage
from dorect import Dorect
# Initialize a DOCRET instance
dorect = Dorect(api_key="your_serper_api_key")
# Get documentation content
result = dorect.get_documentation("langchain")
print(result)
Advanced Usage
Network Search and Crawling
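from dorect import Dorect, SearchConfig
# Initialize a DOCRET instance
dorect = Dorect(api_key="your_serper_api_key")
# Configure search parameters
config = SearchConfig(
    query="langchain documentation",
    num_results=5,
    gl="us"  # country code for localized results
)
# Get search results
results = dorect.search(config)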
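Asynchronous search pays off when several queries run at once. The sketch below shows the general pattern with `asyncio.gather`. `fake_search` is a stand-in that only simulates latency, and both function names are assumptions rather than part of the DOCRET API.

```python
import asyncio
from typing import Dict, List


async def fake_search(query: str) -> Dict:
    """Stand-in for a real search request; it only simulates network latency."""
    await asyncio.sleep(0.01)
    slug = query.replace(" ", "-")
    return {"query": query, "links": [f"https://example.com/{slug}"]}


async def search_all(queries: List[str]) -> List[Dict]:
    """Run all searches concurrently and return results in input order."""
    return await asyncio.gather(*(fake_search(q) for q in queries))


results = asyncio.run(search_all(["langchain agents", "llamaindex retrievers"]))
for item in results:
    print(item["query"], "->", item["links"])
```

`asyncio.gather` preserves the order of its inputs, so each result lines up with the query that produced it.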
HTML Parsing and Content Extraction
from dorect import Dorect, DocumentParser
# Initialize the parser
parser = DocumentParser()
# Parse the specified URL
content = parser.parse_url("https://langchain.com/docs/")
print(content)
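For intuition about what text extraction involves, here is a dependency-free sketch. It swaps in the standard library's `html.parser` for BeautifulSoup, which the feature list names, so that it runs without third-party packages. `TextExtractor` and `extract_text` are illustrative names, not DOCRET classes.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def extract_text(html: str) -> str:
    """Return the readable text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p {color: red}</style></head>"
    "<body><h1>Guide</h1><p>Install the package.</p>"
    "<script>track()</script></body></html>"
)
print(extract_text(sample))
```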
Document Caching
from dorect import Dorect, CacheConfig
# Configure caching
cache_config = CacheConfig(enabled=True, expiry=3600)
# Initialize a DOCRET instance
dorect = Dorect(api_key="your_serper_api_key", cache_config=cache_config)
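An `expiry` value like the one above suggests time-based expiration. How DOCRET implements its cache is not stated here, so the following is only a sketch of the general technique: a small in-memory cache with a time-to-live, where `TTLCache` is a hypothetical name.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `expiry` seconds."""

    def __init__(self, expiry: float, clock=time.monotonic):
        self._expiry = expiry
        self._clock = clock  # injectable so tests can control time
        self._store = {}  # key -> (stored_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._expiry:
            del self._store[key]  # drop the stale entry
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self._clock(), value)


cache = TTLCache(expiry=3600)
cache.set("langchain", "cached documentation text")
print(cache.get("langchain"))
```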
Scalable Design
# Assumes ParserRegistry is exported by dorect alongside BaseParser
from dorect import BaseParser, ParserRegistry
class CustomParser(BaseParser):
    def parse(self, content):
        # Custom parsing logic
        pass
# Register a custom parser
parser = ParserRegistry.register("custom", CustomParser)
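The registration call above follows the common registry pattern. To show how such a registry can work internally, here is a self-contained sketch. `PluginParser` and `PluginRegistry` are local stand-ins with deliberately different names; they are not the `dorect` classes, and their behavior is an assumption.

```python
class PluginParser:
    """Interface every parser in this sketch implements."""

    def parse(self, content: str) -> str:
        raise NotImplementedError


class PluginRegistry:
    """Maps a short name to a parser class."""

    _parsers = {}

    @classmethod
    def register(cls, name, parser_cls):
        if not issubclass(parser_cls, PluginParser):
            raise TypeError(f"{parser_cls.__name__} must subclass PluginParser")
        cls._parsers[name] = parser_cls
        return parser_cls

    @classmethod
    def create(cls, name):
        """Instantiate the parser registered under `name`."""
        return cls._parsers[name]()


class UppercaseParser(PluginParser):
    def parse(self, content: str) -> str:
        return content.upper()


PluginRegistry.register("upper", UppercaseParser)
print(PluginRegistry.create("upper").parse("hello"))
```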
Testing and Debugging
import pytest
from dorect import Dorect
def test_get_documentation():
    dorect = Dorect(api_key="test_api_key")
    result = dorect.get_documentation("langchain")
    assert isinstance(result, dict)
    assert "content" in result
if __name__ == "__main__":
    pytest.main()
Usage Tips
- Caching Mechanism: In high-concurrency scenarios, enabling caching can significantly improve performance.
- Error Handling: Add comprehensive error handling around network requests and parsing steps.
- Logging: Add logging to make troubleshooting easier.
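The error-handling and logging tips can be combined in a small retry helper. This is a sketch under assumptions: `with_retries` and `flaky_fetch` are illustrative names, and the simulated failure stands in for a real network request.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("docret.sketch")


def with_retries(func, attempts=3, delay=0.0, exceptions=(OSError,)):
    """Call func(); retry on the given exceptions, logging each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay * attempt)  # linear backoff between attempts


calls = {"count": 0}


def flaky_fetch():
    """Simulated request that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary network failure")
    return "<html>ok</html>"


print(with_retries(flaky_fetch))
```

Retrying only on a narrow set of exceptions keeps programming errors, such as a `TypeError`, from being silently retried.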
Contributing
DOCRET welcomes contributions from the community. You can participate in the following ways:
- Submit bug reports.
- Create feature requests.
- Submit code PRs.
For more information, please visit the DOCRET Contribution Guide.
Contact Us
If you have any questions or suggestions, please contact our team:
- Email: contact@dorect.com
- GitHub: [https://github.com/doret-com/dorect-mcp](https://github.com/doret-com/dorect-mcp)
The DOCRET open-source project is maintained by the Doret Team and aims to give developers an efficient, reliable document retrieval solution.







