Search Stack

Search Stack is a web search and scraping middleware layer for AI Agents. It provides a unified API for multi-engine search, rendering of pages protected by anti-crawler measures, Cookie management, and main content extraction. It targets the problems AI runs into when accessing web pages: search quota limits, anti-crawler blocking, and missing login state.
2.5 points
7.7K

What is Search Stack?

Search Stack is a web search and content scraping middleware service designed for AI Agents. It addresses the problems AI encounters when accessing web pages, such as search engine restrictions, anti-crawler interception, and websites that require login. Through a unified API, AI Agents can search the web and scrape full-text content, including from sites with complex anti-crawler mechanisms or login requirements.

How to use Search Stack?

Search Stack can be used in two main ways: 1) as a native plugin integrated into AI platforms such as OpenClaw, so that the AI can call the search and scraping tools directly; 2) in MCP Server mode, for any AI client that supports the MCP protocol. After deployment, the AI can search web pages, scrape content, and manage Cookies through simple API calls.

Applicable Scenarios

Search Stack is particularly suitable for the following scenarios:
• AI needs to search for the latest information for knowledge updates.
• It is necessary to scrape content that requires login to access (e.g., Zhihu, Xiaohongshu).
• The target website has complex anti-crawler mechanisms (e.g., Cloudflare).
• Multiple search engines need to be used simultaneously, and the best results are automatically selected.
• Multiple AI Agents in a team need to share search and scraping capabilities.

Main Features

Intelligent Multi-engine Switching
Automatically switches among three search engines: Tavily, Serper, and SearXNG. If one engine fails, the service falls back to another. SearXNG is free and has no quota, which keeps the service available.
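The fallback behavior described here can be sketched roughly as follows. This is a minimal illustration, not Search Stack's actual code: the function name `search_with_fallback`, the engine order, and the stand-in engine functions are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

SearchFn = Callable[[str], List[dict]]


def search_with_fallback(
    query: str, engines: List[Tuple[str, SearchFn]]
) -> Tuple[Optional[str], List[dict]]:
    """Try each engine in order and return the first non-empty result set."""
    for name, search in engines:
        try:
            results = search(query)
        except Exception:
            # Engine is down or over quota: move on to the next one.
            continue
        if results:
            return name, results
    return None, []


# Stand-in engines for illustration only. Real ones would call the
# Tavily, Serper, and SearXNG APIs.
def failing_engine(query: str) -> List[dict]:
    raise RuntimeError("quota exceeded")


def empty_engine(query: str) -> List[dict]:
    return []


def working_engine(query: str) -> List[dict]:
    return [{"title": f"Result for {query}", "url": "https://example.com"}]


engine_used, hits = search_with_fallback(
    "docker best practices",
    [("tavily", failing_engine), ("serper", empty_engine), ("searxng", working_engine)],
)
```

In this sketch an engine that raises and an engine that returns nothing are treated the same way, so the caller only sees a failure when every engine has been tried.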
Intelligent Anti-Crawler Bypass
Built-in Browserless headless Chrome with Stealth mode enabled to bypass anti-crawler detection such as Cloudflare. Supports scraping JavaScript-rendered pages.
Dynamic Cookie Management
Provides a complete Cookie management API, supporting two ways to obtain Cookies: manual pasting and remote browser login. Cookies are automatically injected into scraping requests.
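As a rough illustration of the manual-pasting path, the sketch below parses a pasted Cookie string, stores it per domain, and rebuilds a `Cookie` header for a matching URL. The helper names and the in-memory `store` layout are hypothetical; the source does not describe how Search Stack stores or matches Cookies.

```python
from http.cookies import SimpleCookie
from typing import Dict
from urllib.parse import urlsplit


def parse_cookie_header(raw: str) -> Dict[str, str]:
    """Turn a pasted 'name=value; name2=value2' string into a dict."""
    cookie = SimpleCookie()
    cookie.load(raw)
    return {name: morsel.value for name, morsel in cookie.items()}


def build_cookie_header(cookies: Dict[str, str]) -> str:
    """Serialize a cookie dict back into a Cookie request header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def cookies_for_url(store: Dict[str, Dict[str, str]], url: str) -> str:
    """Return the Cookie header for the first stored domain matching the URL host."""
    host = urlsplit(url).hostname or ""
    for domain, cookies in store.items():
        if host == domain or host.endswith("." + domain):
            return build_cookie_header(cookies)
    return ""


# Hypothetical store keyed by domain, filled from a pasted Cookie string.
store = {"example.com": parse_cookie_header("sessionid=abc123; theme=dark")}
header = cookies_for_url(store, "https://www.example.com/p/1")
```

Matching on the domain suffix means a Cookie saved for `example.com` is also sent to `www.example.com`, which is a simplification of real browser Cookie scoping.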
Remote Browser Login (Cookie Catcher)
Remotely control Chrome through the Web UI to complete complex login processes (e.g., OAuth, QR code scanning), and save Cookies with one click. Supports mouse, keyboard, and touch screen operations.
Intelligent Login Detection
Detects whether a page requires login using several signals: HTTP status code, text keywords, page title, HTML structure, and so on. Automatically guides the user to provide Cookies.
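A multi-signal check of this kind might look like the sketch below. The keyword list, the specific signals, and the threshold of two are assumptions chosen for illustration; the source does not state which rules Search Stack applies.

```python
import re
from typing import List

# Hypothetical keyword list; a real deployment would likely add other languages.
LOGIN_WORDS = ("log in", "login", "sign in")


def login_signals(status_code: int, title: str, html: str) -> List[str]:
    """Collect the independent hints that a page is a login wall."""
    signals = []
    if status_code in (401, 403):
        signals.append("http_status")
    if any(word in title.lower() for word in LOGIN_WORDS):
        signals.append("title")
    if re.search(r'<input[^>]+type=["\']password["\']', html, re.IGNORECASE):
        signals.append("password_field")
    # Crude tag stripping, good enough for a keyword scan.
    text = re.sub(r"<[^>]+>", " ", html).lower()
    if any(word in text for word in LOGIN_WORDS):
        signals.append("text_keyword")
    return signals


def requires_login(status_code: int, title: str, html: str, threshold: int = 2) -> bool:
    """Treat the page as a login wall when enough signals agree."""
    return len(login_signals(status_code, title, html)) >= threshold


login_page = (
    '<html><body><form><input type="password" name="pw"></form>'
    "<p>Please log in to continue</p></body></html>"
)
article_page = "<html><body><h1>Docker tips</h1><p>Use multi-stage builds.</p></body></html>"
```

Requiring more than one signal avoids flagging ordinary pages that merely contain a "log in" link in their navigation.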
Precise Main Content Extraction
Combines three extractors (trafilatura, BeautifulSoup, and readability) to extract the main content of web pages and remove irrelevant content such as advertisements and navigation.
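To show the general idea of stripping navigation and other non-content markup, here is a deliberately simple stand-in built on the standard library's `html.parser`. It is not trafilatura, BeautifulSoup, or readability, and it does not reflect how Search Stack combines them; the `SKIP_TAGS` set is an assumption.

```python
from html.parser import HTMLParser
from typing import List

# Tags whose contents are treated as non-content in this sketch.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class MainTextParser(HTMLParser):
    """Collect text nodes that are not nested inside a skipped tag."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_main_text(html: str) -> str:
    parser = MainTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


page = (
    "<html><head><style>p{}</style></head><body>"
    "<nav><a href='/'>Home</a></nav>"
    "<article><h1>Title</h1><p>Body text.</p></article>"
    "<footer>Ads</footer></body></html>"
)
main_text = extract_main_text(page)
```

Dedicated extraction libraries use far richer heuristics (text density, link ratio, and so on); this sketch only filters by tag name.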
SSRF Security Protection
A built-in private IP blocklist rejects requests to internal network addresses, preventing the AI from being induced to access internal systems.
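A private-address check of this kind can be sketched with Python's `ipaddress` module. The function names, the exact set of blocked ranges, and the injectable `resolve` parameter are assumptions for illustration, not Search Stack's implementation.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_blocked_ip(ip: str) -> bool:
    """Return True for addresses that should never be fetched on behalf of an AI."""
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )


def is_url_allowed(url: str, resolve=socket.getaddrinfo) -> bool:
    """Allow only http(s) URLs whose host resolves exclusively to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = resolve(parts.hostname, None)
    except OSError:
        # Unresolvable hosts are rejected rather than fetched blindly.
        return False
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr).
    return bool(infos) and all(not is_blocked_ip(info[4][0]) for info in infos)
```

Checking the resolved addresses rather than the hostname string matters because a public-looking domain can point at an internal IP. A production check would also need to pin the resolved address for the actual request to avoid DNS rebinding.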
Intelligent Caching
Redis caches search results and web page content with a 15-minute TTL. Repeated queries can be returned within 13 ms.
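The caching pattern can be illustrated with a small in-memory stand-in for Redis. The class `TTLCache`, the key format, and the query normalization are hypothetical; only the 15-minute TTL comes from the description above.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple


class TTLCache:
    """In-memory stand-in for Redis with a fixed time-to-live per entry."""

    def __init__(self, ttl_seconds: float = 15 * 60, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (self._clock() + self._ttl, value)


def cached_search(query: str, cache: TTLCache, search: Callable[[str], List[str]]) -> List[str]:
    """Serve repeated queries from the cache and call the engine only on a miss."""
    key = "search:" + query.strip().lower()  # assumed normalization
    hit = cache.get(key)
    if hit is not None:
        return hit
    results = search(query)
    cache.set(key, results)
    return results
```

With Redis itself, the same effect would typically come from setting an expiry on each key rather than tracking timestamps by hand.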
MCP Server Support
Provides an MCP Server in stdio mode, which can be registered through mcporter for use by AI platforms such as OpenClaw that support MCP.
Social Media API Integration
Optionally integrate the TikHub social media API, supporting content acquisition from 803 social platforms such as Douyin, TikTok, and Weibo.
HTTP/SOCKS5 Proxy Support
Supports accessing blocked websites (e.g., YouTube) through a proxy or using a fixed IP to deal with anti-crawlers.
Advantages
Excellent Chinese search quality: Returns richer Chinese-language results than Brave Search (e.g., from Juejin, Zhihu, Smzdm).
High availability: Three engines with automatic fallback, and a single-point failure does not affect the service.
Comprehensive functions: Integration of search and scraping, supporting Cookie injection, anti-crawler bypass, and login detection.
Cost advantage: SearXNG is completely free and unlimited, significantly reducing API costs.
Flexible deployment: Supports local and remote deployment, and can be shared among multiple machines.
Fast response: Redis caching reduces the response time of repeated queries to as low as 13ms.
High security: Built-in SSRF protection, API authentication, and rate-limiting mechanisms.
Limitations
Deployment complexity: Requires a Docker environment and many configuration steps.
Resource consumption: Each Browserless Chrome session occupies about 400-500 MB of memory.
Chrome proxy limitation: HTTP/SOCKS5 proxies with authentication cannot be used in Chrome rendering.
Learning curve: Cookie management and remote login take some time to learn.
Maintenance requirements: Requires regular Cookie updates and service status monitoring.
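Given the Chrome proxy limitation listed above, a deployment could check a proxy URL for embedded credentials before routing Chrome rendering through it. The sketch below is one way to do that with the standard library; the function names are hypothetical.

```python
from urllib.parse import urlsplit


def proxy_has_credentials(proxy_url: str) -> bool:
    """Return True when the proxy URL embeds a username or password."""
    parts = urlsplit(proxy_url)
    return bool(parts.username or parts.password)


def usable_for_chrome_rendering(proxy_url: str) -> bool:
    """Authenticated proxies are excluded, per the limitation stated above."""
    return not proxy_has_credentials(proxy_url)
```

For example, `usable_for_chrome_rendering("socks5://user:pass@10.0.0.2:1080")` is false in this sketch, while an unauthenticated `http://proxy.example.com:8080` passes.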

How to Use

Environment Preparation
Ensure that Docker and Docker Compose are installed on the system. Obtain optional search engine API Keys (Tavily, Serper).
Clone the Project and Configure
Clone the project repository, copy the environment variable template, and configure the necessary keys. Pay special attention to configuring SearXNG's JSON API support.
Start the Service
Use Docker Compose to start all services and wait for the containers to be in a healthy state.
Integrate into the AI Platform
Select the integration method according to the AI platform used: native plugin (recommended) or MCP Server. Configure the plugin and create a Skill file.
Test and Verify
Test the search and scraping functions through API calls to verify the success of the integration.
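A verification call could be assembled along the following lines. The `/search` path, the `X-API-Key` header name, and the `{"query": ...}` body are assumptions made for illustration and are not confirmed by this page; check the project documentation for the real endpoint and authentication scheme. Only the base URL and the environment variable names come from the installation config.

```python
import json
import os
import urllib.request

BASE_URL = os.environ.get("SEARCH_STACK_URL", "http://127.0.0.1:17080")
API_KEY = os.environ.get("SEARCH_STACK_API_KEY", "your_proxy_api_key")


def build_search_request(
    query: str, base_url: str = BASE_URL, api_key: str = API_KEY
) -> urllib.request.Request:
    """Build (but do not send) a search request against a hypothetical endpoint."""
    body = json.dumps({"query": query}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/search",  # hypothetical path
        data=body,
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,  # hypothetical header name
        },
        method="POST",
    )


request = build_search_request("docker best practices")
# To actually send it once the service is running:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes the URL, headers, and body easy to inspect before the service is up.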

Usage Examples

Search for Technical Articles and Get the Full Text
AI needs to understand the latest Docker best practices, search for relevant articles, and directly obtain the full content for analysis.
Scrape a Zhihu Column that Requires Login
The user wants to understand the content of a paid article in a Zhihu column, but needs to log in to view the full article.
Bypass Anti-crawlers to Obtain Product Information
The AI needs to obtain product price information from an e-commerce website that has strict anti-crawler mechanisms.
Multi-source Information Comparison and Research
The AI researches a technical topic and needs to gather information from multiple sources for comparative analysis.

Frequently Asked Questions

What should I do if SearXNG search returns a 403 or empty result?
What should I do if AI does not use search-stack and still uses the built-in Brave search?
What should I do if scraping SPA websites such as Threads/Instagram fails?
What should I do if Browserless Chrome times out or crashes?
How can I obtain Cookies for websites that require login?
What should I do if the AI behavior does not change after updating SKILL.md?
Does it support remote deployment when OpenClaw and Search Stack are on different machines?
What should I do if Chrome rendering does not support proxies with authentication?

Related Resources

GitHub Repository
Search Stack project source code and latest documentation
OpenClaw Official Website
Official website of the OpenClaw AI platform
Tavily API
Tavily search engine API service
Serper API
Serper (Google) search engine API
SearXNG Documentation
Official documentation of the SearXNG meta-search engine
TikHub API
TikHub social media API platform
Model Context Protocol
Official specification of the MCP protocol

Installation

Copy the following configuration into your MCP client:
{
  "mcpServers": {
    "search-stack": {
      "command": "/home/your_user/.bun/bin/bun",
      "args": ["run", "/opt/search-stack/proxy/mcp-server.ts"],
      "keepAlive": true,
      "env": {
        "SEARCH_STACK_URL": "http://127.0.0.1:17080",
        "SEARCH_STACK_API_KEY": "your_proxy_api_key",
        "TIKHUB_API_KEY": "your_tikhub_key"
      }
    }
  }
}
Note: Your key is sensitive information; do not share it with anyone.
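Before registering the configuration, it can help to confirm that no template placeholder was left in place. The sketch below scans the config shown above for the `your_` prefix and the `/your_user/` path segment; the function name and the idea of treating those strings as placeholders are assumptions of this example.

```python
import json
from typing import List


def find_unfilled_placeholders(config_text: str, server: str = "search-stack") -> List[str]:
    """List the fields of one MCP server entry that still hold template values."""
    entry = json.loads(config_text)["mcpServers"][server]
    issues = []
    if "/your_user/" in entry.get("command", ""):
        issues.append("command")
    for name, value in entry.get("env", {}).items():
        if value.startswith("your_"):
            issues.append(name)
    return issues


SAMPLE_CONFIG = """
{
  "mcpServers": {
    "search-stack": {
      "command": "/home/your_user/.bun/bin/bun",
      "args": ["run", "/opt/search-stack/proxy/mcp-server.ts"],
      "keepAlive": true,
      "env": {
        "SEARCH_STACK_URL": "http://127.0.0.1:17080",
        "SEARCH_STACK_API_KEY": "your_proxy_api_key",
        "TIKHUB_API_KEY": "your_tikhub_key"
      }
    }
  }
}
"""

unfilled = find_unfilled_placeholders(SAMPLE_CONFIG)
```

An empty list means every placeholder in that entry has been replaced with a real value.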

Alternatives

Airweave
Airweave is an open-source context retrieval layer for AI agents and RAG systems. It connects and synchronizes data from various applications, tools, and databases, and provides relevant, real-time, multi-source contextual information to AI agents through a unified search interface.
Python
6.2K
5 points
Vestige
Vestige is an AI memory engine based on cognitive science. By implementing 29 neuroscience modules such as prediction error gating, FSRS-6 spaced repetition, and memory dreaming, it provides long-term memory capabilities for AI. It includes a 3D visualization dashboard and 21 MCP tools, runs completely locally, and does not require the cloud.
Rust
4.9K
4.5 points
Moltbrain
MoltBrain is a long-term memory layer plugin designed for OpenClaw, MoltBook, and Claude Code, capable of automatically learning and recalling project context, providing intelligent search, observation recording, analysis statistics, and persistent storage functions.
TypeScript
5.5K
4.5 points
Bm.md
A feature-rich Markdown typesetting tool that supports multiple style themes and platform adaptation, providing real-time editing preview, image export, and API integration capabilities
TypeScript
3.9K
5 points
Security Detections MCP
Security Detections MCP is a server based on the Model Context Protocol that allows LLMs to query a unified security detection rule database covering Sigma, Splunk ESCU, Elastic, and KQL formats. The latest version 3.0 is upgraded to an autonomous detection engineering platform that can automatically extract TTPs from threat intelligence, analyze coverage gaps, generate SIEM-native format detection rules, run tests, and verify. The project includes over 71 tools, 11 pre-built workflow prompts, and a knowledge graph system, supporting multiple SIEM platforms.
TypeScript
6.3K
4 points
Paperbanana
Python
7.5K
5 points
Better Icons
An MCP server and CLI tool that provides search and retrieval of over 200,000 icons, supports more than 150 icon libraries, and helps AI assistants and developers quickly obtain and use icons.
TypeScript
6.1K
4.5 points
Assistant Ui
assistant-ui is an open-source TypeScript/React library for quickly building production-grade AI chat interfaces, providing composable UI components, streaming responses, accessibility, etc., and supporting multiple AI backends and models.
TypeScript
7.6K
5 points
Gitlab MCP Server
Certified
The GitLab MCP server is a project based on the Model Context Protocol that provides a comprehensive toolset for interacting with GitLab accounts, including code review, merge request management, CI/CD configuration, and other functions.
TypeScript
24.7K
4.3 points
Markdownify MCP
Markdownify is a multi-functional file conversion service that supports converting multiple formats such as PDFs, images, audio, and web page content into Markdown format.
TypeScript
34.6K
5 points
Duckduckgo MCP Server
Certified
The DuckDuckGo Search MCP Server provides web search and content scraping services for LLMs such as Claude.
Python
72.6K
4.3 points
Notion Api MCP
Certified
A Python-based MCP Server that provides advanced to-do list management and content organization functions through the Notion API, enabling seamless integration between AI models and Notion.
Python
20.5K
4.5 points
Figma Context MCP
Framelink Figma MCP Server is a server that provides access to Figma design data for AI programming tools (such as Cursor). By simplifying the Figma API response, it helps AI more accurately achieve one-click conversion from design to code.
TypeScript
63.8K
4.5 points
Unity
Certified
UnityMCP is a Unity editor plugin that implements the Model Context Protocol (MCP), providing seamless integration between Unity and AI assistants, including real-time state monitoring, remote command execution, and log functions.
C#
31.5K
5 points
Gmail MCP Server
A Gmail MCP server with automatic authentication, designed for Claude Desktop. It supports Gmail management through natural language interaction, including sending emails, label management, and batch operations.
TypeScript
21.1K
4.5 points
Minimax MCP Server
The official MiniMax Model Context Protocol (MCP) server supports interaction with text-to-speech and video/image generation APIs, and works with client tools such as Claude Desktop and Cursor.
Python
49.5K
4.8 points
AIBase
© 2026 AIBase