Local Wikipedia

Local-Wikipedia is an MCP server that makes Wikipedia available offline after a one-time download. It supports full-text search and offline reading, is designed for small local LLMs, and offers a fast, low-memory search experience.

What is Local-Wikipedia?

Local-Wikipedia is a Model Context Protocol (MCP) server that allows you to download the entire Wikipedia to your local computer. After the download is complete, you can search for and read Wikipedia articles without an internet connection. It is especially suitable for use with small local language models (LLMs), providing a reliable offline knowledge base for AI assistants.

How to use Local-Wikipedia?

Getting started is simple: start the server with a single Docker Compose command. It automatically downloads the Wikipedia data for the configured language and builds a full-text index. After that, your AI assistant (such as Claude Desktop) can connect to it over the MCP protocol and use the search and random-article tools. No complex configuration is required.

Applicable scenarios

1. Offline research: look up information in environments without network access.
2. Privacy protection: avoid sending sensitive queries to online APIs.
3. High-frequency search: query repeatedly without being limited by API rate limits.
4. Small-LLM enhancement: provide knowledge support for small models running locally.
5. Education: provide stable reference material in classrooms or laboratories.

Main features

Intelligent full-text search
It supports not only exact title matching but also full-text content search, so you can find relevant articles even if you don't remember the exact title.
Fully offline use
The data is permanently stored locally after a one-time download. You can access all functions without an internet connection.
Intelligent query correction
Automatically cleans up redundant or malformed query terms that an AI assistant may generate, improving search accuracy.
Random article reading
Fetches a random article from the configured language's Wikipedia, useful for exploratory learning or testing.
Multi-language support
It supports multiple language versions of Wikipedia, including Chinese, English, Japanese, etc.
Efficient indexing technology
Uses PGroonga for full-text indexing, giving fast search responses with low memory usage (see the query sketch below).
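To make the indexing feature concrete, here is a minimal sketch of how a PGroonga-backed full-text query typically looks. The table and column names (`articles`, `title`, `content`) are illustrative assumptions, not Local-Wikipedia's actual schema.

```sql
-- Minimal PGroonga sketch; table and column names are assumed, not the project's real schema.
CREATE EXTENSION IF NOT EXISTS pgroonga;

CREATE TABLE articles (
    id      serial PRIMARY KEY,
    title   text NOT NULL,
    content text NOT NULL
);

-- A PGroonga index enables fast, low-memory full-text search over the article body.
CREATE INDEX idx_articles_content ON articles USING pgroonga (content);

-- The &@~ operator runs a full-text search using PGroonga's query syntax.
SELECT title
FROM articles
WHERE content &@~ 'model context protocol'
LIMIT 10;
```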
Advantages
🔍 Real full-text search ability, independent of external APIs
📶 Works completely offline, no network needed after download
⚡ No API rate limits, supports high-frequency searches
🤖 Optimized for small LLMs, with concise and efficient queries
🔧 Easy to extend with new functions, and all data is stored locally
🌐 Supports multiple language versions of Wikipedia
Limitations
⏳ The initial download and indexing time is long (it may take several hours for the English version)
💾 Requires enough disk space to store the data
🔄 Data is not updated in real time; the dataset must be updated manually
🔒 The current version is not suitable for exposing directly as a public API service
⚙️ Requires a Docker environment to run

How to use

Environment preparation
Make sure your computer has Docker and Docker Compose installed. This is a prerequisite for running Local-Wikipedia.
Download the project
Clone the Local-Wikipedia project from GitHub to your local machine.
Configure the language
Modify the language settings in the config.yaml file as needed (the default is Japanese).
Start the service
Start the service using Docker Compose. The first time you run it, it will automatically download and index the Wikipedia data.
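As a concrete sketch of the clone/configure/start flow — the repository URL below is a placeholder and the exact config key may differ from the actual project:

```bash
# Clone the project (placeholder URL) and enter the directory.
git clone https://github.com/<owner>/local-wikipedia.git
cd local-wikipedia

# Optionally edit config.yaml first to choose the Wikipedia language (default: Japanese).

# Start the server; on the first run it downloads the dump and builds the index,
# which can take several hours for large languages such as English.
docker compose up -d

# Follow the logs to watch download and indexing progress.
docker compose logs -f
```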
Configure the AI assistant
Add the Local-Wikipedia server to the MCP configuration of your AI assistant (such as Claude Desktop).
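Claude Desktop reads MCP servers from the `mcpServers` section of its `claude_desktop_config.json` (on macOS typically under `~/Library/Application Support/Claude/`). The entry below is only a structural sketch; the actual launch command or endpoint depends on how Local-Wikipedia exposes its MCP server, so consult the project's README:

```json
{
  "mcpServers": {
    "local-wikipedia": {
      "command": "<command that launches the Local-Wikipedia MCP server over stdio>",
      "args": []
    }
  }
}
```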

Usage examples

Academic research assistance
Quickly look up definitions and the historical development of relevant concepts while writing a paper.
Offline learning tool
Conduct self-study in an environment without a network (such as on a plane or in a remote area).
Knowledge enhancement for AI assistants
Provide accurate fact-checking capabilities for small local language models.
Random knowledge exploration
Discover new topics through the random-article function and spark interest in learning.

Frequently Asked Questions

How long does it take to download the Wikipedia data?
How much disk space is required?
Will the data be updated regularly?
Which languages are supported?
Can it be publicly used in a production environment?
How to change the default port?

Related resources

GitHub project repository
The source code and the latest version of Local-Wikipedia
MCP protocol documentation
The official specification of the Model Context Protocol
Wikipedia dataset
The Markdown-formatted Wikipedia data used by Local-Wikipedia
Docker installation guide
The installation tutorial for Docker and Docker Compose
PGroonga documentation
The full-text search engine technology used by Local-Wikipedia
Wikipedia terms of use
The CC BY-SA 4.0 license for using Wikipedia content

Alternatives

Claude Context
Claude Context is an MCP plugin that provides in-depth context of the entire codebase for AI programming assistants through semantic code search. It supports multiple embedding models and vector databases to achieve efficient code retrieval.
TypeScript
5.8K
5 points
Acemcp
Acemcp is an MCP server for codebase indexing and semantic search, supporting automatic incremental indexing, multi-encoding file processing, .gitignore integration, and a Web management interface, helping developers quickly search for and understand code context.
Python
9.9K
5 points
MCP
Microsoft's official MCP server, which gives AI assistants search and access to the latest Microsoft technical documentation.
13.1K
5 points
Cipher
Cipher is an open-source memory layer framework designed for programming AI agents. It integrates with various IDEs and AI coding assistants through the MCP protocol, providing core functions such as automatic memory generation, team memory sharing, and dual-system memory management.
TypeScript
0
5 points
Annas MCP
An MCP server and CLI tool for Anna's Archive, used to search for and download documents on the platform, with access supported through an API key.
Go
10.6K
4.5 points
Search1api
Search1API MCP Server is a Model Context Protocol (MCP) server that provides search and crawling functions and supports multiple search services and tools.
TypeScript
15.5K
4 points
Duckduckgo MCP Server
Certified
The DuckDuckGo Search MCP Server provides web search and content scraping services for LLMs such as Claude.
Python
54.9K
4.3 points
Bing Search MCP
An MCP server for integrating Microsoft Bing Search API, supporting web page, news, and image search functions, providing network search capabilities for AI assistants.
Python
17.1K
4 points
Notion Api MCP
Certified
A Python-based MCP Server that provides advanced to-do list management and content organization functions through the Notion API, enabling seamless integration between AI models and Notion.
Python
17.5K
4.5 points
Gitlab MCP Server
Certified
The GitLab MCP server is a project based on the Model Context Protocol that provides a comprehensive toolset for interacting with GitLab accounts, including code review, merge request management, CI/CD configuration, and other functions.
TypeScript
17.5K
4.3 points
Markdownify MCP
Markdownify is a multi-functional file conversion service that supports converting multiple formats such as PDFs, images, audio, and web page content into Markdown format.
TypeScript
27.6K
5 points
Figma Context MCP
Framelink Figma MCP Server is a server that provides access to Figma design data for AI programming tools (such as Cursor). By simplifying the Figma API response, it helps AI more accurately achieve one-click conversion from design to code.
TypeScript
51.3K
4.5 points
Unity
Certified
UnityMCP is a Unity editor plugin that implements the Model Context Protocol (MCP), providing seamless integration between Unity and AI assistants, including real-time state monitoring, remote command execution, and log functions.
C#
24.3K
5 points
Gmail MCP Server
A Gmail automatic authentication MCP server designed for Claude Desktop, supporting Gmail management through natural language interaction, including complete functions such as sending emails, label management, and batch operations.
TypeScript
17.2K
4.5 points
Minimax MCP Server
MiniMax MCP is an official server that provides access to powerful text-to-speech and video/image generation APIs and works with client tools such as Claude Desktop and Cursor.
Python
36.7K
4.8 points