Ebook MCP

Ebook-MCP is an e-book processing server built on the Model Context Protocol (MCP). It supports EPUB and PDF files and provides book management, interactive reading, and learning assistance through natural-language interaction with e-books.

Last updated: 2025-04-29

What is Ebook-MCP?

Ebook-MCP is an intelligent server that allows users to interact with e-books through natural language. It supports a variety of functions, including chapter query, content summarization, and knowledge Q&A.

Main Features

Ebook-MCP is built on the Model Context Protocol, supports the parsing of multi-format e-books, and provides rich interactive functions.

File Format Support

Supports two common e-book formats: EPUB and PDF.

Chapter Query Function

Users can quickly locate content by specifying the chapter ID.

Content Summarization Service

Provides a brief summary of any chapter to help users quickly understand the core content.

Interactive Learning Mode

Lets users ask questions about an e-book's content and get answers, which helps consolidate what they have read.

🚀 E-book Content Processing Platform (ebook-MCP)

A toolkit for processing and analyzing e-book files in EPUB and PDF formats. It can extract metadata, read the table of contents, read chapter content, and convert content to Markdown. It also includes a server component that can be started from the command line.

✨ Features

This platform offers comprehensive functions for both EPUB and PDF file processing:

EPUB Processing Features

Retrieve all EPUB files in a specified directory.
Extract metadata information (e.g., title, author, publication date) from EPUB files.
Read the table of contents structure of EPUB files.
Obtain the content of specific chapters and convert it to Markdown format.
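
The directory scan in the first item can be illustrated with the standard library alone. This is a minimal sketch, not the project's implementation: `list_ebook_files` is a hypothetical helper, and it assumes a non-recursive scan that matches on file extension and returns full paths. The actual `get_all_epub_files` may behave differently on any of these points.

```python
from pathlib import Path
from typing import List


def list_ebook_files(directory: str, suffix: str = ".epub") -> List[str]:
    """Return sorted paths of files in `directory` whose extension matches `suffix`.

    Hypothetical helper for illustration only. Assumes a flat (non-recursive)
    scan and a case-insensitive extension match.
    """
    root = Path(directory)
    return sorted(
        str(path)
        for path in root.iterdir()
        if path.is_file() and path.suffix.lower() == suffix.lower()
    )
```

Passing `suffix=".pdf"` gives the equivalent scan for PDF files.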

PDF Processing Features

Retrieve all PDF files in a specified directory.
Extract metadata information from PDF files.
Read the table of contents structure of PDF files.
Obtain the content of specific pages (supporting plain text and Markdown formats).
Get the corresponding content and its page number range based on chapter titles.
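
One plausible way to derive a chapter's page range from a table of contents is to treat each chapter as running up to the page before the next entry starts. The sketch below assumes the TOC is a list of `(title, start_page)` tuples sorted by page, with 1-based page numbers; `chapter_page_range` and its `total_pages` parameter are hypothetical and not taken from the project.

```python
from typing import List, Optional, Tuple


def chapter_page_range(
    toc: List[Tuple[str, int]],
    title: str,
    total_pages: int,
) -> Optional[Tuple[int, int]]:
    """Return (first_page, last_page) for the chapter named `title`.

    Hypothetical helper. Assumes `toc` is sorted by start page and that
    pages are 1-based. Returns None when the title is not in the TOC.
    """
    for index, (entry_title, start_page) in enumerate(toc):
        if entry_title != title:
            continue
        if index + 1 < len(toc):
            # The chapter ends on the page before the next entry begins;
            # max() covers two entries that start on the same page.
            end_page = max(start_page, toc[index + 1][1] - 1)
        else:
            # The last chapter runs to the end of the document.
            end_page = total_pages
        return start_page, end_page
    return None


if __name__ == "__main__":
    sample_toc = [("Chapter 1", 1), ("Chapter 2", 10), ("Chapter 3", 25)]
    print(chapter_page_range(sample_toc, "Chapter 2", total_pages=40))
```

Exact title matching is the simplest choice here; a real implementation might need to normalize whitespace or case.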

📦 Installation

Key Dependencies

ebooklib: A library for processing EPUB files.
PyPDF2: A basic PDF processing library.
PyMuPDF: An advanced PDF processing library.
beautifulsoup4: An HTML parsing tool.
html2text: A tool for converting HTML to Markdown format.
pydantic: A data validation framework.
fastmcp: An MCP server-side framework.
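
Before starting the server, it can be useful to confirm that these packages are importable. The following sketch uses only the standard library; `check_dependencies` is a hypothetical helper, and the mapping reflects the usual import names for each package (for example, PyMuPDF is imported as `fitz` and beautifulsoup4 as `bs4`).

```python
from importlib.util import find_spec
from typing import Dict

# Package name as listed above -> module name used in `import` statements.
IMPORT_NAMES: Dict[str, str] = {
    "ebooklib": "ebooklib",
    "PyPDF2": "PyPDF2",
    "PyMuPDF": "fitz",
    "beautifulsoup4": "bs4",
    "html2text": "html2text",
    "pydantic": "pydantic",
    "fastmcp": "fastmcp",
}


def check_dependencies() -> Dict[str, bool]:
    """Map each package name to True if its module can be located."""
    status: Dict[str, bool] = {}
    for package, module in IMPORT_NAMES.items():
        try:
            status[package] = find_spec(module) is not None
        except (ImportError, ValueError):
            # find_spec can raise for oddly configured modules; treat as missing.
            status[package] = False
    return status


if __name__ == "__main__":
    missing = [name for name, found in check_dependencies().items() if not found]
    print("Missing packages:", ", ".join(missing) if missing else "none")
```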

💻 Usage Examples

Basic Usage

EPUB Processing Example

# Get all EPUB files in a specified directory
epub_files = get_all_epub_files("/path/to/books")

# Extract metadata from a single EPUB file
metadata = get_metadata("/path/to/book.epub")

# Read the table of contents structure of an EPUB file
toc = get_toc("/path/to/book.epub")

# Get the content of a specific chapter (in Markdown format)
chapter_content = get_chapter_markdown("/path/to/book.epub", "chapter_id")

PDF Processing Example

# Get all PDF files in a specified directory
pdf_files = get_all_pdf_files("/path/to/books")

# Extract metadata from a single PDF file
metadata = get_pdf_metadata("/path/to/book.pdf")

# Read the table of contents structure of a PDF file
toc = get_pdf_toc("/path/to/book.pdf")

# Get the content of a specific page (in plain text format)
page_text = get_pdf_page_text("/path/to/book.pdf", 1)

# Get the content of a specific page (in Markdown format)
page_markdown = get_pdf_page_markdown("/path/to/book.pdf", 1)

# Get the corresponding content and its page number range based on a chapter title
chapter_content, page_numbers = get_pdf_chapter_content("/path/to/book.pdf", "Chapter 1")

📚 Documentation

API Reference

EPUB APIs

get_all_epub_files(path: str) -> List[str]: Retrieve all EPUB file paths in a specified directory.
get_metadata(epub_path: str) -> Dict[str, Union[str, List[str]]]: Extract metadata information from a specified EPUB file.
get_toc(epub_path: str) -> List[Tuple[str, str]]: Obtain the table of contents structure of a specified EPUB file, returning chapter titles and their corresponding IDs.
get_chapter_markdown(epub_path: str, chapter_id: str) -> str: Get the content of a specific chapter based on its ID and convert it to Markdown format.
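
Because `get_toc` returns titles paired with chapter IDs and `get_chapter_markdown` expects an ID, a caller typically needs to map a title to its ID first. The sketch below shows one way to do that on data shaped like the documented return type. `find_chapter_id` and the sample IDs are hypothetical, and the case-insensitive match is an assumption made for convenience.

```python
from typing import List, Optional, Tuple


def find_chapter_id(toc: List[Tuple[str, str]], title: str) -> Optional[str]:
    """Return the chapter ID whose title matches `title`, ignoring case.

    Hypothetical helper operating on (title, chapter_id) tuples, the shape
    documented for get_toc. Returns None when no title matches.
    """
    wanted = title.strip().casefold()
    for entry_title, chapter_id in toc:
        if entry_title.strip().casefold() == wanted:
            return chapter_id
    return None


if __name__ == "__main__":
    # Sample data only; real IDs depend on the EPUB file.
    sample_toc = [
        ("Preface", "preface.xhtml"),
        ("Chapter 1", "ch01.xhtml"),
        ("Chapter 2", "ch02.xhtml"),
    ]
    print(find_chapter_id(sample_toc, "chapter 1"))
```

The returned ID could then be passed as the `chapter_id` argument of `get_chapter_markdown`.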

PDF APIs

get_all_pdf_files(path: str) -> List[str]: Retrieve all PDF file paths in a specified directory.
get_pdf_metadata(pdf_path: str) -> Dict[str, Union[str, List[str]]]: Extract metadata information from a specified PDF file.
get_pdf_toc(pdf_path: str) -> List[Tuple[str, int]]: Obtain the table of contents structure of a specified PDF file, returning chapter titles and their corresponding page positions.
get_pdf_page_text(pdf_path: str, page_number: int) -> str: Return the content of the specified page as plain text.
get_pdf_page_markdown(pdf_path: str, page_number: int) -> str: Return the content of the specified page as Markdown.