Web-LLM-MCP Server: Local LLM Inference & Multi - function MCP Tool 解释：核心关键词为“Web-LLM-MCP Server”和“local LLM inference” ，按照英文习惯组织语言，简洁表达了标题核心内容，长度也符合要求。

Web Llm MCP Server

A local LLM inference MCP server based on Playwright and Web-LLM, realizing text generation, chat interaction, and model management functions through browser automation.

Artificial intelligence chatbots Developer tools #Browser LLM #Model Management #Text Generation #Automated Interaction .TypeScript

rating : 2 points

downloads : 5.1K

update time : 2025-07-24

Open Site

What is the Web-LLM MCP Server?

This is a service that runs a local large language model (LLM) in the browser through Playwright, allowing users to perform operations such as text generation and chat interactions via the web interface. It utilizes the @mlc-ai/web-llm library to achieve efficient inference.

How to use the Web-LLM MCP Server?

By starting the server and calling the provided tool interfaces, users can easily interact with the LLM. Functions include text generation, chat conversations, model switching, etc., which are suitable for development and testing scenarios.

Applicable Scenarios

Suitable for developers, researchers who need to run local LLMs in the browser, and application users who want to quickly test different models.

Main Features

Browser-side LLM Inference

Run a local LLM in the browser and complete text generation tasks without relying on external APIs.

Playwright Integration

Control the browser through the automation tool Playwright to achieve seamless interaction with the LLM.

Multi-model Support

Support multiple pre-trained models, such as Llama, Phi, Gemma, etc., and switch between them at any time.

Real-time Chat Interaction

Provide a chat interface to support multi-round conversations with the LLM.

Status Monitoring and Screenshots

View the current status information of the LLM and take interface screenshots for debugging.

Advantages

Run a local LLM directly in the browser without an internet connection

Support multiple models for easy testing of different effects

Easy to integrate into existing applications

Provide an intuitive user interface and interaction method

Limitations

Downloading model files for the first time may take a long time

Model switching requires re-initialization, increasing waiting time

Have certain requirements for hardware resources

Not suitable for large-scale deployment scenarios

How to Use

Install Dependencies

First, install the dependency packages required for the project.

Install the Browser

Ensure that the Chromium browser is installed.

Start the Server

Run the main program to start the MCP server.

Use the Tools

Interact with the LLM by calling the provided tool functions.

Usage Examples

Generate an Article

The user wants to generate an article about the development of artificial intelligence.

Multi-round Conversations

The user wants to have multi-round conversations with the AI to discuss a certain topic.

Model Switching

The user wants to try the effects of different models.

Frequently Asked Questions

Why is the first run so slow?

Does it support custom models?

Can it run in non-headless mode?

How to get the help documentation?

Related Resources

Official Documentation

Learn detailed information and usage methods of @mlc-ai/web-llm.

GitHub Repository

Get the source code and the latest version updates.

Tutorial Video

Watch the video tutorial to learn how to use the server.

🚀 Web-LLM MCP Server

An MCP server that uses Playwright to load and interact with an HTML page containing @mlc-ai/web-llm for local LLM inference.

🚀 Quick Start

To start the MCP server, run the following command:

node index.js

Or you can run the test:

node test.js

✨ Features

Browser-based LLM: Employs @mlc-ai/web-llm running within a Chromium browser instance.
Playwright Integration: Automates browser interactions to ensure seamless LLM operations.
Multiple Tools: Enables text generation, chatting, status checking, model changing, and screenshot taking.
Model Management: Supports various Web-LLM models with dynamic switching capabilities.