better-playwright-mcp
A better Playwright MCP (Model Context Protocol) server that uses a client-server architecture for browser automation.
Quick Start
To get started with better-playwright-mcp, you need to install it globally and then start both the MCP server and the HTTP server.
Installation
npm install -g better-playwright-mcp
Starting the Servers
- Start the HTTP server
npx better-playwright-mcp server
- In another terminal, start the MCP server
npx better-playwright-mcp
Features
- Up to 90% token reduction through semantic HTML snapshots
- Full Playwright browser automation via MCP
- Client-server architecture for better separation of concerns
- Stealth mode to avoid detection
- Hash-based element identifiers for precise targeting
- Persistent browser profiles
- Optimized for long-running automation tasks
- Token-aware output with automatic truncation
Usage Examples
Default Mode (MCP)
The MCP server requires an HTTP server to be running. You need to start both:
Step 1: Start the HTTP server
npx better-playwright-mcp server
Step 2: In another terminal, start the MCP server
npx better-playwright-mcp
The MCP server will:
- Start listening on stdio for MCP protocol messages
- Connect to the HTTP server on port 3102
- Route browser automation commands through the HTTP server
Options:
- --snapshot-dir <path> - Directory to save snapshots
Standalone HTTP Server Mode
You can also run the HTTP server independently (useful for debugging or custom integrations):
npx better-playwright-mcp server
Options:
- -p, --port <number> - Server port (default: 3102)
- --host <string> - Server host (default: localhost)
- --headless - Run browser in headless mode
- --chromium - Use Chromium instead of Chrome
- --no-user-profile - Do not use persistent user profile
- --user-data-dir <path> - User data directory
- --snapshot-dir <path> - Directory to save snapshots
Documentation
Why Better?
Traditional browser automation tools send entire page HTML to AI assistants, which quickly exhausts token limits and makes complex web interactions impractical. better-playwright-mcp solves this with an innovative semantic snapshot algorithm that reduces page content by up to 90% while preserving all meaningful elements.
The Problem
- Full page HTML often exceeds 100K tokens
- Most HTML is noise: inline styles, tracking scripts, invisible elements
- AI assistants have limited context windows, even at 200K tokens
- Complex web automation becomes impractical after just a few page loads
Our Solution: Semantic Snapshots
Our core innovation is a multi-stage pruning algorithm that:
- Identifies meaningful elements - Interactive elements (buttons, inputs), semantic HTML5 tags, and text-containing elements
- Generates unique identifiers - Each element gets a hash-based xp attribute derived from its XPath for precise targeting
- Removes invisible content - Elements with display:none, zero dimensions, or hidden parents are marked and removed
- Unwraps useless wrappers - Eliminates divs and spans that only wrap other elements
- Strips unnecessary attributes - Keeps only essential attributes like href, value, placeholder
Result: A clean, semantic representation that can use as little as 10% of the original tokens while keeping every element needed for interaction.
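The pruning stages listed above can be sketched on a toy tree. This is a minimal illustration, not the project's extractor: the SnapNode type, the prune function, and the two constant lists are hypothetical names, and the real implementation works on parsed HTML rather than a hand-built object.

```typescript
// Hypothetical in-memory node model; the real extractor parses actual HTML.
interface SnapNode {
  tag: string;
  attrs: { [name: string]: string };
  text: string;       // the element's own text, trimmed
  hidden: boolean;    // e.g. display:none or zero dimensions
  children: SnapNode[];
}

// Attributes named in this README as worth keeping (assumed not exhaustive).
const KEPT_ATTRS = ["href", "value", "placeholder"];
// Tags treated as disposable wrappers when they hold no text of their own.
const WRAPPER_TAGS = ["div", "span"];

function prune(node: SnapNode): SnapNode[] {
  // Hidden elements are dropped together with their whole subtree.
  if (node.hidden) return [];

  // Prune children first so nested wrappers collapse from the inside out.
  const children: SnapNode[] = [];
  for (const child of node.children) {
    for (const kept of prune(child)) children.push(kept);
  }

  // A textless div/span is replaced by its surviving children.
  if (WRAPPER_TAGS.indexOf(node.tag) !== -1 && node.text === "") {
    return children;
  }

  // Keep only the allow-listed attributes.
  const attrs: { [name: string]: string } = {};
  for (const name of Object.keys(node.attrs)) {
    if (KEPT_ATTRS.indexOf(name) !== -1) attrs[name] = node.attrs[name];
  }
  return [{ tag: node.tag, attrs, text: node.text, hidden: false, children }];
}
```

Returning an array instead of a single node lets a wrapper disappear without special-casing the root: a wrapper simply returns its children, and a hidden node returns nothing.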
Architecture
This project implements a unique two-tier architecture:
- MCP Server - Communicates with AI assistants via Model Context Protocol
- HTTP Server - Runs in the background to control the actual browser instances
AI Assistant <--[MCP Protocol]--> MCP Server <--[HTTP]--> HTTP Server <---> Browser
This design allows the MCP server to remain lightweight while delegating browser control to a dedicated HTTP service.
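To make the hand-off between the two tiers concrete, the sketch below shows how an MCP tool call could be turned into an HTTP request for the browser-controlling server. The route shape (/tools/<name>), the ToolCall and HttpRequestSpec types, and the argument names are assumptions for illustration; only the default host and port (localhost, 3102) come from this README.

```typescript
// Hypothetical shapes; the project's real wire format is not documented here.
interface ToolCall {
  name: string;
  args: { [key: string]: unknown };
}

interface HttpRequestSpec {
  url: string;
  method: "POST";
  headers: { [name: string]: string };
  body: string;
}

// Translate a tool call into a request description for the HTTP server.
function toHttpRequest(
  call: ToolCall,
  host: string = "localhost",
  port: number = 3102
): HttpRequestSpec {
  return {
    // The route is an assumed convention, not the project's actual endpoint.
    url: `http://${host}:${port}/tools/${encodeURIComponent(call.name)}`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(call.args),
  };
}
```

Keeping this translation a pure function means the MCP side holds no browser state; everything stateful lives behind the HTTP boundary.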
MCP Tools
When used with AI assistants, the following tools are available:
Page Management
- createPage - Create a new browser page with name and description
- activatePage - Activate a specific page by ID
- closePage - Close a specific page
- listPages - List all managed pages with titles and URLs
- closeAllPages - Close all managed pages
- listPagesWithoutId - List unmanaged browser pages
- closePagesWithoutId - Close all unmanaged pages
- closePageByIndex - Close page by index
Browser Actions
- browserClick - Click an element using its xp identifier
- browserType - Type text into an element
- browserHover - Hover over an element
- browserSelectOption - Select options in a dropdown
- browserPressKey - Press keyboard keys
- browserFileUpload - Upload files to file input
- browserHandleDialog - Handle browser dialogs (alert, confirm, prompt)
- browserNavigate - Navigate to a URL
- browserNavigateBack - Go back to previous page
- browserNavigateForward - Go forward to next page
- scrollToBottom - Scroll to bottom of page/element
- scrollToTop - Scroll to top of page/element
- waitForTimeout - Wait for specified milliseconds
- waitForSelector - Wait for element to appear
Snapshot & Utilities
- getPageSnapshot - Get semantic HTML snapshot with xp identifiers
- getScreenshot - Take a screenshot (PNG/JPEG)
- getPDFSnapshot - Generate PDF of the page
- getElementHTML - Get HTML of specific element
- downloadImage - Download image from URL
- captureSnapshot - Capture full page with automatic scrolling
How It Works
Semantic Snapshots in Action
Before (original HTML):
<div class="wrapper" style="padding: 20px; margin: 10px;">
<div class="container">
<div class="inner">
<button class="btn btn-primary" onclick="handleClick()"
style="background: blue; color: white;">
Click me
</button>
</div>
</div>
</div>
After (semantic snapshot):
button xp=3fa2b8c1 Click me
The algorithm:
- Removes unnecessary wrapper divs
- Strips inline styles and event handlers
- Adds unique identifier (xp attribute) - a hash of the element's XPath
- Preserves only meaningful content
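An identifier like the one in the snapshot line above can be derived by hashing the element's XPath and printing the result as eight hex digits. The README does not say which hash the project uses, so the sketch below assumes 32-bit FNV-1a purely for illustration; fnv1a32, xpId, and renderLine are hypothetical names.

```typescript
// 32-bit FNV-1a over UTF-16 code units (identical to byte-wise FNV-1a for ASCII).
// The choice of hash is an assumption; the project may use a different one.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts to stay within 32 bits.
    hash =
      (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return hash >>> 0;
}

// Turn an XPath into a fixed-width, 8-character hex identifier.
function xpId(xpath: string): string {
  const hex = fnv1a32(xpath).toString(16);
  return ("00000000" + hex).slice(-8);
}

// Render one snapshot line in the "tag xp=<id> text" shape shown above.
function renderLine(tag: string, xpath: string, text: string): string {
  return `${tag} xp=${xpId(xpath)} ${text}`;
}
```

Because the identifier depends only on the XPath, the same element keeps the same xp value across snapshots as long as its position in the document does not change.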
Diff-Based Optimization
To reduce data transfer and token usage:
- First snapshot is always complete
- Subsequent snapshots only include changes (diffs)
- Automatic caching for performance
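The full-then-diff behaviour can be sketched with a small per-page cache. This is a simplified line-set comparison under assumed names (SnapshotDiff, diffSnapshots, SnapshotCache); the README does not specify the project's diff format or caching strategy.

```typescript
// Hypothetical diff shape: lines that appeared and lines that disappeared.
interface SnapshotDiff {
  added: string[];
  removed: string[];
}

// Compare two snapshots line by line, ignoring line order.
function diffSnapshots(prev: string, next: string): SnapshotDiff {
  const prevLines = prev.split("\n");
  const nextLines = next.split("\n");
  return {
    added: nextLines.filter((line) => prevLines.indexOf(line) === -1),
    removed: prevLines.filter((line) => nextLines.indexOf(line) === -1),
  };
}

// Remembers the last snapshot per page so later calls can return only changes.
class SnapshotCache {
  private last: { [pageId: string]: string } = {};

  // Returns the complete snapshot on first sight of a page, a diff afterwards.
  next(pageId: string, snapshot: string): string | SnapshotDiff {
    const seen = Object.prototype.hasOwnProperty.call(this.last, pageId);
    const prev = this.last[pageId];
    this.last[pageId] = snapshot;
    return seen ? diffSnapshots(prev, snapshot) : snapshot;
  }
}
```

A set-style diff like this loses ordering information, which is acceptable when each line carries its own xp identifier; an order-sensitive diff would be needed otherwise.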
Stealth Features
Browser instances are configured with:
- Custom user agent strings
- Disabled automation indicators
- WebGL vendor spoofing
- Canvas fingerprint protection
Technical Details
Development
Prerequisites
- Node.js >= 18.0.0
- TypeScript
- Chrome or Chromium browser
Building from Source
# Clone the repository
git clone https://github.com/yourusername/better-playwright-mcp.git
cd better-playwright-mcp
# Install dependencies
npm install
# Build the project
npm run build
# Run in development mode
npm run dev
Project Structure
better-playwright-mcp/
├── src/
│   ├── index.ts                     # MCP mode entry point
│   ├── server.ts                    # HTTP server mode entry point
│   ├── playwright-mcp.ts            # MCP server implementation
│   ├── client/
│   │   └── playwright-client.ts     # HTTP client for MCP→HTTP communication
│   ├── server/
│   │   └── playwright-server.ts     # HTTP server controlling browsers
│   ├── extractor/
│   │   ├── parse2.ts                # HTML parsing with xp identifier generation
│   │   ├── simplify-html.ts         # HTML simplification
│   │   └── utils.ts                 # Extraction utilities
│   └── utils/
│       └── token-limiter.ts         # Token counting and limiting
├── bin/
│   └── cli.js                       # CLI entry point
├── package.json
├── tsconfig.json
├── CLAUDE.md                        # Instructions for AI assistants
└── README.md
Troubleshooting
Common Issues
- MCP server not connecting
  - Ensure the HTTP server is accessible on port 3102
  - Check firewall settings
  - Try running with DEBUG=* npx better-playwright-mcp
- Browser not launching
  - Ensure Chrome or Chromium is installed
  - Try using the --chromium flag
  - Check system resources
- Token limit exceeded
  - Snapshots are automatically truncated to 20,000 tokens
  - Use targeted selectors to reduce snapshot size
  - Consider using a screenshot instead of a snapshot for visual inspection
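The 20,000-token truncation mentioned in the list above could work roughly as follows. This is a sketch under stated assumptions: the four-characters-per-token estimate, the function names, and the trailing marker text are all hypothetical, and the project's token-limiter may count tokens differently.

```typescript
const MAX_TOKENS = 20000; // limit stated in this README

// Rough heuristic: about 4 characters per token. The real counter may differ.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep whole lines until the budget is spent, then append a marker line.
function truncateToTokens(text: string, maxTokens: number = MAX_TOKENS): string {
  if (estimateTokens(text) <= maxTokens) return text;

  const kept: string[] = [];
  let used = 0;
  for (const line of text.split("\n")) {
    const cost = estimateTokens(line + "\n");
    if (used + cost > maxTokens) break;
    kept.push(line);
    used += cost;
  }
  // The marker is not counted against the budget in this sketch.
  kept.push("[snapshot truncated]");
  return kept.join("\n");
}
```

Cutting on line boundaries keeps every remaining element line intact, so each surviving xp identifier is still usable by the assistant.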
Debug Mode
Enable detailed logging:
DEBUG=* npx better-playwright-mcp
Logs and Records
Operation records are saved to:
- macOS/Linux: /tmp/playwright-records/
- Windows: %TEMP%\playwright-records\
Each page has its own directory with timestamped operation logs.
License
This project is licensed under the MIT license.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.









