MCP and HTTP Server Wrapper for PyAutoGUI
This project is an MCP and HTTP server wrapper for PyAutoGUI, which enables LLMs to control your mouse and keyboard.
Quick Start
Prerequisites
- Python >= 3.12
- uv package manager (recommended)
Installation
- Clone the repository:
git clone https://github.com/stonehill-2345/mcp-autogui-multinode.git
cd mcp-autogui-multinode
- Install dependencies based on your deployment scenario:
Local Full Development
For local development with all features (GUI control + testing):
uv sync --group gui --group dev
Deploy MCP Server Only
For deploying MCP server that connects to remote tool service (no GUI dependencies needed):
uv sync --no-group gui
Deploy Tool Service Only
For deploying HTTP tool service that performs actual computer control (requires GUI):
uv sync --group gui
Running the Service
The service supports two independent servers:
1. Run Tool Service (HTTP API)
Starts the HTTP API server for computer control:
uv run python tool.py
2. Run MCP Server
Starts the MCP server that can connect to remote tool services. The server supports two transport modes.
HTTP Transport Mode:
uv run python mcp_local.py http
stdio Transport Mode (default):
uv run python mcp_local.py stdio
After starting, you can access:
- HTTP API Documentation: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
- MCP Endpoint: http://localhost:8001/mcp (if using HTTP transport)
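To confirm the tool service is up before sending commands, a small polling helper can query the health check URL listed above. This is a minimal sketch using only the standard library; the function names, retry count, and delay are illustrative assumptions, and it assumes the health endpoint answers with HTTP 200 when the service is ready.

```python
import time
import urllib.error
import urllib.request


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response
        return False


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 0.5, probe=http_ok) -> bool:
    """Poll `url` until `probe` succeeds or the attempts run out."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False


# Example (requires the tool service to be running):
# wait_until_healthy("http://localhost:8000/health")
```

The `probe` parameter exists only so the retry loop can be exercised without a running server.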
Features
- Dual Protocol Support: HTTP REST API and MCP (Model Context Protocol)
- API Key Authentication: Optional API key authentication for service-to-service communication
- Multiple MCP Transports: Support for both HTTP and stdio (Standard Input/Output) transport modes
- Mouse Control: Move, click, drag, scroll operations
- Keyboard Control: Press keys, type text, key combinations
- Screenshot: Capture screen and get base64-encoded images
- Screen Info: Get cursor position and screen resolution
- Configuration Management: Pydantic Settings with environment variable support
- Auto Documentation: Swagger UI for HTTP API
- Flexible Deployment: Run HTTP server or MCP server independently
- Request Tracing: Request ID middleware for request tracking
- Structured Logging: Loguru-based logging with request ID integration
- Remote MCP Support: Optional HTTP client for remote tool server integration
Documentation
Architecture
The service supports two deployment architectures:
Architecture 1: LLM -> MCP -> TOOL (Remote Tool Service)
This architecture separates the MCP server from the tool service, allowing the MCP server to connect to a remote tool service via HTTP.
graph LR
LLM[LLM Client] -->|MCP Protocol| MCP[MCP Server<br/>main.py<br/>Client-based Tools]
MCP -->|HTTP API<br/>with API Key| TOOLA[Tool Service<br/>tool.py<br/>HTTP API Server]
MCP -->|HTTP API<br/>with API Key| TOOLB[Tool Service<br/>tool.py<br/>HTTP API Server]
TOOLA -->|PyAutoGUI| COMPUTERA[Computer Control]
TOOLB -->|PyAutoGUI| COMPUTERB[Computer Control]
style LLM fill:#e1f5ff
style MCP fill:#fff4e1
style TOOLA fill:#ffe1f5
style COMPUTERA fill:#e1ffe1
style COMPUTERB fill:#e1ffe1
Characteristics:
- MCP server uses client-based tools (register_computer_tools_with_client)
- MCP server forwards requests to remote tool service via HTTP
- Tool service performs actual computer control operations
- Suitable for distributed deployments where MCP server and tool service run on different machines
- Requires endpoint parameter in MCP tool calls
Architecture 2: LLM -> MCP (Direct Tools)
This architecture uses direct tools where the MCP server directly performs computer control operations.
graph LR
LLM[LLM Client] -->|MCP Protocol<br/>stdio/http| MCP[MCP Server<br/>mcp_local.py<br/>Direct Tools]
MCP -->|PyAutoGUI| COMPUTER[Computer Control]
style LLM fill:#e1f5ff
style MCP fill:#fff4e1
style COMPUTER fill:#e1ffe1
Characteristics:
- MCP server uses direct tools (register_computer_tools)
- MCP server directly executes computer control operations
- No separate tool service required
- Suitable for local deployments where everything runs on the same machine
- No endpoint parameter needed in MCP tool calls
Usage Examples
Example API Usage
Move Mouse
curl -X POST "http://localhost:8000/api/computer/MoveMouse" \
-H "Content-Type: application/json" \
-H "X-API-Key: your-secret-api-key-here" \
-d '{"x": 100, "y": 200}'
Click Mouse
curl -X POST "http://localhost:8000/api/computer/ClickMouse" \
-H "Content-Type: application/json" \
-H "X-API-Key: your-secret-api-key-here" \
-d '{"x": 100, "y": 200, "button": "left"}'
Take Screenshot
curl -X POST "http://localhost:8000/api/computer/TakeScreenshot" \
-H "Content-Type: application/json" \
-H "X-API-Key: your-secret-api-key-here" \
-d '{}'
Get Cursor Position
curl -X POST "http://localhost:8000/api/computer/GetCursorPosition" \
-H "Content-Type: application/json" \
-H "X-API-Key: your-secret-api-key-here" \
-d '{}'
Note: If API_KEY_ENABLED=false, the X-API-Key header is optional. If API_KEY_ENABLED=true, the header is required for all requests except health checks and documentation endpoints.
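The curl calls above translate directly to Python. The sketch below builds the same POST request with the standard library; the helper names are illustrative, and it assumes the service replies with a JSON body, which the source does not spell out.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # tool service address from Quick Start


def build_action_request(action, params=None, api_key=None, base_url=BASE_URL):
    """Build a POST request for /api/computer/{action} without sending it."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when API_KEY_ENABLED=true on the server
        headers["X-API-Key"] = api_key
    body = json.dumps(params or {}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/computer/{action}",
        data=body,
        headers=headers,
        method="POST",
    )


def call_action(action, params=None, api_key=None):
    """Send the request and decode the reply (assumed to be JSON)."""
    request = build_action_request(action, params, api_key)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (requires a running tool service):
# call_action("MoveMouse", {"x": 100, "y": 200}, api_key="your-secret-api-key-here")
```

Keeping request construction separate from sending makes it easy to inspect the URL, headers, and payload before anything moves the mouse.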
MCP Client Usage with API Key
import asyncio

from fastmcp import Client
from fastmcp.client.transports import StreamableHttpTransport

# Create transport with API key
transport = StreamableHttpTransport(
    url="http://localhost:8001/mcp",
    headers={"X-API-Key": "your-secret-api-key-here"},
)

# Create client with transport
client = Client(transport)

async def main():
    # "async with" is only valid inside a coroutine
    async with client:
        response = await client.call_tool("move_mouse", {"x": 100, "y": 200})
        print(response)

asyncio.run(main())
API Endpoints
Base Endpoints
- GET / - Root path, returns API information
- GET /health - Health check endpoint
Computer Control Endpoints
All computer control actions are available at:
- POST /api/computer/{action} - Execute a computer control action
- GET /api/computer/actions - List all available actions
Available Actions
| Action | Description | Parameters |
|---|---|---|
| MoveMouse | Move mouse cursor | x, y (coordinates) |
| ClickMouse | Click mouse button | x, y, button, press, release |
| PressMouse | Press mouse button (hold) | x, y, button |
| ReleaseMouse | Release mouse button | x, y, button |
| DragMouse | Drag mouse from source to target | source_x, source_y, target_x, target_y |
| Scroll | Scroll mouse wheel | scroll_direction, scroll_amount, x, y |
| PressKey | Press keyboard key(s) | key (e.g., "enter", "ctrl c") |
| TypeText | Type text (uses clipboard) | text |
| Wait | Wait for duration | duration (milliseconds) |
| TakeScreenshot | Capture screen | (no parameters) |
| GetCursorPosition | Get mouse position | (no parameters) |
| GetScreenSize | Get screen resolution | (no parameters) |
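TakeScreenshot returns the image as base64 text (see Features), so a client has to decode it before saving or displaying it. The following is a minimal sketch: the helper names are hypothetical, the response field that carries the string is not documented here and is left to the caller, and tolerating a data: URI prefix is an assumption rather than known behavior of the service.

```python
import base64
from pathlib import Path


def decode_screenshot(b64_image: str) -> bytes:
    """Decode a base64-encoded screenshot into raw image bytes."""
    # Assumption: tolerate an optional "data:image/...;base64," prefix
    if b64_image.startswith("data:") and "," in b64_image:
        b64_image = b64_image.split(",", 1)[1]
    return base64.b64decode(b64_image)


def save_screenshot(b64_image: str, path: str) -> Path:
    """Write the decoded screenshot to `path` and return the path."""
    target = Path(path)
    target.write_bytes(decode_screenshot(b64_image))
    return target
```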
MCP Tools
Available MCP Tools
All HTTP API actions are available as MCP tools. The MCP tool names use snake_case, while the HTTP API uses PascalCase:
- move_mouse - Move mouse cursor (HTTP: MoveMouse)
- click_mouse - Click mouse button (HTTP: ClickMouse)
- press_mouse - Press mouse button (HTTP: PressMouse)
- release_mouse - Release mouse button (HTTP: ReleaseMouse)
- drag_mouse - Drag mouse (HTTP: DragMouse)
- scroll - Scroll mouse wheel (HTTP: Scroll)
- press_key - Press keyboard key (HTTP: PressKey)
- type_text - Type text (HTTP: TypeText)
- wait - Wait for duration (HTTP: Wait)
- take_screenshot - Take screenshot (HTTP: TakeScreenshot)
- get_cursor_position - Get cursor position (HTTP: GetCursorPosition)
- get_screen_size - Get screen size (HTTP: GetScreenSize)
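The naming rule above is mechanical, so the two forms can be converted into each other. The helpers below are an illustrative sketch, not functions from this project:

```python
import re


def to_snake_case(action: str) -> str:
    """Convert an HTTP action name (PascalCase) to an MCP tool name."""
    # Insert "_" before every capital letter except the first, then lowercase
    return re.sub(r"(?<!^)(?=[A-Z])", "_", action).lower()


def to_pascal_case(tool: str) -> str:
    """Convert an MCP tool name (snake_case) to an HTTP action name."""
    return "".join(part.capitalize() for part in tool.split("_"))
```

For example, `to_snake_case("GetCursorPosition")` gives `"get_cursor_position"`, and `to_pascal_case("take_screenshot")` gives `"TakeScreenshot"`.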
MCP Transport Modes
The MCP server supports two transport modes:
- stdio (default): Standard input/output transport
  - Used for local communication via stdin/stdout
  - Suitable for direct integration with MCP clients
  - Start with: python mcp_local.py stdio
- http: HTTP-based transport with stateless mode
  - Used for remote communication over HTTP
  - Suitable for service-to-service communication
  - Start with: python mcp_local.py http
  - Accessible at: http://localhost:8001/mcp
MCP Tool Registration Modes
The service supports two modes of MCP tool registration:
- Direct Tools (register_computer_tools): Tools that directly call the local computer control implementation. No endpoint parameter required.
  - Used in mcp_local.py for local MCP server
  - Tools execute computer control actions directly
- Client-based Tools (register_computer_tools_with_client): Tools that use an HTTP client to call a remote tool server. Requires an endpoint parameter.
  - Used in mcp_server/register.py for remote MCP server
  - Tools forward requests to a remote tool service via HTTP
The local MCP server (mcp_local.py) uses direct tools by default. The remote MCP server uses client-based tools.
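From the caller's side, the practical difference between the two modes is the extra endpoint argument. The sketch below shows one way to add it when targeting one of several tool nodes. The node names and URLs are hypothetical placeholders, and treating endpoint as the base URL of a tool service is an assumption; check the tool schema exposed by your MCP server for the exact format.

```python
# Hypothetical node registry; names and URLs are placeholders.
TOOL_NODES = {
    "node-a": "http://tool-a.example:8000",
    "node-b": "http://tool-b.example:8000",
}


def tool_arguments(params, node=None):
    """Return MCP tool-call arguments.

    Direct tools take `params` unchanged; client-based tools also
    need an `endpoint` naming the tool service to forward to.
    """
    args = dict(params)
    if node is not None:
        args["endpoint"] = TOOL_NODES[node]
    return args
```

With a fastmcp client, a call against the remote MCP server might then look like `await client.call_tool("move_mouse", tool_arguments({"x": 100, "y": 200}, node="node-a"))`.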
Security Considerations
Important Note
This service provides direct control over your computer's mouse and keyboard. Use with caution:
- Only run on trusted networks
- Restrict CORS origins in production (currently allows all origins)
- Enable API Key Authentication: Set API_KEY_ENABLED=true and configure a strong API_KEY in production
- Be aware of the security implications of remote computer control
API Key Authentication
The service supports optional API key authentication for securing service-to-service communication:
- Enable Authentication: Set API_KEY_ENABLED=true in your .env file
- Set API Key: Configure API_KEY=your-secret-api-key-here in your .env file
- Pass API Key in Requests: Include the API key in request headers:
  - X-API-Key: your-secret-api-key-here (recommended)
  - Authorization: Bearer your-secret-api-key-here (alternative)
Excluded Paths (no authentication required):
- /health - Health check endpoint
- /docs - API documentation
- /openapi.json - OpenAPI schema
- /redoc - Alternative API documentation
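The rules above amount to a small decision function. The following sketch restates them in Python to make the behavior concrete; it is not the project's middleware. Exact-match path comparison and giving X-API-Key precedence over the Authorization header are assumptions.

```python
import hmac

# Paths that skip authentication, as listed above
EXCLUDED_PATHS = {"/health", "/docs", "/openapi.json", "/redoc"}


def extract_api_key(headers):
    """Read the key from X-API-Key, falling back to a Bearer token."""
    normalized = {name.lower(): value for name, value in headers.items()}
    key = normalized.get("x-api-key")
    if key:
        return key
    scheme, _, token = normalized.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    return None


def is_request_allowed(path, headers, *, enabled, expected_key):
    """Return True if the request may proceed under the rules above."""
    if not enabled or path in EXCLUDED_PATHS:
        return True
    supplied = extract_api_key(headers)
    if supplied is None:
        return False
    # Constant-time comparison avoids leaking key contents through timing
    return hmac.compare_digest(supplied.encode("utf-8"), expected_key.encode("utf-8"))
```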
Logging
The service uses Loguru for structured logging with the following features:
- Request ID Tracking: Each request gets a unique ID that appears in all log entries
- Environment-aware: Console output in development, file logging in production
- Structured Format: Includes timestamp, level, request ID, module, function, and line number
Log files are stored in the logs/ directory:
- app_YYYY-MM-DD.log: General application logs
- error_YYYY-MM-DD.log: Error logs only
In development mode, logs are only output to the console. In production mode, logs are written to both console and files.
Testing
Run tests using pytest:
# Run all tests
uv run pytest
# Run specific test file
uv run pytest tests/test_mcp_client.py
# Run with verbose output
uv run pytest -v
The test suite includes:
- test_local_mcp_client.py: Tests for local MCP server with HTTP transport (direct tools)
- test_stdio_mcp_client.py: Tests for local MCP server with stdio transport (direct tools)
- test_mcp_client.py: Tests for remote MCP server with client-based tools (requires endpoint parameter)
Troubleshooting
Port Already in Use
If you get a port already in use error:
# Change ports in .env file
PORT=8002
MCP_PORT=8003
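Before changing ports, it can help to check which ones are actually free. This is a minimal sketch using the standard socket module; the helper name is illustrative, and it only tests the local IPv4 loopback address by default.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use"
            return False
        return True


# Example: check the default ports and the alternatives above
# for port in (8000, 8001, 8002, 8003):
#     print(port, "free" if port_is_free(port) else "in use")
```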
MCP Connection Issues
For HTTP transport, ensure the MCP server is running and accessible:
# Test MCP endpoint
curl http://localhost:8001/mcp
# Test with API key (if enabled)
curl -H "X-API-Key: your-secret-api-key-here" http://localhost:8001/mcp
For stdio transport, ensure the MCP server is started with stdio mode:
# Start MCP server in stdio mode
uv run python mcp_local.py stdio
API Key Authentication Issues
If you're getting authentication errors:
- Verify API_KEY_ENABLED is set correctly in .env
- Check that API_KEY matches between client and server
- Ensure the API key is passed in the X-API-Key header or Authorization: Bearer <key> header
- Check that the request path is not in the excluded paths list
Screenshot Issues
If screenshot functionality fails:
- Check Python version compatibility (requires Python >= 3.12)
- Verify display permissions on macOS/Linux
- Ensure PyAutoGUI and its dependencies are properly installed
License
MIT license
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.