Omniparser Autogui MCP
An MCP server built on OmniParser that automatically analyzes screen content and operates the GUI. It mainly supports Windows.
Rating: 2.5 points
Downloads: 37
What is Omniparser-Autogui-MCP?
Omniparser-Autogui-MCP is an MCP server that uses OmniParser to analyze screen content and operate the graphical user interface (GUI) through automated scripts. It is especially suitable for scenarios that require efficient screen parsing and automated operations.
How to use Omniparser-Autogui-MCP?
After installation, simply add server settings to the configuration file to start using it. You can specify the target window, language, and other options through configuration parameters.
Applicable scenarios
This tool is suitable for scenarios that require automated GUI operations, such as batch processing tasks, automated testing, data collection, etc.
Main features
Screen content analysis
Use OmniParser to parse elements such as text, images, and buttons on the screen.
Automated GUI operations
Automatically perform operations such as mouse clicks and keyboard inputs based on the analysis results.
Multilingual support
Support screen content parsing in multiple languages to meet internationalization requirements.
Flexible configuration
Allow customization of the target window, language environment, and OmniParser model path.
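The link between screen analysis and GUI operation can be illustrated with a minimal sketch. It assumes the parser yields labelled elements with bounding boxes expressed as fractions of the screen size; the `ParsedElement` type and both helper functions are hypothetical names used only for illustration and are not part of this project's or OmniParser's API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ParsedElement:
    """One detected UI element (hypothetical shape, chosen for this sketch)."""
    label: str
    # Bounding box as (left, top, right, bottom), each a fraction of the screen in [0, 1].
    bbox: Tuple[float, float, float, float]


def find_element(elements: List[ParsedElement], keyword: str) -> Optional[ParsedElement]:
    """Return the first element whose label contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    for element in elements:
        if needle in element.label.lower():
            return element
    return None


def click_point(element: ParsedElement, screen_width: int, screen_height: int) -> Tuple[int, int]:
    """Convert a normalized bounding box to the pixel at its center."""
    left, top, right, bottom = element.bbox
    x = round((left + right) / 2 * screen_width)
    y = round((top + bottom) / 2 * screen_height)
    return x, y


# Made-up analysis result for a 1920x1080 screen.
elements = [
    ParsedElement("Search box", (0.25, 0.125, 0.75, 0.375)),
    ParsedElement("Login button", (0.375, 0.5, 0.625, 0.75)),
]
target = find_element(elements, "search")
print(click_point(target, 1920, 1080))  # (960, 270)
```

An automation layer would pass the resulting pixel coordinates to whatever performs the actual mouse click.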
Advantages and limitations
Advantages
Powerful screen parsing ability
Efficient automated operations
Support for multiple operating systems and languages
Open source and free
Limitations
Support for complex GUIs may be limited
Dependent on the performance of OmniParser
Requires a certain technical background for initial configuration
How to use
Clone the repository
Run the following command to clone and initialize the project: `git clone --recursive https://github.com/NON906/omniparser-autogui-mcp.git`.
Install dependencies
After switching to the project directory, run `uv sync` to install the dependencies, then run `uv run download_models.py` to download the required models.
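The clone and install steps can be scripted. The sketch below wraps the commands quoted above in a small helper; `setup_commands` and `run_setup` are hypothetical names, and the script assumes `git` and `uv` are already on the PATH. It defaults to a dry run that only prints each step.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/NON906/omniparser-autogui-mcp.git"


def setup_commands(repo_url=REPO_URL):
    """Return (command, working directory) pairs for the documented setup steps."""
    # git names the checkout directory after the last URL segment, minus ".git".
    name = repo_url.rsplit("/", 1)[-1]
    if name.endswith(".git"):
        name = name[: -len(".git")]
    project_dir = Path(name)
    return [
        (["git", "clone", "--recursive", repo_url], Path(".")),
        (["uv", "sync"], project_dir),
        (["uv", "run", "download_models.py"], project_dir),
    ]


def run_setup(dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for command, cwd in setup_commands():
        print(f"[{cwd}] {' '.join(command)}")
        if not dry_run:
            # check=True raises CalledProcessError at the first failing step.
            subprocess.run(command, cwd=cwd, check=True)


run_setup(dry_run=True)
```

Calling `run_setup(dry_run=False)` would execute the three steps in order and stop at the first one that fails.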
Configure the server
Edit the `claude_desktop_config.json` file and add server settings.
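As a rough illustration of what such server settings might look like, the sketch below builds an entry for the `mcpServers` section of `claude_desktop_config.json` and prints it as JSON. The launch arguments, the server name, the project path, and the environment variable names (`TARGET_WINDOW_NAME`, `OCR_LANG`) are assumptions made for this example; check the project's README for the actual values.

```python
import json


def build_server_entry(project_dir, env=None):
    """Build one entry for the "mcpServers" section of claude_desktop_config.json."""
    return {
        "command": "uv",
        # Assumed launch arguments: run the project's entry point from its checkout.
        "args": ["--directory", project_dir, "run", "omniparser-autogui-mcp"],
        "env": dict(env or {}),
    }


def merge_into_config(config, name, entry):
    """Return a copy of the config with the entry added under "mcpServers"."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


entry = build_server_entry(
    r"C:\path\to\omniparser-autogui-mcp",  # placeholder path, replace with your checkout
    env={
        # Hypothetical variable names for the target window and language options.
        "TARGET_WINDOW_NAME": "Untitled - Notepad",
        "OCR_LANG": "en",
    },
)
config = merge_into_config({}, "omniparser_autogui_mcp", entry)
print(json.dumps(config, indent=2))
```

Merging into a copy rather than overwriting keeps any servers already present in the configuration file.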
Usage examples
Search for 'MCP server' in the browser
With the target window name configured, the server locates and clicks the search box in the browser, enters the keyword 'MCP server', and presses the Enter key.
Automatically fill out forms
With a specific window name configured, the server enters the username and password and clicks the login button.
Frequently Asked Questions
How to solve the installation failure problem?
Does it support Chinese?
How to debug automated scripts?
Related resources
OmniParser official documentation
Core documentation and tutorials for OmniParser.
GitHub repository
Source code and examples for Omniparser-Autogui-MCP.
YouTube tutorial
Quick start video.