MCP-based Image Analysis and Description Server with Cloud and Local Model Integration

Image Recognition MCP

An image recognition server based on the Model Context Protocol that provides image analysis and description functions through OpenAI-compatible vision models, supporting cloud and local model integration.


rating : 2 points

downloads : 6.3K

update time : 2025-12-03


What is the Image Recognition MCP Server?

This is an image analysis tool that uses AI vision models to identify the content of an image and return a detailed text description. It works with OpenAI's official vision models and with models served locally through OpenAI-compatible servers such as LM Studio or Ollama, so AI assistants can "understand" pictures.

How to use the Image Recognition MCP Server?

Configure an API key or a local model server, then pass an image URL or a local file path to the server. It returns a detailed description of the image. The server can be integrated into any AI assistant that supports MCP, such as Claude Desktop.

Applicable Scenarios

Suitable for various scenarios that require image analysis: content review, image description generation, visual assistance, education and learning, creative design assistance, social media content analysis, etc.

Main Features

Intelligent Image Analysis

Use advanced AI vision models to analyze image content, identify elements such as objects, scenes, text, and people, and provide natural language descriptions.

Multi-Model Support

Supports official OpenAI vision models (such as GPT-4o) and locally deployed models served through OpenAI-compatible servers (such as LM Studio or Ollama), so you can choose between cloud and local inference.

MCP Protocol Compatibility

Fully complies with the Model Context Protocol standard and can be seamlessly integrated into AI assistants and applications that support MCP.

Secure File Access

Controls access to local files through path whitelisting and file type restrictions to protect the host system.

Easy-to-Use API

Provides a simple interface: pass an image URL or path, plus an optional prompt, to obtain a detailed image description.

Advantages

Supports multiple vision models, including cloud and locally deployed options

Easy to integrate into existing AI assistant workflows

Provides detailed and accurate image descriptions and analysis

Restricts local file access by path and file type, and can restrict image URLs by domain

Open source and free, can be customized and extended

Limitations

Requires an API key or local model server support

Remote images require a stable network connection

Analysis of some complex images may not be accurate enough

Local models may require powerful hardware

How to Use

Installation and Configuration

Ensure that Node.js 18+ is installed, and then add the server configuration to the MCP client configuration. You need to set the OPENAI_API_KEY environment variable (even for local models, a placeholder value is required).

Configure the Model Server

Configure the model according to your needs: use the official OpenAI API or set up a local model server (such as LM Studio, Ollama).

Set Security Options

Configure security options as needed: allowed local file paths, allowed domains, etc., to ensure system security.

Use the Image Analysis Function

Call the describe-image tool through the AI assistant, provide the image URL or local path, and you can obtain the image description.

Usage Examples

Analyze an Online Image

Analyze an image from the Internet to obtain a content description

Analyze a Local Product Image

Analyze a locally stored product image for e-commerce or inventory management

Image Analysis in an Educational Scenario

Analyze images in educational materials to assist learning

Frequently Asked Questions

Do I need an OpenAI API key?

Which image formats are supported?

How to configure a local model server?

What should I do if the server fails to start?

How to ensure the security of local file access?

Which AI assistants are supported?

Related Resources

GitHub Repository

Project source code and latest updates

Model Context Protocol Documentation

Official documentation of the MCP protocol

OpenAI Vision Model Documentation

Guide to using OpenAI vision models

LM Studio Official Website

Local model server LM Studio

Ollama Official Website

Local model server Ollama

🚀 Image Recognition MCP Server

A Model Context Protocol (MCP) server that offers AI-driven image recognition and description capabilities using OpenAI-compatible vision models.

🚀 Quick Start

This MCP server allows AI assistants to analyze and describe images via a straightforward URL-based interface. It supports OpenAI's vision models and OpenAI-compatible local models (like LM Studio, Ollama, etc.), offering detailed image descriptions and facilitating the integration of image analysis capabilities into your AI workflows.

✨ Features

Image Analysis: Analyze images from URLs or local file paths and obtain detailed descriptions.
Flexible Model Support: Works with OpenAI's vision models and OpenAI-compatible local models (LM Studio, Ollama, etc.).
MCP Protocol: Fully compliant with the Model Context Protocol standard.
TypeScript: Built with TypeScript for type safety and an improved development experience.
Simple API: An easy-to-use interface for image description requests.

📦 Installation

Prerequisites

Node.js 18+
npm or yarn
OpenAI API key or local vision model server (e.g., LM Studio, Ollama)

MCP Client Configuration

To use this server with an MCP client, add the following configuration:

{
  "mcpServers": {
    "image-recognition": {
      "command": "npx",
      "args": ["-y", "@akirose/image-recognition-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-actual-openai-api-key-here"
      }
    }
  }
}

To allow access to image files from any path, set ALLOW_ALL_PATHS to true:

{
  "mcpServers": {
    "image-recognition": {
      "command": "npx",
      "args": ["-y", "@akirose/image-recognition-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-actual-openai-api-key-here",
        "ALLOW_ALL_PATHS": "true"
      }
    }
  }
}

⚠️ Important Note

The env section must set OPENAI_API_KEY; the MCP server cannot function without it. For local models, you can use any placeholder value for OPENAI_API_KEY and set OPENAI_BASE_URL to point to your local server.

Environment Variables

The server supports the following environment variables:

Property	Details
`OPENAI_API_KEY`	Your OpenAI API key, or any placeholder value when using local models (required)
`OPENAI_BASE_URL`	Base URL for OpenAI API or OpenAI-compatible API servers (optional, defaults to OpenAI's official API). Example for LM Studio: `"http://127.0.0.1:1234/v1"`. Example for Ollama: `"http://localhost:11434/v1"`
`OPENAI_MODEL`	The vision model to use for image recognition (optional, defaults to "gpt-5-mini"). For OpenAI: `"gpt-5-mini"`, `"gpt-4o"`, `"gpt-4o-mini"`, etc. For local models: `"llava"`, `"qwen/qwen3-vl-4b"`, or any locally available vision model
`ALLOWED_IMAGE_PATHS`	Comma-separated list of allowed local file paths (optional, defaults to "./images,./assets"). Example: `"./images,./assets,./downloads"`
`ALLOW_ALL_PATHS`	Set to "true" to allow access to image files from any path. When enabled, only image file extensions (.jpg, .jpeg, .png, .gif, .webp) are allowed for security (optional, defaults to false)
`ALLOWED_DOMAINS`	Comma-separated list of allowed URL domains for enhanced security (optional). Example: `"example.com,cdn.example.com,images.example.org"`. When not set, all domains are allowed. When set, only the listed domains are allowed for URL-based image requests
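
As a rough illustration of how the variables above fit together, the sketch below reads them into a single config object and applies the documented defaults. The `ServerConfig` shape and the `loadConfig` and `parseList` helpers are hypothetical names chosen for this example, not the project's actual code.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface ServerConfig {
  apiKey: string;
  baseUrl: string | undefined; // undefined: fall back to OpenAI's official API
  model: string;
  allowedImagePaths: string[];
  allowAllPaths: boolean;
  allowedDomains: string[]; // empty: all domains are allowed
}

type Env = Record<string, string | undefined>;

// Split a comma-separated value into trimmed, non-empty entries.
function parseList(value: string | undefined, fallback: string[]): string[] {
  if (value === undefined || value.trim() === "") return fallback;
  return value
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

// Apply the defaults documented in the table above.
function loadConfig(env: Env): ServerConfig {
  const apiKey = env["OPENAI_API_KEY"];
  if (!apiKey) {
    throw new Error("OPENAI_API_KEY is required (use a placeholder for local models)");
  }
  return {
    apiKey,
    baseUrl: env["OPENAI_BASE_URL"] || undefined,
    model: env["OPENAI_MODEL"] || "gpt-5-mini",
    allowedImagePaths: parseList(env["ALLOWED_IMAGE_PATHS"], ["./images", "./assets"]),
    allowAllPaths: env["ALLOW_ALL_PATHS"] === "true",
    allowedDomains: parseList(env["ALLOWED_DOMAINS"], []),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.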

💻 Usage Examples

Basic Usage

`describe-image`

Analyzes an image from a URL or local file path and provides a detailed description.

Parameters:

imageUrl (string): The URL of the image to analyze, or a local file path
prompt (string, optional): The question or prompt to ask about the image (defaults to "what's in this image?")
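
A minimal sketch of how these two parameters could be validated before use, assuming that anything starting with `http://` or `https://` is treated as a remote URL and everything else as a local path. The `parseDescribeImageArgs` helper and the `isRemote` flag are hypothetical and only illustrate the idea.

```typescript
interface DescribeImageArgs {
  imageUrl: string;
  prompt: string;
  isRemote: boolean; // assumption: http(s) URLs are remote, anything else is a local path
}

const DEFAULT_PROMPT = "what's in this image?";

// Validate raw tool arguments and fill in the documented default prompt.
function parseDescribeImageArgs(raw: unknown): DescribeImageArgs {
  if (typeof raw !== "object" || raw === null) {
    throw new TypeError("arguments must be an object");
  }
  const { imageUrl, prompt } = raw as { imageUrl?: unknown; prompt?: unknown };
  if (typeof imageUrl !== "string" || imageUrl.trim() === "") {
    throw new TypeError("imageUrl must be a non-empty string");
  }
  let promptText = DEFAULT_PROMPT;
  if (prompt !== undefined) {
    if (typeof prompt !== "string") {
      throw new TypeError("prompt must be a string when provided");
    }
    promptText = prompt;
  }
  return {
    imageUrl,
    prompt: promptText,
    isRemote: /^https?:\/\//i.test(imageUrl),
  };
}
```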

Example with URL:

{
  "tool": "describe-image",
  "arguments": {
    "imageUrl": "https://example.com/image.jpg",
    "prompt": "what's in this image?"
  }
}

Example with local file:

{
  "tool": "describe-image",
  "arguments": {
    "imageUrl": "./images/my-image.png",
    "prompt": "Describe the objects in this image"
  }
}

Response:

{
  "content": [
    {
      "type": "text",
      "text": "The image shows a beautiful sunset over a mountain landscape with vibrant orange and pink colors in the sky..."
    }
  ]
}
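
To make the request and response shapes above more concrete, here is a sketch of the payload such a tool might send to an OpenAI-compatible `/chat/completions` endpoint, and of how the model's reply could be wrapped into the MCP result shown above. The message format follows the OpenAI chat completions vision convention. The helper names `buildVisionRequest` and `toToolResult` are hypothetical, and it is an assumption (not confirmed by this document) that local files are inlined as `data:` URLs before being sent.

```typescript
// Message content parts in the OpenAI chat completions vision format.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface ChatRequest {
  model: string;
  messages: { role: "user"; content: ContentPart[] }[];
}

// Hypothetical helper: builds the JSON body for POST {OPENAI_BASE_URL}/chat/completions.
// `imageUrl` is an http(s) URL or, by assumption, a data: URL for a local file.
function buildVisionRequest(
  model: string,
  imageUrl: string,
  prompt: string = "what's in this image?",
): ChatRequest {
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}

// Wrap the model's text reply in the MCP tool-result shape.
function toToolResult(description: string): { content: { type: "text"; text: string }[] } {
  return { content: [{ type: "text", text: description }] };
}
```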

Advanced Usage

This MCP server can be integrated with various AI assistants that support the MCP protocol, such as:

Claude Desktop
Other MCP-compatible AI systems

📚 Documentation

Project Structure

image-recognition-mcp/
├── src/
│   ├── index.ts                # Main server implementation
│   ├── path-validator.ts       # Path validation and security functions
│   └── image-processor.ts      # Image processing utilities
├── test/
│   ├── index.test.ts           # Unit tests
│   ├── describe-image-integration.test.ts  # Integration tests
│   ├── test.png                # Test image
│   └── README.md               # Test documentation
├── dist/                       # Compiled JavaScript output
├── package.json                # Project dependencies and scripts
├── tsconfig.json               # TypeScript configuration
└── README.md                   # This file

Running Tests

The project includes both unit tests and integration tests:

# Run all tests
npm test

# Run unit tests only
npm run test:unit

# Run integration tests with local OpenAI-compatible server
npm run test:integration

Integration Tests Requirements:

A running OpenAI-compatible API server at http://127.0.0.1:1234/v1
The server should support vision models (e.g., qwen/qwen3-vl-4b)
You can use LM Studio, Ollama, or other compatible servers
The integration tests use the OPENAI_BASE_URL and OPENAI_MODEL environment variables

The integration tests will:

Test actual API calls to the vision model
Verify image processing with the test image (test/test.png)
Validate the complete MCP tool workflow with both default and custom prompts
Test error handling and edge cases

Security Features

The server includes several security features:

Path Validation: Restricts local file access to allowed directories
Extension Validation: Only allows specific image file extensions (.jpg, .jpeg, .png, .gif, .webp)
Domain Restriction: Optional URL domain whitelist for enhanced security
File Existence Checks: Validates files exist before processing
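
The checks listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `path-validator.ts`: the function names are hypothetical, domain matching is assumed to be an exact hostname match (the documented example lists `example.com` and `cdn.example.com` separately), and symbolic links are not resolved.

```typescript
import * as path from "node:path";

const IMAGE_EXTENSIONS = new Set([".jpg", ".jpeg", ".png", ".gif", ".webp"]);

// Extension check, case-insensitive.
function hasAllowedExtension(filePath: string): boolean {
  return IMAGE_EXTENSIONS.has(path.extname(filePath).toLowerCase());
}

// True when `filePath` resolves to a location strictly inside one of `allowedDirs`.
function isInsideAllowedDir(filePath: string, allowedDirs: string[]): boolean {
  const resolved = path.resolve(filePath);
  return allowedDirs.some((dir) => {
    const rel = path.relative(path.resolve(dir), resolved);
    const escapes = rel === ".." || rel.startsWith(".." + path.sep);
    return rel !== "" && !escapes && !path.isAbsolute(rel);
  });
}

// Combined local-file check: the extension is always enforced,
// the directory check is skipped only when allowAllPaths is true.
function isPathAllowed(filePath: string, allowedDirs: string[], allowAllPaths: boolean): boolean {
  if (!hasAllowedExtension(filePath)) return false;
  return allowAllPaths || isInsideAllowedDir(filePath, allowedDirs);
}

// Domain check: an empty list means every domain is allowed.
// Assumption: exact hostname match, no implicit subdomain matching.
function isDomainAllowed(imageUrl: string, allowedDomains: string[]): boolean {
  if (allowedDomains.length === 0) return true;
  let hostname: string;
  try {
    hostname = new URL(imageUrl).hostname.toLowerCase();
  } catch (err) {
    return false; // not a parseable URL
  }
  return allowedDomains.some((domain) => domain.toLowerCase() === hostname);
}
```

Resolving paths before comparing them is what blocks `../` traversal. A stricter variant could also resolve symbolic links (for example with `fs.realpath`) before the directory check.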

Error Handling

The server includes robust error handling for:

Invalid image URLs
Unauthorized file paths or domains
Network connectivity issues
OpenAI API errors
Invalid input parameters
Unsupported file formats
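
One common way for an MCP tool to report failures like these is to return a normal tool result with `isError` set to `true`, so the assistant sees the message instead of a transport-level failure. The sketch below shows that pattern. Whether this server reports errors this way is an assumption, and `errorToToolResult` is a hypothetical helper name.

```typescript
// Tool result shape used by MCP; `isError` marks a failed tool call.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Convert any thrown value into an error result the assistant can read.
function errorToToolResult(err: unknown): ToolResult {
  const message = err instanceof Error ? err.message : String(err);
  return {
    content: [{ type: "text", text: `Image analysis failed: ${message}` }],
    isError: true,
  };
}
```

A tool handler would wrap its work in `try`/`catch` and return `errorToToolResult(err)` from the `catch` branch.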

🔧 Troubleshooting

Common Issues

Server fails to start or doesn't work:

✅ Check if OpenAI API key is set: This is the #1 cause of issues
```
echo $OPENAI_API_KEY  # Should show your API key
```
✅ Verify API key is valid: Test with OpenAI's API directly
✅ Check API key has sufficient credits: Ensure your OpenAI account has available credits

"Authentication failed" errors: