🚀 Image Gen MCP Server
Empowering Universal Image Generation for AI Chatbots
Traditional AI chatbot interfaces are limited to text-only interactions, regardless of how powerful their underlying language models are. Image Gen MCP Server bridges this gap by enabling any LLM-powered chatbot client to generate professional-quality images through the standardized Model Context Protocol (MCP).
Whether you're using Claude Desktop, a custom ChatGPT interface, Llama-based applications, or any other LLM client that supports MCP, this server democratizes access to multiple AI image generation models, including OpenAI's gpt-image-1, dall-e-3, and dall-e-2 and Google's Imagen series (imagen-4, imagen-4-ultra, imagen-3). It transforms text-only conversations into rich, visual experiences.
⚠️ Important Note
This project uses UV for fast, reliable Python package management. UV provides better dependency resolution, faster installs, and proper environment isolation compared to traditional pip/venv workflows.
🚀 Quick Start
Prerequisites
- Python 3.10+
- UV package manager
- OpenAI API key (for OpenAI models)
- Google Gemini API key (for Gemini models, optional)
Installation
1. Clone and set up:
git clone <repository-url>
cd image-gen-mcp
uv sync
2. Configure environment:
cp .env.example .env
3. Test the setup:
uv run python scripts/dev.py setup
uv run python scripts/dev.py test
Running the Server
Development Mode
./run.sh dev
./run.sh dev --tools
./run.sh stdio
./run.sh prod
Manual Execution
uv run python -m gpt_image_mcp.server
uv run python -m gpt_image_mcp.server --transport streamable-http --port 3001
uv run python -m gpt_image_mcp.server --transport sse --port 8080
uv run python -m gpt_image_mcp.server --config /path/to/.env --log-level DEBUG
uv run python -m gpt_image_mcp.server --transport streamable-http --cors
Command Line Options
uv run python -m gpt_image_mcp.server --help
Image Gen MCP Server - Generate and edit images using OpenAI and Google Gemini image models
options:
--config PATH Path to configuration file (.env format)
--log-level LEVEL Set logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
--transport TYPE Transport method (stdio, sse, streamable-http)
--port PORT Port for HTTP transports (default: 3001)
--host HOST Host address for HTTP transports (default: 127.0.0.1)
--cors Enable CORS for web deployments
--version Show version information
--help Show help message
Examples:
# Claude Desktop integration
uv run python -m gpt_image_mcp.server
# Web deployment over HTTP
uv run python -m gpt_image_mcp.server --transport streamable-http --port 3001
# Development with debug logging and CORS enabled
uv run python -m gpt_image_mcp.server --log-level DEBUG --cors
MCP Client Integration
This server works with any MCP-compatible chatbot client. Here are configuration examples:
Claude Desktop (Anthropic)
{
"mcpServers": {
"image-gen-mcp": {
"command": "uv",
"args": [
"--directory",
"/path/to/image-gen-mcp",
"run",
"image-gen-mcp"
],
"env": {
"OPENAI_API_KEY": "your-api-key-here"
}
}
}
}
Continue.dev (VS Code Extension)
{
"mcpServers": {
"gpt-image": {
"command": "uv",
"args": ["--directory", "/path/to/image-gen-mcp", "run", "image-gen-mcp"],
"env": {
"OPENAI_API_KEY": "your-api-key-here"
}
}
}
}
Custom MCP Clients
For other MCP-compatible applications, use the standard MCP STDIO transport:
uv run python -m gpt_image_mcp.server
💡 Usage Tip
This server follows the standard MCP protocol, ensuring compatibility with current and future MCP-enabled clients across the AI ecosystem.
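For clients built from scratch, the wire format is JSON-RPC 2.0 as defined by the MCP specification. A minimal sketch of the `tools/call` request a client would send over the STDIO transport (the argument values here are illustrative):

```python
import json

def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build the JSON-RPC 2.0 `tools/call` message an MCP client
    sends to the server over STDIO (newline-delimited JSON)."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)

request = make_tool_call(1, "generate_image", {"prompt": "A sunset", "quality": "high"})
print(request)
```

In practice the official MCP SDKs handle framing, initialization, and responses for you; this only shows the shape of the payload.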
✨ Features
- 🌐 Universal Compatibility: Works with any MCP-enabled LLM client
- 🔄 Seamless Integration: No context switching or workflow interruption
- ⚡ Standardized Protocol: One server, multiple client support
- 🎨 Multi-Provider Support: Access to OpenAI and Google's latest image generation models
- 🔧 Unified Interface: Single API for multiple AI providers with automatic model discovery
🎨 Multi-Provider Image Generation
- Multiple AI Models: Support for OpenAI (gpt-image-1, dall-e-3, dall-e-2) and Google Gemini (imagen-4, imagen-4-ultra, imagen-3)
- Text-to-Image: Generate high-quality images from text descriptions
- Image Editing: Edit existing images with text instructions (OpenAI models)
- Multiple Formats: Support for PNG, JPEG, and WebP output formats
- Quality Control: Auto, high, medium, and low quality settings
- Background Control: Transparent, opaque, or auto background options
- Dynamic Model Discovery: Query available models and capabilities at runtime
🔗 MCP Integration
- FastMCP Framework: Built with the latest MCP Python SDK
- Multiple Transports: STDIO, HTTP, and SSE transport support
- Structured Output: Validated tool responses with proper schemas
- Resource Access: MCP resources for image retrieval and management
- Prompt Templates: 10+ built-in templates for common use cases
💾 Storage & Caching
- Local Storage: Organized directory structure with metadata
- URL-based Access: Transport-aware URL generation for images
- Dual Access: Immediate base64 data + persistent resource URIs
- Smart Caching: Memory-based caching with TTL and Redis support
- Auto Cleanup: Configurable file retention policies
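The memory backend's TTL behavior can be sketched in a few lines; this is an illustrative stand-in, not the server's actual cache implementation (which lives in utils/cache.py and can also use Redis):

```python
import time

class TTLCache:
    """Minimal in-memory cache with per-entry time-to-live,
    illustrating the memory backend described above."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, object]] = {}

    def set(self, key: str, value: object) -> None:
        # Record the expiry time alongside the value.
        self._store[key] = (time.monotonic() + self.ttl, value)

    def get(self, key: str):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # lazy eviction on read
            return None
        return value

cache = TTLCache(ttl_seconds=24 * 3600)  # mirrors CACHE__TTL_HOURS=24
cache.set("img_abc", b"...png bytes...")
print(cache.get("img_abc") is not None)  # True while within the TTL
```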
🚀 Production Deployment
- Docker Support: Production-ready Docker containers
- Multi-Transport: STDIO for Claude Desktop, HTTP for web deployment
- Reverse Proxy: Nginx configuration with rate limiting
- Monitoring: Grafana and Prometheus integration
- SSL/TLS: Automatic certificate management with Certbot
🛠️ Development Features
- Type Safety: Full type hints with Pydantic models
- Error Handling: Comprehensive error handling and logging
- Configuration: Environment-based configuration management
- Testing: Pytest-based test suite with async support
- Dev Tools: Hot reload, Redis Commander, debug logging
📦 Installation
The installation steps are included in the "Quick Start" section above.
💻 Usage Examples
Basic Usage
# Assumes an initialized MCP ClientSession connected to this server
result = await session.call_tool(
"generate_image",
arguments={
"prompt": "A beautiful sunset over mountains, digital art style",
"quality": "high",
"size": "1536x1024",
"style": "vivid"
}
)
Advanced Usage
prompt_result = await session.get_prompt(
"social_media_prompt",
arguments={
"platform": "instagram",
"content_type": "product announcement",
"brand_style": "modern minimalist"
}
)
image_data = await session.read_resource("generated-images://img_20250630143022_abc123")
history = await session.read_resource("image-history://recent?limit=5")
stats = await session.read_resource("storage-stats://overview")
📚 Documentation
Available Tools
list_available_models
List all available image generation models and their capabilities.
Returns: Dictionary with model information, capabilities, and provider details.
generate_image
Generate images from text descriptions using any supported model.
Parameters:
- prompt (required): Text description of the desired image
- model (optional): Model to use (e.g., "gpt-image-1", "dall-e-3", "imagen-4")
- quality: "auto" | "high" | "medium" | "low" (default: "auto")
- size: "1024x1024" | "1536x1024" | "1024x1536" (default: "1536x1024")
- style: "vivid" | "natural" (default: "vivid")
- output_format: "png" | "jpeg" | "webp" (default: "png")
- background: "auto" | "transparent" | "opaque" (default: "auto")
Note: Parameter availability depends on the selected model. Use list_available_models to check capabilities.
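A client may want to sanity-check arguments before calling the tool. This is a hypothetical pre-flight helper mirroring the allowed values listed above; the server performs its own authoritative validation:

```python
# Allowed values copied from the generate_image parameter list above.
ALLOWED = {
    "quality": {"auto", "high", "medium", "low"},
    "size": {"1024x1024", "1536x1024", "1024x1536"},
    "style": {"vivid", "natural"},
    "output_format": {"png", "jpeg", "webp"},
    "background": {"auto", "transparent", "opaque"},
}

def validate_arguments(args: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the
    arguments look valid."""
    problems = []
    if not args.get("prompt"):
        problems.append("prompt is required")
    for key, allowed in ALLOWED.items():
        if key in args and args[key] not in allowed:
            problems.append(f"{key} must be one of {sorted(allowed)}")
    return problems

print(validate_arguments({"prompt": "A sunset", "size": "512x512"}))
```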
edit_image
Edit existing images with text instructions.
Parameters:
- image_data (required): Base64-encoded image or data URL
- prompt (required): Edit instructions
- mask_data (optional): Mask for targeted editing
- size, quality, output_format: Same as generate_image
Available Resources
- generated-images://{image_id} - Access specific generated images
- image-history://recent - Browse recent generation history
- storage-stats://overview - Storage usage and statistics
- model-info://gpt-image-1 - Model capabilities and pricing
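These URIs follow ordinary URL syntax, so the standard library can take them apart. A sketch only; clients normally pass the URI straight to session.read_resource():

```python
from urllib.parse import urlparse, parse_qs

def parse_resource_uri(uri: str):
    """Split an MCP resource URI such as image-history://recent?limit=5
    into (scheme, identifier, query params)."""
    parsed = urlparse(uri)
    params = {k: v[0] for k, v in parse_qs(parsed.query).items()}
    return parsed.scheme, parsed.netloc, params

print(parse_resource_uri("image-history://recent?limit=5"))
```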
Prompt Templates
Built-in templates for common use cases:
- Creative Image: Artistic image generation
- Product Photography: Commercial product images
- Social Media Graphics: Platform-optimized posts
- Blog Headers: Article header images
- OG Images: Social media preview images
- Hero Banners: Website hero sections
- Email Headers: Newsletter headers
- Video Thumbnails: YouTube/video thumbnails
- Infographics: Data visualization images
- Artistic Style: Specific art movement styles
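Conceptually, each template is a parameterized prompt string. A minimal sketch of template expansion, with invented wording (the real templates live in prompts/ and are served via session.get_prompt):

```python
# Hypothetical template text; only the parameter names match the
# social_media_prompt example shown in Advanced Usage above.
TEMPLATES = {
    "social_media_prompt": (
        "Create a {brand_style} {content_type} graphic for {platform}, "
        "sized and framed for the platform's feed."
    ),
}

def render_prompt(name: str, **arguments: str) -> str:
    """Fill a named template with the caller's arguments."""
    return TEMPLATES[name].format(**arguments)

print(render_prompt(
    "social_media_prompt",
    platform="instagram",
    content_type="product announcement",
    brand_style="modern minimalist",
))
```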
Configuration
Configure via environment variables or a .env file:
PROVIDERS__OPENAI__API_KEY=sk-your-openai-api-key-here
PROVIDERS__OPENAI__BASE_URL=https://api.openai.com/v1
PROVIDERS__OPENAI__ORGANIZATION=org-your-org-id
PROVIDERS__OPENAI__TIMEOUT=300.0
PROVIDERS__OPENAI__MAX_RETRIES=3
PROVIDERS__OPENAI__ENABLED=true
PROVIDERS__GEMINI__API_KEY=your-gemini-api-key-here
PROVIDERS__GEMINI__BASE_URL=https://generativelanguage.googleapis.com/v1beta/
PROVIDERS__GEMINI__TIMEOUT=300.0
PROVIDERS__GEMINI__MAX_RETRIES=3
PROVIDERS__GEMINI__ENABLED=false
PROVIDERS__GEMINI__DEFAULT_MODEL=imagen-4
IMAGES__DEFAULT_MODEL=gpt-image-1
IMAGES__DEFAULT_QUALITY=auto
IMAGES__DEFAULT_SIZE=1536x1024
IMAGES__DEFAULT_STYLE=vivid
IMAGES__DEFAULT_MODERATION=auto
IMAGES__DEFAULT_OUTPUT_FORMAT=png
IMAGES__BASE_HOST=
SERVER__NAME=Image Gen MCP Server
SERVER__VERSION=0.1.0
SERVER__PORT=3001
SERVER__HOST=127.0.0.1
SERVER__LOG_LEVEL=INFO
SERVER__RATE_LIMIT_RPM=50
STORAGE__BASE_PATH=./storage
STORAGE__RETENTION_DAYS=30
STORAGE__MAX_SIZE_GB=10.0
STORAGE__CLEANUP_INTERVAL_HOURS=24
CACHE__ENABLED=true
CACHE__TTL_HOURS=24
CACHE__BACKEND=memory
CACHE__MAX_SIZE_MB=500
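The double-underscore keys follow the nested-settings convention (as in pydantic-settings' env_nested_delimiter). A stdlib-only sketch of how such flat keys fold into a nested config structure:

```python
def nest_env(env: dict[str, str], delimiter: str = "__") -> dict:
    """Fold flat KEY__SUBKEY=value pairs into a nested dict,
    mirroring the PROVIDERS__OPENAI__API_KEY convention above."""
    config: dict = {}
    for key, value in env.items():
        node = config
        *parents, leaf = key.lower().split(delimiter)
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return config

config = nest_env({
    "PROVIDERS__OPENAI__API_KEY": "sk-your-openai-api-key-here",
    "SERVER__PORT": "3001",
})
print(config["providers"]["openai"]["api_key"])
```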
🔧 Technical Details
The server follows a modular, production-ready architecture:
Core Components
- Server Layer (server.py): FastMCP-based MCP server with multi-transport support
- Configuration (config/): Environment-based settings management with validation
- Tool Layer (tools/): Image generation and editing capabilities
- Resource Layer (resources/): MCP resources for data access and model registry
- Storage Manager (storage/): Organized local image storage with cleanup
- Cache Manager (utils/cache.py): Memory- and Redis-based caching system
Multi-Provider Architecture
- Provider Registry (providers/registry.py): Centralized provider and model management
- Provider Base (providers/base.py): Abstract base class for all providers
- OpenAI Provider (providers/openai.py): OpenAI API integration with retry logic
- Gemini Provider (providers/gemini.py): Google Gemini API integration
- Type System (types/): Pydantic models for type safety
- Validation (utils/validators.py): Input validation and sanitization
Infrastructure
- Prompt Templates (prompts/): Template system for optimized prompts
- Dynamic Model Discovery: Runtime model capability detection
- Parameter Translation: Automatic parameter mapping between providers
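To illustrate parameter translation: the unified interface exposes one set of argument names, and each provider class maps them to its own API's fields. The field names below are assumptions for illustration, not the server's actual mapping:

```python
def translate_for_gemini(request: dict) -> dict:
    """Map unified generate_image arguments to a hypothetical
    Imagen-style payload (pixel size becomes an aspect ratio)."""
    size_to_aspect = {
        "1024x1024": "1:1",
        "1536x1024": "3:2",
        "1024x1536": "2:3",
    }
    return {
        "prompt": request["prompt"],
        "aspectRatio": size_to_aspect.get(request.get("size", "1536x1024"), "1:1"),
        "numberOfImages": 1,
    }

print(translate_for_gemini({"prompt": "A sunset", "size": "1024x1536"}))
```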
Deployment
- Docker Support: Development and production containers
- Multi-Transport: STDIO, HTTP, SSE transport layers
- Monitoring: Prometheus metrics and Grafana dashboards
- Reverse Proxy: Nginx configuration with SSL and rate limiting
📄 License
MIT License - see LICENSE file for details.
🖼️ Visual Showcase
Real-World Usage
Claude Desktop seamlessly generating images through MCP integration
Generated Examples
High-quality images generated through the MCP server, demonstrating professional-grade output
📋 Use Cases & Applications
🎯 Content Creation Workflows
- Bloggers & Writers: Generate custom illustrations directly in writing tools
- Social Media Managers: Create platform-specific graphics without leaving chat interfaces
- Marketing Teams: Rapid prototyping of visual concepts during brainstorming sessions
- Educators: Generate teaching materials and visual aids on-demand
🚀 Development & Design
- UI/UX Designers: Quick mockup generation during design discussions
- Frontend Developers: Placeholder and concept images within development environments
- Technical Writers: Custom diagrams and illustrations for documentation
- Product Managers: Visual concept communication in any LLM-powered tool
🏢 Enterprise Integration
- Customer Support: Generate visual explanations and guides
- Sales Teams: Custom presentation materials tailored to client needs
- Training Programs: Visual learning materials created in conversational interfaces
- Internal Tools: Add image generation to existing LLM-powered applications
🎨 Creative Industries
- Game Developers: Concept art and asset ideation
- Film & Media: Storyboard and concept visualization
- Architecture: Quick visual references and mood boards
- Advertising: Campaign concept development
💰 Cost Estimation
- Text Input: ~$5 per 1M tokens
- Image Output: ~$40 per 1M tokens (~1750 tokens per image)
- Typical Cost: ~$0.07 per image generation
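The per-image figure follows directly from the token prices above; a quick arithmetic check (the 50-token prompt size is an assumption):

```python
# Token prices and per-image token count copied from the list above.
INPUT_COST_PER_M_TOKENS = 5.00    # USD per 1M input tokens
OUTPUT_COST_PER_M_TOKENS = 40.00  # USD per 1M output tokens
TOKENS_PER_IMAGE = 1750

def image_cost(prompt_tokens: int = 50) -> float:
    """Estimated USD cost of a single generation request."""
    output = TOKENS_PER_IMAGE / 1_000_000 * OUTPUT_COST_PER_M_TOKENS
    inputs = prompt_tokens / 1_000_000 * INPUT_COST_PER_M_TOKENS
    return output + inputs

print(f"~${image_cost():.2f} per image")  # ~$0.07, matching the estimate above
```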
🛡️ Error Handling
Comprehensive error handling includes:
- API rate limiting and retries
- Invalid parameter validation
- Storage error recovery
- Cache failure fallbacks
- Detailed error logging
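The retry behavior can be sketched as a generic exponential-backoff wrapper. This is illustrative; the server's own retry logic lives in the provider classes and may differ in detail:

```python
import time

def with_retries(call, max_retries: int = 3, base_delay: float = 0.5):
    """Call `call()` up to max_retries + 1 times, doubling the delay
    after each failure (0.5s, 1s, 2s, ...); re-raise on exhaustion."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except Exception:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Simulated flaky API call that succeeds on the third attempt.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise TimeoutError("simulated rate limit")
    return "ok"

print(with_retries(flaky, base_delay=0.01))
```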
🔐 Security
Security features include:
- OpenAI API key protection
- Input validation and sanitization
- File system access controls
- Rate limiting protection
- No credential exposure in logs
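One way to keep credentials out of logs is to scrub key-shaped strings before emitting a line. A hypothetical redaction filter for OpenAI-style keys (not the server's actual implementation):

```python
import re

# Matches OpenAI-style secret keys such as sk-abc123... (8+ chars after the prefix).
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9_-]{8,}")

def redact(line: str) -> str:
    """Mask any sk-... API key appearing in a log line."""
    return SECRET_PATTERN.sub("sk-***", line)

print(redact("request failed for key sk-abc123XYZ987secret"))
# → request failed for key sk-***
```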
🤝 Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Run the test suite
- Submit a pull request
🆘 Support
For issues and questions:
- Check the troubleshooting guide
- Review common issues
- Open an issue on GitHub
🚢 Deployment
Production Deployment
The server supports production deployment with Docker, monitoring, and reverse proxy:
./run.sh prod
docker-compose -f docker-compose.prod.yml up -d
Production Stack includes:
- Image Gen MCP Server: Main application container
- Redis: Caching and session storage
- Nginx: Reverse proxy with rate limiting (configured separately)
- Prometheus: Metrics collection
- Grafana: Monitoring dashboards
Access Points:
- Main Service: http://localhost:3001 (behind proxy)
- Grafana Dashboard: http://localhost:3000
- Prometheus: http://localhost:9090 (localhost only)
VPS Deployment
For VPS deployment with SSL, monitoring, and production hardening:
wget https://raw.githubusercontent.com/your-repo/image-gen-mcp/main/deploy/vps-setup.sh
chmod +x vps-setup.sh
./vps-setup.sh
Features included:
- Docker containerization
- Nginx reverse proxy with SSL
- Automatic certificate management (Certbot)
- System monitoring and logging
- Firewall configuration
- Automatic backups
See VPS Deployment Guide for detailed instructions.
Docker Configuration
Available Docker Compose profiles:
docker-compose -f docker-compose.dev.yml up
docker-compose -f docker-compose.dev.yml --profile tools up
docker-compose -f docker-compose.dev.yml --profile stdio up
docker-compose -f docker-compose.prod.yml up -d
🔮 The Future of AI Integration
The Model Context Protocol represents a paradigm shift towards standardized AI tool integration. As more LLM clients adopt MCP support, servers like this one become increasingly valuable by providing universal capabilities across the entire ecosystem.
Current MCP Adoption
- ✅ Claude Desktop (Anthropic) - Full MCP support
- ✅ Continue.dev - VS Code extension with MCP integration
- ✅ Zed Editor - Built-in MCP support for coding workflows
- 🚀 Growing Ecosystem - New clients adopting MCP regularly
Vision
A future where AI capabilities are modular, interoperable, and user-controlled rather than locked to specific platforms.
🌟 Building the Universal AI Ecosystem
Democratizing advanced AI capabilities across all platforms through the power of the Model Context Protocol. One server, infinite possibilities.