🚀 VideoCutter User Guide
VideoCutter is a multimedia processing tool that combines video, audio, and image processing modules. It integrates AI models and supports the Model Context Protocol (MCP), so AI agents can invoke its functions from natural-language requests. MCP is available in both SSE and HTTP Streamable modes, making VideoCutter a one-stop editing solution for content creation such as short videos.
✨ Features
- 🎯 One-stop Processing: Integrates video, audio, and image processing modules to meet all media editing needs.
- ⚡ High-performance Processing: Supports hardware acceleration, significantly improving processing speed and efficiency.
- 🤖 AI Intelligent Optimization: Includes multiple built-in AI models that provide text generation, image generation, and video generation capabilities.
- 🎨 AI Creation Tools: Supports AI creation functions such as text-to-image, text-to-video, and image-to-video conversion.
- 🧠 Intelligent Processing: Provides AI-assisted functions such as intelligent speed change, intelligent scene detection, speech recognition, and subtitle extraction.
- 🤖 MCP Intelligent Agent: Supports AI agents with natural-language invocation and intelligent workflow capabilities, in both SSE and HTTP Streamable modes.
- 🔌 Multiple Interface Support: Provides REST API and MCP protocol, supporting various integration methods.
- 📱 Cross-platform Compatibility: Supports mainstream operating systems such as Windows, macOS, and Linux.
- 🎯 Precise Positioning: Supports an 81-grid positioning system, providing pixel-level control.
- 📦 Batch Processing: Supports efficient batch operations such as batch image overlay and text overlay.
Whether you are an individual creator, a content production team, or an enterprise user, you can easily complete complex media processing tasks with VideoCutter.
📞 Contact the Author
The author provides one-stop deployment, installation, and activation services.
GitHub: https://github.com/daimaxiuligong/VideoCutter
Gitee: https://gitee.com/daimaxiuligong/VideoCutter
🤖 MCP Intelligent Agent Support
VideoCutter deeply integrates the Model Context Protocol (MCP), providing powerful media processing capabilities for AI agents.
MCP Transmission Modes
- SSE Mode: Server-Sent Events mode, supporting real-time streaming data transmission.
  - Server address: http://localhost:8000/mcp/sse
  - Features: One-way real-time push, suitable for progress monitoring and status updates.
  - Application scenarios: Real-time feedback for long-running processing tasks.
- HTTP Streamable Mode: HTTP streaming mode, supporting two-way streaming communication.
  - Server address: http://localhost:8001/mcp/streamable
  - Features: Two-way streaming communication, supporting real-time interaction.
  - Application scenarios: Complex workflows that require real-time interaction.
AI Agent Capabilities
Through the MCP protocol, AI agents can:
- Natural-Language Invocation: Describe requirements in natural language, and AI agents automatically call the corresponding media processing functions.
- Intelligent Workflow: AI agents can combine multiple processing steps to create complex media processing workflows.
- Real-time Collaboration: Support real-time collaboration between AI agents and users, adjusting processing strategies based on user feedback.
- Context Understanding: AI agents can understand the context of media content and provide more accurate processing suggestions.
- Automated Creation: From content planning to final output, AI agents can handle the entire process automatically.
- Streamed Response: Support real-time progress feedback and result streaming, enhancing the user experience.
🔗 Multi-interface Ecosystem
VideoCutter has built a complete multi-interface ecosystem:
- REST API: Provides standardized HTTP interfaces for traditional applications and web services.
- MCP Protocol: Provides specialized protocol support for AI agents and AI applications.
  - SSE Mode: Real-time streaming data transmission, suitable for progress monitoring.
  - HTTP Streamable Mode: Two-way streaming communication, supporting real-time interaction.
- Local Deployment: Supports local deployment of models, protecting data privacy.
- Cloud Service: Supports cloud AI services such as Doubao and SiliconFlow, providing powerful computing power.
- Plugin Extension: Supports third-party plugins and custom function extensions.
💡 Product Highlights
1. Powerful Video Processing Capabilities
Basic Editing Functions
- Video Splitting: Precise video splitting down to milliseconds, supporting specified time ranges.
- Video Merging: Intelligent merging of multiple video files, automatically handling format compatibility.
- Video Speed Change: Speed adjustment from 0.1x to 16x while maintaining audio-video synchronization.
- Video Reverse Playback: Complete reverse playback of the timeline.
- Video Rotation: Rotation at any angle, automatically adjusting the output size.
- Video Cropping: Precise pixel-level area cropping.
- Video Scaling: Intelligent size adjustment, maintaining the aspect ratio.
- Video Padding: Adding borders and padding effects to videos.
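One common way to keep audio and video in sync during a speed change is to scale the video timestamps and the audio tempo by the same factor. The sketch below assumes an FFmpeg-style filter backend, which this guide does not confirm; `atempo_chain` and `build_speed_filters` are hypothetical helper names. A single `atempo` stage is conventionally kept between 0.5 and 2.0, so larger or smaller factors are split into a chain of stages.

```python
MIN_SPEED, MAX_SPEED = 0.1, 16.0  # speed range stated in this guide


def atempo_chain(speed: float) -> list[float]:
    """Split a speed factor into stages that each lie in [0.5, 2.0]."""
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    stages = []
    remaining = speed
    while remaining > 2.0:  # peel off 2x stages for fast playback
        stages.append(2.0)
        remaining /= 2.0
    while remaining < 0.5:  # peel off 0.5x stages for slow playback
        stages.append(0.5)
        remaining /= 0.5
    stages.append(remaining)
    return stages


def build_speed_filters(speed: float) -> tuple[str, str]:
    """Return (video_filter, audio_filter) strings in FFmpeg filter syntax."""
    video = f"setpts=PTS/{speed:g}"  # compress or stretch video timestamps
    audio = ",".join(f"atempo={stage:g}" for stage in atempo_chain(speed))
    return video, audio


print(build_speed_filters(4.0))
```

Because both filters are derived from the same factor, the audio and video durations change by the same ratio.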
Video Special Effects Functions
- Video Filters: Various artistic filters such as black and white, sepia, vintage, and blur.
- Color Adjustment: Fine adjustment of brightness, contrast, saturation, and gamma value.
- Video Sharpening: Enhancing image details and clarity.
- Mosaic Processing: Adding mosaic effects to specified areas.
- Intelligent Speed Change: Intelligent accelerated playback based on content similarity.
- Intelligent Scene Detection: Automatically identifying video scene change points.
Overlay and Composition Functions
- Video Overlay: Overlaying another video on the main video.
- Image Overlay: Overlaying static images on videos, supporting 81-grid precise positioning.
- Text Overlay: Adding text watermarks and subtitles to videos, supporting multiple fonts and effects.
- Audio Overlay: Overlaying audio tracks on videos.
- Audio-Video Separation: Separating audio and video tracks in videos.
- Batch Overlay: Supporting batch addition of image and text watermarks through command files.
Format Conversion Functions
- Video to GIF: Converting videos to GIF animations.
- Video Frame Extraction: Extracting single frames from specified time points.
2. Professional Audio Processing
Basic Audio Editing
- Audio Splitting: Millisecond-level audio splitting, supporting specified time ranges.
- Audio Merging: Seamless merging of multiple audio files.
- Audio Speed Change: Speed change processing while maintaining audio quality.
- Audio Reverse Playback: Complete reverse playback of the timeline.
- Volume Adjustment: Precise volume control and normalization.
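Millisecond-level splitting means time ranges have to be expressed with millisecond precision. The following is a minimal sketch of converting millisecond offsets into `HH:MM:SS.mmm` timestamps and validating a range. The function names are hypothetical, and the time format that VideoCutter's own parameters expect is not specified in this guide.

```python
def ms_to_timestamp(ms: int) -> str:
    """Format a millisecond offset as HH:MM:SS.mmm."""
    if ms < 0:
        raise ValueError("offset must not be negative")
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def split_range(start_ms: int, end_ms: int) -> tuple[str, str, int]:
    """Return (start, end, duration_ms) for a split, rejecting empty ranges."""
    if end_ms <= start_ms:
        raise ValueError("end must be after start")
    return ms_to_timestamp(start_ms), ms_to_timestamp(end_ms), end_ms - start_ms


print(split_range(1_500, 3_723_456))
```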
Audio Enhancement Effects
- Audio Normalization: Normalizing audio volume to a standard level.
- Fade In/Out: Adding smooth fade-in and fade-out effects to audio.
- Reverb Effect: Simulating acoustic effects of different spatial environments.
- Audio Compressor: Professional-level dynamic range compression.
- Voice Enhancement: Highlighting voices and improving clarity.
- Audio Mixing: Mixing multiple audio tracks into a single track.
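To make the normalization and fade effects above concrete, here is a generic illustration on a plain list of samples. It shows peak normalization and linear fades only; it is not VideoCutter's implementation, and the function names are hypothetical.

```python
def normalize_peak(samples: list[float], target_peak: float = 1.0) -> list[float]:
    """Scale samples so the loudest one reaches target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence stays silence
    gain = target_peak / peak
    return [s * gain for s in samples]


def fade_in(samples: list[float], fade_len: int) -> list[float]:
    """Apply a linear gain ramp from 0 to 1 over the first fade_len samples."""
    n = min(fade_len, len(samples))
    return [s * (i / n) if i < n else s for i, s in enumerate(samples)]


def fade_out(samples: list[float], fade_len: int) -> list[float]:
    """Apply a linear gain ramp from 1 to 0 over the last fade_len samples."""
    return fade_in(samples[::-1], fade_len)[::-1]


print(normalize_peak([0.25, -0.5]))
print(fade_out([1.0, 1.0, 1.0, 1.0], 2))
```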
Advanced Functions
- Audio Looping: Creating looped audio.
- Audio Format Conversion: Supporting mutual conversion of all mainstream audio formats.
- Subtitle Extraction: Automatically extracting subtitle text from audio.
- Text-to-Speech: Supporting CosyVoice pre-trained voice and voice cloning modes.
- Audio Information Retrieval: Obtaining detailed information about audio files.
3. Comprehensive Image Processing
Basic Image Editing
- Image Cropping: Precise pixel-level cropping control.
- Image Rotation: Rotation at any angle and mirror flipping.
- Image Scaling: Intelligent size adjustment while maintaining the aspect ratio.
- Image Flipping: Horizontal and vertical flipping.
- Brightness Adjustment: Precise brightness control.
- Contrast Adjustment: Enhancing the difference between light and dark.
- Saturation Adjustment: Controlling color saturation.
Image Special Effects
- Image Filters: Various effects such as black and white, vintage, blur, and sharpening.
- Noise Effect: Adding various types of noise effects.
- Vignette Effect: Creating a professional photography atmosphere.
- Image Sharpening: Enhancing image details and clarity.
- Mosaic Processing: Adding mosaic effects to specified areas.
Image Composition
- Image Overlay (Absolute Position): Overlaying images at specified coordinate positions.
- Image Overlay (Relative Position): Overlaying images using relative positions, supporting 81-grid precise positioning.
- Text Overlay (Absolute Position): Adding text at specified coordinate positions.
- Text Overlay (Relative Position): Adding text using relative positions, supporting multiple fonts and effects.
- Collage Creation: Multi-image collages and grid layouts.
- Batch Overlay: Supporting batch overlay of images and text through command files, improving processing efficiency.
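As an illustration of relative positioning, the sketch below maps a grid position to pixel coordinates. It rests on assumptions that this guide does not state: that the 81 positions form a 9 x 9 grid, that they are numbered 1 to 81 in row-major order, and that the overlay is centered in its cell. The authoritative definition is in the position parameter document; `grid_to_xy` is a hypothetical helper.

```python
GRID_SIZE = 9  # assumption: 81 positions = 9 columns x 9 rows


def grid_to_xy(index: int, frame_w: int, frame_h: int,
               overlay_w: int, overlay_h: int) -> tuple[int, int]:
    """Return the top-left pixel for an overlay centered in grid cell `index`."""
    if not 1 <= index <= GRID_SIZE * GRID_SIZE:
        raise ValueError("index must be between 1 and 81")
    row, col = divmod(index - 1, GRID_SIZE)  # assumed row-major numbering
    center_x = (col + 0.5) * frame_w / GRID_SIZE
    center_y = (row + 0.5) * frame_h / GRID_SIZE
    x = round(center_x - overlay_w / 2)
    y = round(center_y - overlay_h / 2)
    # Clamp so the overlay stays inside the frame.
    x = max(0, min(x, frame_w - overlay_w))
    y = max(0, min(y, frame_h - overlay_h))
    return x, y


print(grid_to_xy(41, 1920, 1080, 200, 100))  # center cell of the assumed grid
```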
Format Conversion Functions
- Image to Video: Converting static images to videos.
- Multiple Images to GIF: Combining multiple images into GIF animations.
- Image Format Conversion: Supporting mutual conversion of all mainstream image formats.
- Watermark Removal: Intelligent removal of watermarks from images.
- Beauty Enhancement: Simple face beauty effects.
- Image Thumbnail Generation: Generating thumbnails of specified sizes.
- Image Information Retrieval: Obtaining detailed information about image files.
4. Powerful AI Intelligent Functions
AI Model Services
- Multi-model Support: Integrates mainstream AI service providers such as Ollama, Doubao, and SiliconFlow.
- Local Deployment: Supports local deployment of Ollama models, protecting data privacy.
- Cloud Service: Supports cloud AI services such as Doubao and SiliconFlow, providing powerful computing power.
- Flexible Configuration: Can enable or disable different AI service providers according to needs.
- MCP Integration: Provides 67 professional tools for AI agents through the MCP protocol.
Text Generation Functions
- Intelligent Text Generation: Generate high-quality text content from prompts.
- Multi-language Support: Support text generation in multiple languages such as Chinese and English.
- Parameter Adjustment: Support fine adjustment of parameters such as temperature and maximum length.
- Segmented Content Generation: Automatically generate video segment descriptions and corresponding subtitle text.
Image Generation Functions
- Text-to-Image: Generate high-quality images based on text descriptions.
- Multi-resolution Support: Support various resolutions from 512x512 to 2048x2048.
- Multi-aspect Ratio Support: Support various aspect ratios such as 1:1, 4:3, 16:9, and 9:16.
- Artistic Styles: Support multiple artistic and creative styles.
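The resolution range and aspect ratios above can be combined into concrete pixel dimensions. The sketch below reads the stated range as a per-side bound of 512 to 2048 pixels and snaps each side to a multiple of 8; both are assumptions, as is the helper name `image_size`. The values a given provider accepts may differ.

```python
MIN_SIDE, MAX_SIDE = 512, 2048  # per-side bounds, read from the range in this guide


def image_size(aspect: str, long_side: int = 1024, multiple: int = 8) -> tuple[int, int]:
    """Return (width, height) for an aspect ratio such as "16:9"."""
    w_ratio, h_ratio = (int(part) for part in aspect.split(":"))
    if w_ratio <= 0 or h_ratio <= 0:
        raise ValueError("aspect ratio parts must be positive")
    scale = long_side / max(w_ratio, h_ratio)
    # Snap both sides to the nearest multiple (an assumed model constraint).
    width = round(w_ratio * scale / multiple) * multiple
    height = round(h_ratio * scale / multiple) * multiple
    for side in (width, height):
        if not MIN_SIDE <= side <= MAX_SIDE:
            raise ValueError(f"{width}x{height} is outside {MIN_SIDE}-{MAX_SIDE}")
    return width, height


for ratio in ("1:1", "4:3", "16:9"):
    print(ratio, image_size(ratio))
```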
Video Generation Functions
- Text-to-Video: Generate dynamic video content based on text descriptions.
- Image-to-Video: Convert static images to dynamic videos.
- Multi-resolution Support: Support various resolutions such as 480p, 720p, and 1080p.
- Duration Control: Support video duration adjustment from 3 to 12 seconds.
- Action Description: Control actions and changes in videos through text descriptions.
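A client can check the limits above before submitting a generation request. This is a minimal sketch: the class and field names are hypothetical, and the resolution set lists only the three values this guide names as examples, so the real service may accept others.

```python
from dataclasses import dataclass

ALLOWED_RESOLUTIONS = {"480p", "720p", "1080p"}  # examples named in this guide
MIN_DURATION_S, MAX_DURATION_S = 3, 12  # duration range stated in this guide


@dataclass(frozen=True)
class VideoGenRequest:
    """Hypothetical request object validated on construction."""
    prompt: str
    resolution: str = "720p"
    duration_s: int = 5

    def __post_init__(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"unsupported resolution: {self.resolution}")
        if not MIN_DURATION_S <= self.duration_s <= MAX_DURATION_S:
            raise ValueError("duration must be between 3 and 12 seconds")


print(VideoGenRequest(prompt="a cat walking on a beach", duration_s=8))
```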
Intelligent Processing Functions
- Intelligent Speed Change: Automatically detect and accelerate repeated segments based on content similarity.
- Intelligent Scene Detection: Automatically identify video scene change points for precise editing.
- Speech Recognition: Automatically extract subtitle text from videos.
- Voice Enhancement: Intelligent enhancement of the voice part in audio.
- Audio Noise Reduction: Automatically remove background noise from audio.
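Scene detection is often explained as thresholding the difference between consecutive frames. The sketch below shows that generic idea on tiny grayscale frames represented as lists of pixel values. VideoCutter's own detector is not described in this guide and may work differently; the function names and the threshold value are illustrative.

```python
def mean_abs_diff(frame_a: list[int], frame_b: list[int]) -> float:
    """Average absolute pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def detect_scene_changes(frames: list[list[int]], threshold: float = 30.0) -> list[int]:
    """Return the indices of frames that start a new scene."""
    return [
        i
        for i in range(1, len(frames))
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold
    ]


frames = [
    [10, 10, 10, 10],
    [12, 11, 10, 9],       # nearly identical to the previous frame
    [200, 210, 190, 205],  # abrupt change: a new scene starts here
    [201, 209, 191, 204],
]
print(detect_scene_changes(frames))
```

The same difference measure could also flag runs of near-identical frames, which is the kind of signal a similarity-based speed change would need.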
AI-assisted Creation
- Content Planning: AI helps plan the structure and segments of video content.
- Subtitle Generation: Automatically generate subtitle text that matches video content.
- Creative Suggestions: Provide creative inspiration and suggestions based on themes.
- Quality Optimization: AI-assisted optimization of video, audio, and image quality.
MCP Intelligent Agent Integration
- Natural Language Interaction: Interact with AI agents through natural language to complete complex media processing tasks.
- Intelligent Workflow: AI agents can automatically combine multiple processing steps to create end-to-end processing flows.
- Context Awareness: AI agents can understand the context of media content and provide more accurate processing suggestions.
- Real-time Collaboration: Support real-time collaboration between AI agents and users, adjusting processing strategies based on feedback.
- Automated Creation: From creative conception to final output, AI agents can handle the entire process automatically.
- Toolchain Integration: AI agents can call all 67 professional tools of VideoCutter to achieve complex tasks.
5. Efficient Batch Processing Functions
Batch Image Overlay
- Command File Support: Define batch overlay commands through TXT files.
- Flexible Command Format: Support two command formats: image overlay and text overlay.
- Parameterized Configuration: Support custom configuration of parameters such as position, transparency, scaling, and font.
- Intelligent Command Recognition: Automatically recognize image and text commands without manual specification.
- Batch Execution: Process multiple overlay operations at once, significantly improving efficiency.
Batch Text Overlay
- Multi-font Support: Support system fonts and custom font files.
- Rich Text Effects: Support various text effects such as shadows, strokes, and glows.
- Precise Positioning: Support 81-grid precise positioning system for pixel-level precise control.
- Parameterized Configuration: Support custom configuration of parameters such as font size, color, and transparency.
Advantages of Batch Processing
- Efficient Processing: Batch operations are several times more efficient than single operations.
- Command Reuse: Command files can be saved and reused.
- Error Handling: A single command failure does not affect the overall processing flow.
- Flexible Configuration: Support default values and parameter overrides to adapt to different scenario requirements.
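The error-handling behavior described above, where one failed command does not stop the rest, can be sketched as a loop that records failures and keeps going. The command syntax is not defined in this guide, so commands are treated as opaque lines here; `run_batch`, the handler, and the skipping of blank and `#` lines are all assumptions.

```python
from typing import Callable


def run_batch(commands: list[str], handler: Callable[[str], None]) -> dict:
    """Run every command, collecting failures instead of aborting."""
    succeeded, failed = [], []
    for line_no, command in enumerate(commands, start=1):
        command = command.strip()
        if not command or command.startswith("#"):
            continue  # assumed convention: skip blank lines and comments
        try:
            handler(command)
            succeeded.append(line_no)
        except Exception as exc:  # isolate the failure to this command
            failed.append((line_no, str(exc)))
    return {"succeeded": succeeded, "failed": failed}


def demo_handler(command: str) -> None:
    """Stand-in for a real overlay operation; fails on one marked command."""
    if command.startswith("bad"):
        raise RuntimeError(f"cannot process: {command}")


print(run_batch(["ok one", "", "bad two", "ok three"], demo_handler))
```

Returning the line numbers of failed commands makes it easy to correct the command file and rerun only what failed.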
🔌 Interface Integration Guide
1. REST API Interface
VideoCutter provides a complete REST API, so its processing functions can be called directly through HTTP requests.
Service Information
- API Service Address: http://localhost:8900
- Interactive Documentation: http://localhost:8900/docs
- ReDoc Documentation: http://localhost:8900/redoc
- Health Check: http://localhost:8900/health
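Before sending work to the service, a client can check that it is reachable. The sketch below uses only the Python standard library and the addresses listed above. It assumes that a healthy service answers the health check with HTTP 200, which this guide does not state; `endpoint` and `check_health` are hypothetical helper names.

```python
import urllib.request
from urllib.parse import urljoin

BASE_URL = "http://localhost:8900"  # API service address from this guide


def endpoint(path: str, base_url: str = BASE_URL) -> str:
    """Join a path onto the base URL without doubling slashes."""
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))


def check_health(base_url: str = BASE_URL, timeout: float = 3.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200 (assumed)."""
    try:
        with urllib.request.urlopen(endpoint("health", base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers refused connections, timeouts, and HTTP errors
        return False


print(endpoint("docs"))
```

Calling `check_health()` returns `False` rather than raising when the service is not running, so it can be used as a simple startup gate.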
Interface Features
- Standardized Design: Follows RESTful API design specifications.
- Unified Response Format: All interfaces return a unified JSON format.
- File Upload Support: Supports multipart/form-data file uploads.
- Parameter Validation: Complete request parameter validation and error handling.
- AI Model Integration: Built-in AI model APIs, supporting text generation, image generation, and video generation.
AI Model APIs
- Text Generation: Supports text generation models from providers such as Ollama, Doubao, and SiliconFlow.
- Image Generation: Supports text-to-image generation, with multiple resolutions and artistic styles.
- Video Generation: Supports text-to-video and image-to-video generation.
- Segmented Content Generation: AI-assisted generation of video segment descriptions and subtitles.
Detailed Documentation
For the complete API interface documentation, please refer to: VideoCutter_API User Guide.md
2. MCP Protocol Interface
The Model Context Protocol (MCP) allows AI models to directly call various functions of VideoCutter. It supports two transmission modes to meet different application scenario requirements.
Service Information
- SSE Server: http://localhost:8000/mcp/sse
- HTTP Streamable Server: http://localhost:8001/mcp/streamable
Transmission Mode Features
SSE Mode (Server-Sent Events)
- Features: One-way real-time push, where the server actively sends data to the client.
- Advantages: Low latency, easy to use, suitable for progress monitoring.
- Application Scenarios: Real-time feedback and status updates for long-running processing tasks.
- Technical Features: Based on HTTP long connections, with an automatic reconnection mechanism.
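To show what a client receives in SSE mode, here is a simplified parser for the Server-Sent Events text format. It handles only the `event` and `data` fields and comment lines. The `progress` event name and its JSON payload are made up for illustration; the events VideoCutter actually emits are not listed in this guide.

```python
def parse_sse(stream_text: str) -> list[dict]:
    """Parse SSE text into a list of {"event", "data"} dictionaries."""
    events = []
    event_type, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":  # a blank line ends the current event
            if data_lines:
                events.append({"event": event_type, "data": "\n".join(data_lines)})
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):  # comment line, often used as a keep-alive
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format drops one leading space
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
    return events


sample = 'event: progress\ndata: {"percent": 50}\n\n: keep-alive\ndata: line one\ndata: line two\n\n'
print(parse_sse(sample))
```

An event without an `event` field defaults to the type `message`, and an event that is not terminated by a blank line is discarded.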
HTTP Streamable Mode
- Features: Two-way streaming communication, supporting real-time interaction between the client and the server.
- Advantages: Supports complex interactions, real-time collaboration, and dynamic adjustment.
- Application Scenarios: Complex workflows that require real-time interaction, AI agent collaboration.
- Technical Features: Based on HTTP/2 streaming, supporting concurrent processing.
Protocol Features
- AI-friendly: Designed specifically for AI model integration, supporting natural-language invocation.
- Streamed Response: Supports real-time progress feedback and result streaming.
- Dual Transmission Modes: Supports both SSE and HTTP Streamable transmission modes.
- Rich Tools: Provides 67 professional media processing tools.
- AI Tool Integration: Built-in AI model call tools, supporting text, image, and video generation.
AI Tool Support
- Text Generation Tools: Support text generation functions of multiple AI models.
- Image Generation Tools: Support text-to-image and image processing functions.
- Video Generation Tools: Support text-to-video and image-to-video functions.
- Intelligent Processing Tools: Support AI-assisted functions such as intelligent speed change and scene detection.
Detailed Documentation
For the complete MCP tool documentation, please refer to: VideoCutter_MCP User Guide.md
📚 Documentation Resources
- API User Guide: VideoCutter_API User Guide.md - Detailed REST API interface description.
- MCP User Guide: VideoCutter_MCP User Guide.md - Complete MCP tool usage guide.
- AI Model Usage Instructions: AI Model Usage Instructions.md - AI function configuration and usage guide.
- Position Parameter Details: VideoCutter_Position Position Parameter Details.md - Detailed description of the 81-grid positioning system.
- User Guide: This document - Product introduction and integration guide.