# 📊 CSV Editor - AI-Powered CSV Processing via MCP

Transform how AI assistants work with CSV data. CSV Editor is a high-performance MCP server that gives Claude, ChatGPT, and other AI assistants powerful data-manipulation capabilities through simple commands.

## 🚀 Quick Start

### Installing via Smithery

To automatically install csv-editor for Claude Desktop via Smithery:

```shell
npx -y @smithery/cli install @santoshray02/csv-editor --client claude
```

### Fastest Installation (Recommended)

```shell
# Install uv, then clone the repo and run the server
curl -LsSf https://astral.sh/uv/install.sh | sh
git clone https://github.com/santoshray02/csv-editor.git
cd csv-editor
uv sync
uv run csv-editor
```
### Configure Your AI Assistant

#### Claude Desktop

Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS):

```json
{
  "mcpServers": {
    "csv-editor": {
      "command": "uv",
      "args": ["tool", "run", "csv-editor"],
      "env": {
        "CSV_MAX_FILE_SIZE": "1073741824"
      }
    }
  }
}
```

#### Other Clients (Continue, Cline, Windsurf, Zed)

See MCP_CONFIG.md for detailed configuration.
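The `CSV_MAX_FILE_SIZE` value above is expressed in bytes; a quick sanity check of the default:

```python
# 1073741824 bytes is exactly 1 GiB (1024^3)
max_bytes = 1024 ** 3
print(max_bytes)  # → 1073741824
```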
## ✨ Features

### The Problem

AI assistants struggle with complex data operations - they can read files but lack tools for filtering, transforming, analyzing, and validating CSV data efficiently.

### The Solution

CSV Editor bridges this gap by providing AI assistants with 40+ specialized tools for CSV operations, turning them into powerful data analysts that can:

- Clean messy datasets in seconds
- Perform complex statistical analysis
- Validate data quality automatically
- Transform data with natural language commands
- Track all changes with undo/redo capabilities

### Key Differentiators
| Feature | CSV Editor | Traditional Tools |
|---------|------------|-------------------|
| AI Integration | Native MCP protocol | Manual operations |
| Auto-Save | Automatic with strategies | Manual save required |
| History Tracking | Full undo/redo with snapshots | Limited or none |
| Session Management | Multi-user isolated sessions | Single user |
| Data Validation | Built-in quality scoring | Separate tools needed |
| Performance | Handles GB+ files with chunking | Memory limitations |
## 💻 Usage Examples

### Basic Usage

Ask your assistant in plain language:

- "Load the sales data and remove duplicates"
- "Filter for Q4 2024 transactions over $10,000"
- "Calculate correlation between price and quantity"
- "Fill missing values with the median"
- "Export as Excel with the analysis"
### Advanced Usage

#### 📊 Data Analyst Workflow

```python
# Load, clean, and profile a sales dataset
session_id = load_csv("daily_sales.csv")
remove_duplicates(session_id)
change_column_type("date", "datetime")
fill_missing_values(strategy="median", columns=["revenue"])
get_statistics(columns=["revenue", "quantity"])
detect_outliers(method="iqr", threshold=1.5)
get_correlation_matrix(min_correlation=0.5)
export_csv(format="excel", file_path="clean_sales.xlsx")
```
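The IQR rule used by `detect_outliers(method="iqr")` flags values outside the fences Q1 − threshold·IQR and Q3 + threshold·IQR. A minimal pandas sketch of that rule (an illustration, not the server's actual implementation):

```python
import pandas as pd

def iqr_outliers(s: pd.Series, threshold: float = 1.5) -> pd.Series:
    """Return a boolean mask marking values outside the IQR fences."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return (s < q1 - threshold * iqr) | (s > q3 + threshold * iqr)

revenue = pd.Series([100, 102, 98, 101, 99, 500])  # 500 is an obvious outlier
print(iqr_outliers(revenue).tolist())  # → [False, False, False, False, False, True]
```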
#### 🏭 ETL Pipeline

```python
# Ingest remote data, filter, derive a quarter column, aggregate, and export
load_csv_from_url("https://api.example.com/data.csv")
filter_rows(conditions=[
    {"column": "status", "operator": "==", "value": "active"},
    {"column": "amount", "operator": ">", "value": 1000}
])
add_column(name="quarter", formula="Q{(month-1)//3 + 1}")
group_by_aggregate(group_by=["quarter"], aggregations={
    "amount": ["sum", "mean"],
    "customer_id": "count"
})
export_csv(format="parquet")
export_csv(format="json")
```
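The quarter formula above maps a month number to its quarter via integer division; in plain Python the expression behaves like this:

```python
def quarter(month: int) -> str:
    # Integer division maps months 1-3 → Q1, 4-6 → Q2, and so on
    return f"Q{(month - 1) // 3 + 1}"

print([quarter(m) for m in (1, 3, 4, 12)])  # → ['Q1', 'Q1', 'Q2', 'Q4']
```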
#### 🔍 Data Quality Assurance

```python
# Validate the schema, then profile quality and anomalies
validate_schema(schema={
    "customer_id": {"type": "integer", "required": True},
    "email": {"type": "string", "pattern": r"^[^@]+@[^@]+\.[^@]+$"},
    "age": {"type": "integer", "min": 0, "max": 120}
})
quality_report = check_data_quality()
anomalies = find_anomalies(methods=["statistical", "pattern"])
```
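The email pattern in the schema above can be exercised directly with Python's `re` module; a small sketch of the same check (the server's validation logic may differ):

```python
import re

# Same pattern as in the schema: one or more non-@ chars, an @,
# then a domain containing at least one dot
EMAIL_RE = re.compile(r"^[^@]+@[^@]+\.[^@]+$")

emails = ["alice@example.com", "not-an-email", "bob@site"]
valid = [bool(EMAIL_RE.match(e)) for e in emails]
print(valid)  # → [True, False, False]
```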
## 📚 Documentation

### Available Tools (40+)

#### I/O Operations

- `load_csv` - Load from a file
- `load_csv_from_url` - Load from a URL
- `load_csv_from_content` - Load from a string
- `export_csv` - Export to various formats
- `get_session_info` - Session details
- `list_sessions` - List active sessions
- `close_session` - Clean up a session

#### Data Manipulation

- `filter_rows` - Complex filtering
- `sort_data` - Multi-column sort
- `select_columns` - Column selection
- `rename_columns` - Rename columns
- `add_column` - Add computed columns
- `remove_columns` - Remove columns
- `update_column` - Update values
- `change_column_type` - Type conversion
- `fill_missing_values` - Handle nulls
- `remove_duplicates` - Deduplicate rows

#### Analysis

- `get_statistics` - Statistical summary
- `get_column_statistics` - Per-column stats
- `get_correlation_matrix` - Correlations
- `group_by_aggregate` - Grouped aggregations
- `get_value_counts` - Frequency counts
- `detect_outliers` - Find outliers
- `profile_data` - Data profiling

#### Validation

- `validate_schema` - Schema validation
- `check_data_quality` - Quality metrics
- `find_anomalies` - Anomaly detection

#### Auto-Save & History

- `configure_auto_save` - Set up auto-save
- `get_auto_save_status` - Check auto-save status
- `undo` / `redo` - Navigate history
- `get_history` - View past operations
- `restore_to_operation` - Roll back to any point in history
### Configuration

#### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `CSV_MAX_FILE_SIZE` | 1 GB | Maximum file size |
| `CSV_SESSION_TIMEOUT` | 3600 s | Session timeout |
| `CSV_CHUNK_SIZE` | 10000 | Processing chunk size |
| `CSV_AUTO_SAVE` | `true` | Enable auto-save |
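These can be set in the shell before launching the server (variable names are from the table above; the values here are only examples):

```shell
export CSV_MAX_FILE_SIZE=2147483648   # raise the limit to 2 GiB
export CSV_SESSION_TIMEOUT=7200       # two-hour sessions
uv run csv-editor
```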
#### Auto-Save Strategies

CSV Editor automatically saves your work using a configurable strategy:

- **Overwrite** (default) - Update the original file in place
- **Backup** - Create timestamped backups
- **Versioned** - Maintain a version history
- **Custom** - Save to a specified location

```python
configure_auto_save(
    strategy="backup",
    backup_dir="/backups",
    max_backups=10
)
```
### Advanced Installation Options

#### Using pip

```shell
git clone https://github.com/santoshray02/csv-editor.git
cd csv-editor
pip install -e .
```

#### Using pipx (global)

```shell
pipx install git+https://github.com/santoshray02/csv-editor.git
```

#### From GitHub

```shell
# With pip
pip install git+https://github.com/santoshray02/csv-editor.git
# With uv
uv pip install git+https://github.com/santoshray02/csv-editor.git
# Pin a specific release
pip install git+https://github.com/santoshray02/csv-editor.git@v1.0.1
```
## 🔧 Technical Details

### Development

#### Running Tests

```shell
uv run test        # run the test suite
uv run test-cov    # run tests with coverage
uv run all-checks  # run the full check suite
```

#### Project Structure

```
csv-editor/
├── src/csv_editor/      # Core implementation
│   ├── tools/           # MCP tool implementations
│   ├── models/          # Data models
│   └── server.py        # MCP server
├── tests/               # Test suite
├── examples/            # Usage examples
└── docs/                # Documentation
```
## 🤝 Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

### Quick Contribution Guide

1. Fork the repository
2. Create a feature branch
3. Make your changes, with tests
4. Run `uv run all-checks`
5. Submit a pull request
## 🗺️ Roadmap
- [ ] SQL query interface
- [ ] Real-time collaboration
- [ ] Advanced visualizations
- [ ] Machine learning integrations
- [ ] Cloud storage support
- [ ] Performance optimizations for 10GB+ files
## 💬 Support

## 📄 License

MIT License - see the LICENSE file.

## 🙏 Acknowledgments

Built with:

- FastMCP - Fast Model Context Protocol
- Pandas - Data manipulation
- NumPy - Numerical computing

Ready to supercharge your AI's data capabilities? Get started in 2 minutes →