🚀 Token Optimizer MCP
Intelligent token optimization through caching, compression, and smart tooling for Claude Code and Claude Desktop
🚀 Quick Start
Token Optimizer MCP is a Model Context Protocol (MCP) server that reduces context window usage by 60 - 90% through intelligent caching, compression, and smart tool replacements. By storing compressed content externally in SQLite and providing optimized alternatives to standard tools, the server helps you maximize your available context window.
Production Results: Achieved 60 - 90% token reduction across 38,000+ operations in real - world usage.
✨ Features
- Smart Tool Replacements: Automatic optimization for Read, Grep, Glob, and more.
- Context Window Optimization: Store content externally to free up context space.
- High Compression: Brotli compression (2 - 4x typical, up to 82x for repetitive content).
- Persistent Caching: SQLite - based cache that persists across sessions.
- Accurate Token Counting: Uses tiktoken for precise token measurements.
- 61 Specialized Tools: File operations, API caching, database optimization, monitoring, and more.
- Zero External Dependencies: Completely offline operation.
- Production Ready: Built with TypeScript for reliability.
📦 Installation
Quick Install (Recommended)
Windows
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
npm install -g @ooples/token-optimizer-mcp
macOS / Linux
npm install -g @ooples/token-optimizer-mcp
That's it! The postinstall script will automatically:
- ✅ Install token-optimizer-mcp globally via npm.
- ✅ Auto-detect and configure all installed AI tools (Claude Desktop, Cursor, Cline, etc.).
- ✅ Set up automatic token optimization on every tool call.
- ✅ Configure workspace trust and execution permissions.
Result: 60 - 90% token reduction across all operations!
Note: If automatic installation is skipped (e.g., in CI environments), you can manually run the installer:
- Windows:
powershell -ExecutionPolicy Bypass -File install-hooks.ps1
- macOS/Linux:
bash install-hooks.sh
Manual Configuration
For detailed platform-specific installation instructions, see docs/HOOKS-INSTALLATION.md.
💻 Usage Examples
Basic Caching
const result = await optimize_text({
text: "Large API response or file content...",
key: "cache-key",
quality: 11
});
const cached = await get_cached({ key: "cache-key" });
Smart File Reading
await smart_read({ path: "/src/app.ts" });
await smart_read({ path: "/src/app.ts" });
API Caching
await smart_api_fetch({
method: "GET",
url: "https://api.example.com/data",
ttl: 300
});
await smart_api_fetch({
method: "GET",
url: "https://api.example.com/data"
});
Session Analysis
await get_session_stats({});
await analyze_project_tokens({
projectPath: "/path/to/project"
});
📚 Documentation
- Detailed Tool Reference - Complete documentation for all 61 tools.
- Installation Guide - Platform-specific installation instructions.
- Contributing Guide - Development setup and contribution guidelines.
🔧 Technical Details
Available Tools (65 Total)
Core Caching & Optimization (8 tools)
Click to expand
- optimize_text - Compress and cache text (primary tool for token reduction).
- get_cached - Retrieve previously cached text.
- compress_text - Compress text using Brotli.
- decompress_text - Decompress Brotli-compressed text.
- count_tokens - Count tokens using tiktoken (GPT - 4 tokenizer).
- analyze_optimization - Analyze text and get optimization recommendations.
- get_cache_stats - View cache hit rates and compression ratios.
- clear_cache - Clear all cached data.
Usage Example:
optimize_text({
text: "Large API response or file content...",
key: "api-response-key",
quality: 11
})
Smart File Operations (10 tools)
Click to expand
Optimized replacements for standard file tools with intelligent caching and diff - based updates:
- smart_read - Read files with 80% token reduction through caching and diffs.
- smart_write - Write files with verification and change tracking.
- smart_edit - Line-based file editing with diff-only output (90% reduction).
- smart_grep - Search file contents with match-only output (80% reduction).
- smart_glob - File pattern matching with path-only results (75% reduction).
- smart_diff - Git diffs with diff-only output (85% reduction).
- smart_branch - Git branch listing with structured JSON (60% reduction).
- smart_log - Git commit history with smart filtering (75% reduction).
- smart_merge - Git merge management with conflict analysis (80% reduction).
- smart_status - Git status with status-only output (70% reduction).
Usage Example:
smart_read({ path: "/path/to/file.ts" })
API & Database Operations (10 tools)
Click to expand
Intelligent caching and optimization for external data sources:
- smart_api_fetch - HTTP requests with caching and retry logic (83% reduction on cache hits).
- smart_cache_api - API response caching with TTL/ETag/event-based strategies.
- smart_database - Database queries with connection pooling and caching (83% reduction).
- smart_sql - SQL query analysis with optimization suggestions (83% reduction).
- smart_schema - Database schema analysis with intelligent caching.
- smart_graphql - GraphQL query optimization with complexity analysis (83% reduction).
- smart_rest - REST API analysis with endpoint discovery (83% reduction).
- smart_orm - ORM query optimization with N + 1 detection (83% reduction).
- smart_migration - Database migration tracking (83% reduction).
- smart_websocket - WebSocket connection management with message tracking.
Usage Example:
smart_api_fetch({
method: "GET",
url: "https://api.example.com/data",
ttl: 300
})
Build & Test Operations (10 tools)
Click to expand
Development workflow optimization with intelligent caching:
- smart_build - TypeScript builds with diff-based change detection.
- smart_test - Test execution with incremental test selection.
- smart_lint - ESLint with incremental analysis and auto-fix.
- smart_typecheck - TypeScript type checking with caching.
- smart_install - Package installation with dependency analysis.
- smart_docker - Docker operations with layer analysis.
- smart_logs - Log aggregation with pattern filtering.
- smart_network - Network diagnostics with anomaly detection.
- smart_processes - Process monitoring with resource tracking.
- smart_system_metrics - System resource monitoring with performance recommendations.
Usage Example:
smart_test({
onlyChanged: true,
coverage: true
})
Advanced Caching (10 tools)
Click to expand
Enterprise-grade caching strategies with 87 - 92% token reduction:
- smart_cache - Multi-tier cache (L1/L2/L3) with 6 eviction strategies (90% reduction).
- cache_warmup - Intelligent cache pre-warming with schedule support (87% reduction).
- cache_analytics - Real-time dashboards and trend analysis (88% reduction).
- cache_benchmark - Performance testing and strategy comparison (89% reduction).
- cache_compression - 6 compression algorithms with adaptive selection (89% reduction).
- cache_invalidation - Dependency tracking and pattern-based invalidation (88% reduction).
- cache_optimizer - ML-based recommendations and bottleneck detection (89% reduction).
- cache_partition - Sharding and consistent hashing (87% reduction).
- cache_replication - Distributed replication with conflict resolution (88% reduction).
- predictive_cache - ML-based predictive caching with ARIMA/LSTM (91% reduction).
Usage Example:
smart_cache({
operation: "configure",
evictionStrategy: "LRU",
l1MaxSize: 1000,
l2MaxSize: 10000
})
Monitoring & Dashboards (7 tools)
Click to expand
Comprehensive monitoring with 88 - 92% token reduction through intelligent caching:
- alert_manager - Multi-channel alerting (email, Slack, webhook) with routing (89% reduction).
- metric_collector - Time-series metrics with multi-source support (88% reduction).
- monitoring_integration - External platform integration (Prometheus, Grafana, Datadog) (87% reduction).
- custom_widget - Dashboard widgets with template caching (88% reduction).
- data_visualizer - Interactive visualizations with SVG optimization (92% reduction).
- health_monitor - System health checks with state compression (91% reduction).
- log_dashboard - Log analysis with pattern detection (90% reduction).
Usage Example:
alert_manager({
operation: "create-alert",
alertName: "high-cpu-usage",
channels: ["slack", "email"],
threshold: { type: "above", value: 80 }
})
System Operations (6 tools)
Click to expand
System-level operations with smart caching:
- smart_cron - Scheduled task management (cron/Windows Task Scheduler) (85% reduction).
- smart_user - User and permission management across platforms (86% reduction).
- smart_ast_grep - Structural code search with AST indexing (83% reduction).
- get_session_stats - Session-level token usage statistics.
- analyze_project_tokens - Project-wide token analysis and cost estimation.
- optimize_session - Compress large file operations from current session.
Usage Example:
get_session_stats({})
How It Works
Token Analytics (4 tools)
Click to expand
Granular token usage analytics for pinpointing optimization opportunities:
- get_hook_analytics - Token usage breakdown by hook phase (PreToolUse, PostToolUse, etc.).
- get_action_analytics - Token usage breakdown by tool/action (Read, Write, Grep, etc.).
- get_mcp_server_analytics - Token usage breakdown by MCP server (token-optimizer, filesystem, etc.).
- export_analytics - Export analytics data in JSON or CSV format with filtering.
Usage Example:
get_hook_analytics({
startDate: "2025-01-01T00:00:00Z",
endDate: "2025-12-31T23:59:59Z"
})
get_action_analytics({})
export_analytics({
format: "csv",
hookPhase: "PreToolUse"
})
Key Features:
- Per-hook phase tracking (PreToolUse, PostToolUse, SessionStart, etc.).
- Per-action tracking (Read, Write, count_tokens, etc.).
- Per-MCP-server tracking (token-optimizer, filesystem, GitHub, etc.).
- Date range filtering.
- JSON and CSV export.
- Persistent storage with SQLite.
- Zero performance impact (async batched writes).
Global Hooks System (7-Phase Optimization)
When global hooks are installed, token-optimizer-mcp runs automatically on every tool call:
┌─────────────────────────────────────────────────────────────┐
│ Phase 1: PreToolUse - Tool Replacement │
│ ├─ Read → smart_read (80% token reduction) │
│ ├─ Grep → smart_grep (80% token reduction) │
│ └─ Glob → smart_glob (75% token reduction) │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 2: Input Validation - Cache Lookups │
│ └─ get_cached checks if operation was already done │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 3: PostToolUse - Output Optimization │
│ ├─ optimize_text for large outputs │
│ └─ compress_text for repeated content │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 4: Session Tracking │
│ └─ Log all operations to operations-{sessionId}.csv │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 5: UserPromptSubmit - Prompt Optimization │
│ └─ Optimize user prompts before sending to API │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 6: PreCompact - Pre-Compaction Optimization │
│ └─ Optimize before Claude Code compacts the conversation │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Phase 7: Metrics & Reporting │
│ └─ Track token reduction metrics and generate reports │
└─────────────────────────────────────────────────────────────┘
Production Performance
Based on 38,000+ operations in real-world usage:
| Tool Category |
Avg Token Reduction |
Cache Hit Rate |
| File Operations |
60 - 90% |
>80% |
| API Responses |
83 - 95% |
>75% |
| Database Queries |
83 - 90% |
>70% |
| Build/Test Output |
70 - 85% |
>65% |
Per-Session Savings: 300K - 700K tokens (worth $0.90 - $2.10 at $3/M tokens).
Technology Stack
- Runtime: Node.js 20+
- Language: TypeScript
- Database: SQLite (better-sqlite3)
- Token Counting: tiktoken (GPT - 4 tokenizer)
- Compression: Brotli (built-in Node.js)
- Caching: Multi-tier LRU/LFU/FIFO caching
- Protocol: MCP SDK (@modelcontextprotocol/sdk)
Supported AI Tools
The automated installer detects and configures token-optimizer-mcp for:
- ✅ Claude Code - CLI with global hooks integration
- ✅ Claude Desktop - Native desktop application
- ✅ Cursor IDE - AI-first code editor
- ✅ Cline - VS Code extension (formerly Claude Dev)
- ✅ GitHub Copilot - VS Code with MCP support
- ✅ Windsurf IDE - AI-powered development environment
No manual configuration needed - the installer automatically detects and configures all installed tools!
Performance Characteristics
- Compression Ratio: 2 - 4x typical (up to 82x for repetitive content).
- Context Window Savings: 60 - 90% average across all operations.
- Cache Hit Rate: >80% in typical usage.
- Operation Overhead: <10ms for cache operations (optimized from 50 - 70ms).
- Compression Speed: ~1ms per KB of text.
- Hook Overhead: <10ms per operation (7x improvement from in-memory optimizations).
Performance Optimizations
The PowerShell hooks have been optimized to reduce overhead from 50 - 70ms to <10ms through:
- In-Memory Session State: Session data kept in memory instead of disk I/O on every operation.
- Batched Log Writes: Operation logs buffered and flushed every 5 seconds or 100 operations.
- Lazy Persistence: Disk writes only occur when necessary (session end, optimization, reports).
Environment Variables
Control hook behavior with these environment variables:
Performance Controls
-
TOKEN_OPTIMIZER_USE_FILE_SESSION (default: false)
- Set to
true to revert to file-based session tracking (legacy mode).
- Use if you encounter issues with in-memory session state.
- Example:
$env:TOKEN_OPTIMIZER_USE_FILE_SESSION = "true"
-
TOKEN_OPTIMIZER_SYNC_LOG_WRITES (default: false)
- Set to
true to disable batched log writes.
- Forces immediate writes to disk (slower but more resilient).
- Use for debugging or if logs are being lost.
- Example:
$env:TOKEN_OPTIMIZER_SYNC_LOG_WRITES = "true"
-
TOKEN_OPTIMIZER_DEBUG_LOGGING (default: true)
- Set to
false to disable DEBUG-level logging.
- Reduces log file size and improves performance.
- INFO/WARN/ERROR logs still written.
- Example:
$env:TOKEN_OPTIMIZER_DEBUG_LOGGING = "false"
Development Path
TOKEN_OPTIMIZER_DEV_PATH
- Path to local development installation.
- Automatically set to
~/source/repos/token-optimizer-mcp if not specified.
- Override for custom development paths.
- Example:
$env:TOKEN_OPTIMIZER_DEV_PATH = "C:\dev\token-optimizer-mcp"
Performance Impact: Using in-memory mode (default) provides a 7x improvement in hook overhead:
- Before: 50 - 70ms per hook operation.
- After: <10ms per hook operation.
- 85% reduction in hook latency.
Monitoring Token Savings
Real-Time Session Monitoring
To view your actual token SAVINGS, use the get_session_stats tool:
await get_session_stats({});
Output includes:
- Total tokens saved (this is the actual savings amount!)
- Token reduction percentage (e.g., "60% reduction")
- Cache hit rate and compression ratios
- Breakdown by tool (Read, Grep, Glob, etc.)
- Top 10 most optimized operations with before/after comparison
Example Output:
{
"sessionId": "abc-123",
"totalTokensSaved": 125430,
"tokenReductionPercent": 68.2,
"originalTokens": 184000,
"optimizedTokens": 58570,
"cacheHitRate": 72.0,
"byTool": {
"smart_read": { "saved": 45000, "percent": 80 },
"smart_grep": { "saved": 32000, "percent": 75 }
}
}
Session Tracking Files
All operations are automatically tracked in session data files:
Location: ~/.claude-global/hooks/data/current-session.txt
Format:
{
"sessionId": "abc-123",
"sessionStart": "20251031-082211",
"totalOperations": 1250,
"totalTokens": 184000,
"lastOptimized": 1698765432,
"savings": {
"totalTokensSaved": 125430,
"tokenReductionPercent": 68.2,
"originalTokens": 184000,
"optimizedTokens": 58570,
"cacheHitRate": 42.5,
"compressionRatio": 0.32,
"lastUpdated": "20251031-092500"
}
}
New in v1.x: The savings object is now automatically updated every 10 operations, eliminating the need to manually call get_session_stats() for real-time monitoring. This provides instant visibility into token optimization performance.
How it works:
- Every 10 operations, the PowerShell hooks automatically call
get_cache_stats() MCP tool.
- Savings metrics are calculated from cache performance data (compression ratio, original vs compressed sizes).
- The session file is atomically updated with the latest savings data.
- If the MCP call fails, the update is skipped gracefully without blocking operations.
Note: For detailed per-operation analysis, use get_session_stats(). The session file provides high-level aggregate metrics.
Project-Wide Analysis
Analyze token usage across your entire project:
await analyze_project_tokens({
projectPath: "/path/to/project"
});
Provides:
- Total token cost estimation.
- Largest files by token count.
- Optimization opportunities.
- Cost projections at current API rates.
Cache Performance
Monitor cache hit rates and storage efficiency:
await get_cache_stats({});
Metrics:
- Total entries.
- Cache hit rate (%).
- Average compression ratio.
- Total storage saved.
- Most frequently accessed keys.
Troubleshooting
Common Issues and Solutions
Issue: "Invalid or malformed JSON" in Claude Code Settings
Symptom: Claude Code shows "Invalid Settings" error after running install-hooks.
Cause: UTF - 8 BOM (Byte Order Mark) was added to settings.json files.
Solution: Upgrade to v3.0.2+ which fixes the BOM issue:
npm install -g @ooples/token-optimizer-mcp@latest
If you're already on v3.0.2+, manually remove the BOM:
$content = Get-Content "~/.claude/settings.json" -Raw
$content = $content -replace '^\xEF\xBB\xBF', ''
$content | Set-Content "~/.claude/settings.json" -Encoding utf8NoBOM
sed -i '1s/^\xEF\xBB\xBF//' ~/.claude/settings.json
sed -i '' '1s/^\xef\xbb\xbf//' ~/.claude/settings.json
Issue: Hooks Not Working After Installation
Symptom: Token optimization not occurring automatically.
Diagnosis:
-
Check if hooks are installed:
Get-Content ~/.claude/settings.json | ConvertFrom-Json | Select-Object -ExpandProperty hooks
cat ~/.claude/settings.json | jq .hooks
-
Verify dispatcher.ps1 exists:
Test-Path ~/.claude-global/hooks/dispatcher.ps1
[ -f ~/.claude-global/hooks/dispatcher.sh ] && echo "Exists" || echo "Missing"
Solution: Re-run the installer:
powershell -ExecutionPolicy Bypass -File install-hooks.ps1
bash install-hooks.sh
Issue: Low Cache Hit Rate (<50%)
Symptom: Session stats show cache hit rate below 50%.
Causes:
- Working with many new files (expected).
- Cache was recently cleared.
- TTL (time-to-live) is too short.
Solutions:
-
Warm up the cache before starting work:
await cache_warmup({
paths: ["/path/to/frequently/used/files"],
recursive: true
});
-
Increase TTL for stable APIs:
await smart_api_fetch({
url: "https://api.example.com/data",
ttl: 3600
});
-
Check cache size limits:
await smart_cache({
operation: "configure",
l1MaxSize: 2000,
l2MaxSize: 20000
});
Issue: High Memory Usage
Symptom: Node.js process using excessive memory.
Cause: Large cache in memory (L1/L2 tiers).
Solution: Configure cache limits:
await smart_cache({
operation: "configure",
evictionStrategy: "LRU",
l1MaxSize: 500,
l2MaxSize: 5000
});
Or clear the cache:
await clear_cache({});
Issue: Slow First-Time Operations
Symptom: Initial Read/Grep/Glob operations are slow.
Cause: Cache is empty, building indexes.
Solution: This is expected behavior. Subsequent operations will be 80 - 90% faster.
To pre-warm the cache:
await cache_warmup({
paths: ["/src", "/tests", "/docs"],
recursive: true,
schedule: "startup"
});
Issue: "Permission denied" Errors on Windows
Symptom: Cannot write to cache or log files.
Cause: PowerShell execution policy or file permissions.
Solution:
-
Set execution policy:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
-
Check file permissions:
icacls "$env:USERPROFILE\.token-optimizer"
-
Re-run installer as Administrator if needed
Issue: Cache Files Growing Too Large
Symptom: ~/.token-optimizer/cache.db is >1GB.
Cause: Caching very large files or many API responses.
Solution:
-
Clear old entries:
await clear_cache({ olderThan: 7 });
-
Reduce cache retention:
await smart_cache({
operation: "configure",
defaultTTL: 3600
});
-
Manually delete cache (nuclear option):
rm -rf ~/.token-optimizer/cache.db
Getting Help
If you encounter issues not covered here:
- Check the hook logs:
~/.claude-global/hooks/logs/dispatcher.log
- Check session data:
~/.claude-global/hooks/data/current-session.txt
- File an issue: GitHub Issues
- Include debug logs.
- Include your OS and Node.js version.
- Include the output of
get_session_stats
Limitations
- Small Text: Best for content >500 characters (cache overhead on small snippets).
- One-Time Content: No benefit for content that won't be referenced again.
- Cache Storage: Automatic cleanup after 7 days to prevent disk usage issues.
- Token Counting: Uses GPT - 4 tokenizer (approximation for Claude, but close enough).
📄 License
MIT License - see LICENSE for details
Author
Built for optimal Claude Code token efficiency by the ooples team.