🚀 CodeGraph CLI MCP Server
A high-performance CLI tool for managing MCP servers and indexing codebases with advanced architectural analysis capabilities.

🚀 Quick Start
1. Initialize a New Project
codegraph init
codegraph init --name my-project
2. Index Your Codebase
codegraph index .
codegraph index . --languages rust,python,typescript
RUST_LOG=info,codegraph_vector=debug codegraph index . --workers 10 --batch-size 256 --max-seq-len 512 --force
codegraph index . --watch
3. Start MCP Server
codegraph start stdio
codegraph start http --port 3000
codegraph start dual --port 3000
(Optional) Start with Local Embeddings
export CODEGRAPH_EMBEDDING_PROVIDER=local
export CODEGRAPH_LOCAL_MODEL=sentence-transformers/all-MiniLM-L6-v2
cargo run -p codegraph-api --features codegraph-vector/local-embeddings
4. Search Your Code
codegraph search "authentication handler"
codegraph search "fn authenticate" --search-type exact
codegraph search "function with async keyword" --search-type ast
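Search results can also be consumed programmatically. A minimal sketch, assuming codegraph is on PATH and the project is already indexed (the JSON field layout of --format json output is not documented here, so the loop just prints raw hits):
import json
import subprocess

# Run a semantic search and request machine-readable output
proc = subprocess.run(
    ["codegraph", "search", "authentication handler", "--format", "json", "--limit", "5"],
    capture_output=True, text=True, check=True,
)
for hit in json.loads(proc.stdout):
    print(hit)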
✨ Features
Core Features
- Project Indexing
  - Multi-language support (Rust, Python, JavaScript, TypeScript, Go, Java, C++)
  - Incremental indexing with file watching
  - Parallel processing with configurable workers
  - Smart caching for improved performance
- MCP Server Management
  - STDIO transport for direct communication
  - HTTP streaming with SSE support
  - Dual transport mode for maximum flexibility
  - Background daemon mode with PID management
- Code Search
  - Semantic search using embeddings
  - Exact match and fuzzy search
  - Regex and AST-based queries
  - Configurable similarity thresholds
- Architecture Analysis
  - Component relationship mapping
  - Dependency analysis
  - Code pattern detection
  - Architecture visualization support
📦 Installation
Method 1: Install from Source
git clone https://github.com/jakedismo/codegraph-cli-mcp.git
cd codegraph-cli-mcp
cargo build --release
cargo install --path crates/codegraph-mcp
codegraph --version
Enabling Local Embeddings (Optional)
If you want to use a local embedding model (Hugging Face) instead of remote providers:
- Build with the local embeddings feature for crates that use vector search (the API and/or CLI server):
cargo build -p codegraph-api --features codegraph-vector/local-embeddings
cargo build -p core-rag-mcp-server --features codegraph-vector/local-embeddings
- Set environment variables to switch the provider at runtime:
export CODEGRAPH_EMBEDDING_PROVIDER=local
export CODEGRAPH_LOCAL_MODEL=sentence-transformers/all-MiniLM-L6-v2
- Run as usual (the first run will download model files from Hugging Face and cache them locally):
cargo run -p codegraph-api --features codegraph-vector/local-embeddings
Model cache locations:
- Default Hugging Face cache: ~/.cache/huggingface (or $HF_HOME) via hf-hub
- You can pre-populate this cache to run offline after the first download
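One way to pre-populate the cache for offline use is the huggingface_hub Python package; this is just an assumption that any tool filling the standard HF cache works, and is not part of CodeGraph itself:
from huggingface_hub import snapshot_download

# Download the model into the standard Hugging Face cache (~/.cache/huggingface or $HF_HOME)
snapshot_download(repo_id="sentence-transformers/all-MiniLM-L6-v2")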
Method 2: Install Pre-built Binary
curl -L https://github.com/jakedismo/codegraph-cli-mcp/releases/latest/download/codegraph-$(uname -s)-$(uname -m).tar.gz | tar xz
sudo mv codegraph /usr/local/bin/
codegraph --version
Method 3: Using Cargo
cargo install codegraph-mcp
codegraph --version
💻 Usage Examples
Basic Usage
codegraph init --name my-project
codegraph index .
codegraph start stdio
codegraph search "authentication handler"
Advanced Usage
RUST_LOG=info,codegraph_vector=debug codegraph index . --workers 10 --batch-size 256 --max-seq-len 512 --force --languages rust,python,typescript
codegraph start http --port 3000
codegraph search "function with async keyword" --search-type ast
📚 Documentation
Overview
CodeGraph is a powerful CLI tool that combines MCP (Model Context Protocol) server management with sophisticated code analysis capabilities. It provides a unified interface for indexing projects, managing embeddings, and running MCP servers with multiple transport options. All you need now is an agent (or several) to build your very own deep code and project knowledge synthesizer system!
Key Capabilities
- 🔍 Advanced Code Analysis: Parse and analyze code across multiple languages using Tree-sitter
- 🚄 Dual Transport Support: Run MCP servers with STDIO, HTTP, or both simultaneously
- 🎯 Vector Search: Semantic code search using FAISS-powered vector embeddings
- 📊 Graph-Based Architecture: Navigate code relationships with RocksDB-backed graph storage
- ⚡ High Performance: Optimized for large codebases with parallel processing and batched embeddings
- 🔧 Flexible Configuration: Extensive configuration options for embedding models and performance tuning
RAW PERFORMANCE ✨✨✨
170K lines of Rust code parsed in 0.49 s and 21,024 embeddings generated in 3:24 min, on an M3 Pro (32 GB) running Qdrant/all-MiniLM-L6-v2-onnx on CPU with no Metal acceleration!
Parsing completed: 353/353 files, 169397 lines in 0.49s (714.5 files/s, 342852 lines/s)
[00:03:24] [########################################] 21024/21024 Embeddings complete
Architecture
CodeGraph System Architecture
┌─────────────────────────────────────────────────────┐
│                    CLI Interface                     │
│                   (codegraph CLI)                    │
└─────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────┐
│                     Core Engine                      │
│ ┌─────────────┐  ┌──────────────┐  ┌────────────┐   │
│ │   Parser    │  │ Graph Store  │  │   Vector   │   │
│ │(Tree-sitter)│  │  (RocksDB)   │  │   Search   │   │
│ └─────────────┘  └──────────────┘  │  (FAISS)   │   │
│                                    └────────────┘   │
└─────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────┐
│                  MCP Server Layer                    │
│ ┌─────────────┐  ┌──────────────┐  ┌────────────┐   │
│ │    STDIO    │  │     HTTP     │  │    Dual    │   │
│ │  Transport  │  │  Transport   │  │    Mode    │   │
│ └─────────────┘  └──────────────┘  └────────────┘   │
└─────────────────────────────────────────────────────┘
Embeddings with ONNX Runtime (macOS)
- Default provider: CPU EP. Works immediately with Homebrew onnxruntime.
- Optional CoreML EP: set CODEGRAPH_ONNX_EP=coreml to prefer CoreML when using an ONNX Runtime build that includes CoreML.
- Fallback: if CoreML EP initialization fails, CodeGraph logs a warning and falls back to CPU.
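To check whether an ONNX Runtime build exposes the CoreML execution provider, a quick check with the Python onnxruntime package works (assuming the installed package corresponds to the runtime build you intend to link against):
import onnxruntime as ort

# Lists execution providers compiled into this ONNX Runtime build,
# e.g. ['CoreMLExecutionProvider', 'CPUExecutionProvider']
print(ort.get_available_providers())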
How to use ONNX embeddings
CPU execution provider:
export CODEGRAPH_EMBEDDING_PROVIDER=onnx
export CODEGRAPH_ONNX_EP=cpu
export CODEGRAPH_LOCAL_MODEL=/path/to/model.onnx
CoreML execution provider:
export CODEGRAPH_EMBEDDING_PROVIDER=onnx
export CODEGRAPH_ONNX_EP=coreml
export CODEGRAPH_LOCAL_MODEL=/path/to/model.onnx
Install the CLI with ONNX support:
cargo install --path crates/codegraph-mcp --features "embeddings,codegraph-vector/onnx,faiss"
Notes
- ONNX Runtime on Apple platforms accelerates via CoreML, not Metal. If you need GPU acceleration on Apple Silicon, use CoreML where supported.
- Some models/operators may still run on CPU if CoreML doesn’t support them.
Enabling CoreML feature at build time
- The CoreML registration path is gated by the Cargo feature onnx-coreml in codegraph-vector.
- Build with:
cargo build -p codegraph-vector --features "onnx,onnx-coreml"
- In a full workspace build, enable it via your consuming crate's features or by adding --features codegraph-vector/onnx,codegraph-vector/onnx-coreml.
- You still need an ONNX Runtime library that was compiled with CoreML support; the feature only enables the registration call in our code.
Prerequisites
System Requirements
- Operating System: Linux, macOS, or Windows
- Rust: 1.75 or higher
- Memory: Minimum 4GB RAM (8GB recommended for large codebases)
- Disk Space: 1GB for installation + space for indexed data
Required Dependencies
macOS (Homebrew):
brew install cmake clang
Ubuntu/Debian:
sudo apt-get update
sudo apt-get install cmake clang libssl-dev pkg-config
Fedora/RHEL:
sudo dnf install cmake clang openssl-devel
Optional Dependencies
- FAISS (for vector search acceleration)
- Local Embeddings (Hugging Face + Candle, or ONNX Runtime with CoreML; runs on macOS Metal, CUDA, or CPU)
- Enables on-device embedding generation (no external API calls)
- Downloads models from HuggingFace Hub on first run and caches them locally
- Internet access required for the initial model download (or pre-populate cache)
- Default runs on CPU; advanced GPU backends (CUDA/Metal) require appropriate hardware and drivers
- CUDA (for GPU-accelerated embeddings)
- Git (for repository integration)
Performance Benchmarks
Run repeatable, end-to-end benchmarks that measure indexing speed (with local embeddings + FAISS), vector search latency, and graph traversal throughput.
Build with performance features
Pick one of the local embedding backends and enable FAISS:
ONNX backend (CoreML/CPU):
cargo install --path crates/codegraph-mcp --features "embeddings,codegraph-vector/onnx,faiss"
Local HF + Candle backend (CPU/Metal/CUDA):
cargo install --path crates/codegraph-mcp --features "embeddings-local,faiss"
Configure local embedding backend
ONNX (CoreML/CPU):
export CODEGRAPH_EMBEDDING_PROVIDER=onnx
export CODEGRAPH_ONNX_EP=coreml
export CODEGRAPH_LOCAL_MODEL=/path/to/model.onnx
Local HF + Candle (CPU/Metal/CUDA):
export CODEGRAPH_EMBEDDING_PROVIDER=local
export CODEGRAPH_LOCAL_MODEL=sentence-transformers/all-MiniLM-L6-v2
Run the benchmark
codegraph perf . \
--langs rust,ts,go \
--warmup 3 --trials 20 \
--batch-size 128 --device metal \
--clean --format json
What it measures
- Indexing: total time to parse -> embed -> build FAISS (global + shards)
- Embedding throughput: embeddings per second
- Vector search: latency (avg/p50/p95) across repeated queries
- Graph traversal: BFS depth=2 micro-benchmark
Sample output (numbers will vary by machine and codebase)
{
  "env": {
    "embedding_provider": "local",
    "device": "metal",
    "features": { "faiss": true, "embeddings": true }
  },
  "dataset": {
    "path": "/repo/large-project",
    "languages": ["rust", "ts", "go"],
    "files": 18234,
    "lines": 2583190
  },
  "indexing": {
    "total_seconds": 186.4,
    "embeddings": 53421,
    "throughput_embeddings_per_sec": 286.6
  },
  "vector_search": {
    "queries": 100,
    "latency_ms": { "avg": 18.7, "p50": 12.3, "p95": 32.9 }
  },
  "graph": {
    "bfs_depth": 2,
    "visited_nodes": 1000,
    "elapsed_ms": 41.8
  }
}
Tips for reproducibility
- Use --clean for cold-start numbers, and run a second time for warm-cache numbers.
- Close background processes that may compete for CPU/GPU.
- Pin versions: rustc --version, FAISS build, and the embedding model.
- Record the host: CPU/GPU, RAM, storage, OS version.
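A small automation sketch for the cold/warm comparison, assuming codegraph perf prints JSON on stdout with --format json; the keys read below mirror the sample output above and may differ across versions:
import json
import subprocess

def run_perf(clean: bool) -> dict:
    cmd = ["codegraph", "perf", ".", "--langs", "rust", "--format", "json"]
    if clean:
        cmd.append("--clean")  # cold-start run
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)

cold = run_perf(clean=True)
warm = run_perf(clean=False)

# Key names follow the sample output above; adjust if your version differs
for label, report in [("cold", cold), ("warm", warm)]:
    idx = report["indexing"]
    latency = report["vector_search"]["latency_ms"]
    print(f"{label}: {idx['total_seconds']} s indexing, "
          f"{idx['throughput_embeddings_per_sec']} emb/s, p95 search {latency['p95']} ms")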
CLI Commands
Global Options
codegraph [OPTIONS] <COMMAND>
Options:
-v, --verbose Enable verbose logging
--config <PATH> Configuration file path
-h, --help Print help
-V, --version Print version
Command Reference
init - Initialize CodeGraph Project
codegraph init [OPTIONS] [PATH]
Arguments:
[PATH] Project directory (default: current directory)
Options:
--name <NAME> Project name
--non-interactive Skip interactive setup
start - Start MCP Server
codegraph start <TRANSPORT> [OPTIONS]
Transports:
stdio STDIO transport (default)
http HTTP streaming transport
dual Both STDIO and HTTP
Options:
--config <PATH> Server configuration file
--daemon Run in background
--pid-file <PATH> PID file location
HTTP Options:
-h, --host <HOST> Host to bind (default: 127.0.0.1)
-p, --port <PORT> Port to bind (default: 3000)
--tls Enable TLS/HTTPS
--cert <PATH> TLS certificate file
--key <PATH> TLS key file
--cors Enable CORS
stop - Stop MCP Server
codegraph stop [OPTIONS]
Options:
--pid-file <PATH> PID file location
-f, --force Force stop without graceful shutdown
status - Check Server Status
codegraph status [OPTIONS]
Options:
--pid-file <PATH> PID file location
-d, --detailed Show detailed status information
index - Index a Project
codegraph index <PATH> [OPTIONS]
Arguments:
<PATH> Path to project directory
Options:
-l, --languages <LANGS> Languages to index (comma-separated)
--exclude <PATTERNS> Exclude patterns (gitignore format)
--include <PATTERNS> Include only these patterns
-r, --recursive Recursively index subdirectories
--force Force reindex
--watch Watch for changes
--workers <N> Number of parallel workers (default: 4)
search - Search Indexed Code
codegraph search <QUERY> [OPTIONS]
Arguments:
<QUERY> Search query
Options:
-t, --search-type <TYPE> Search type (semantic|exact|fuzzy|regex|ast)
-l, --limit <N> Maximum results (default: 10)
--threshold <FLOAT> Similarity threshold 0.0-1.0 (default: 0.7)
-f, --format <FORMAT> Output format (human|json|yaml|table)
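The --threshold value is a cosine-similarity cutoff on embedding vectors. A self-contained sketch of the scoring idea using the sentence-transformers package and the all-MiniLM-L6-v2 model mentioned above (this illustrates the principle, not CodeGraph's internal pipeline; the snippets are made up):
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

query = "authentication handler"
snippets = [
    "fn authenticate(user: &User, password: &str) -> Result<Session, AuthError>",
    "fn render_dashboard(ctx: &Context) -> Html",
]

# Normalized embeddings make the dot product equal to cosine similarity
q, *docs = model.encode([query] + snippets, normalize_embeddings=True)

threshold = 0.7  # corresponds to --threshold 0.7
for snippet, doc in zip(snippets, docs):
    score = float(q @ doc)
    verdict = "kept" if score >= threshold else "filtered out"
    print(f"{score:.2f} {verdict}: {snippet}")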
config - Manage Configuration
codegraph config <ACTION> [OPTIONS]
Actions:
show Show current configuration
set <KEY> <VALUE> Set configuration value
get <KEY> Get configuration value
reset Reset to defaults
validate Validate configuration
Options:
--json Output as JSON (for 'show')
-y, --yes Skip confirmation (for 'reset')
stats - Show Statistics
codegraph stats [OPTIONS]
Options:
--index Show index statistics
--server Show server statistics
--performance Show performance metrics
-f, --format <FMT> Output format (table|json|yaml|human)
clean - Clean Resources
codegraph clean [OPTIONS]
Options:
--index Clean index database
--vectors Clean vector embeddings
--cache Clean cache files
--all Clean all resources
-y, --yes Skip confirmation prompt
Configuration
Configuration File Structure
Create a .codegraph/config.toml file:
[general]
project_name = "my-project"
version = "1.0.0"
log_level = "info"
[indexing]
languages = ["rust", "python", "typescript"]
exclude_patterns = ["**/node_modules/**", "**/target/**", "**/.git/**"]
include_patterns = ["src/**", "lib/**"]
recursive = true
workers = 4
watch_enabled = false
incremental = true
[embedding]
model = "openai"
dimension = 1536
batch_size = 100
cache_enabled = true
cache_size_mb = 500
[vector]
index_type = "flat"
nprobe = 10
similarity_metric = "cosine"
[database]
path = "~/.codegraph/db"
cache_size_mb = 128
compression = true
write_buffer_size_mb = 64
[server]
default_transport = "stdio"
http_host = "127.0.0.1"
http_port = 3000
enable_tls = false
cors_enabled = true
max_connections = 100
[performance]
max_file_size_kb = 1024
parallel_threads = 8
memory_limit_mb = 2048
optimization_level = "balanced"
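The index_type and nprobe settings in [vector] map onto standard FAISS concepts: a flat index scans every vector exactly, while an IVF index clusters vectors and probes only nprobe clusters per query. A minimal FAISS sketch of that trade-off (illustrative only; CodeGraph's own index layout may differ):
import faiss
import numpy as np

d = 1536                                              # embedding dimension, as in [embedding]
xb = np.random.rand(10_000, d).astype("float32")      # stand-in for code embeddings
xq = np.random.rand(5, d).astype("float32")           # stand-in for query embeddings

# index_type = "flat": exact search over all vectors
flat = faiss.IndexFlatIP(d)
flat.add(xb)

# index_type = "ivf": cluster into nlist buckets, probe only nprobe of them per query
nlist = 100
quantizer = faiss.IndexFlatIP(d)
ivf = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_INNER_PRODUCT)
ivf.train(xb)
ivf.add(xb)
ivf.nprobe = 10                                       # matches nprobe = 10 in the config

for name, index in [("flat", flat), ("ivf", ivf)]:
    distances, ids = index.search(xq, 5)
    print(name, ids[0])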
Environment Variables
export CODEGRAPH_LOG_LEVEL=debug
export CODEGRAPH_DB_PATH=/custom/path/db
export CODEGRAPH_EMBEDDING_MODEL=local
export CODEGRAPH_HTTP_PORT=8080
Embedding Model Configuration
[embedding.openai]
api_key = "${OPENAI_API_KEY}"
model = "text-embedding-3-large"
dimension = 3072
[embedding.local]
model_path = "~/.codegraph/models/codestral.gguf"
device = "cpu"
context_length = 8192
User Workflows
Workflow 1: Complete Project Setup and Analysis
codegraph init --name my-awesome-project
codegraph config set embedding.model local
codegraph config set performance.optimization_level speed
codegraph index . --languages rust,python --recursive
codegraph start http --port 3000 --daemon
codegraph search "database connection" --limit 20
codegraph stats --index --performance
Workflow 2: Continuous Development with Watch Mode
codegraph index . --watch --workers 8 &
codegraph start dual --daemon
codegraph status --detailed
codegraph search "TODO" --search-type exact
Workflow 3: Integration with AI Tools
codegraph start stdio
cat > ~/.codegraph/mcp-config.json << EOF
{
  "name": "codegraph-server",
  "version": "1.0.0",
  "tools": [
    {
      "name": "analyze_architecture",
      "description": "Analyze codebase architecture"
    },
    {
      "name": "find_patterns",
      "description": "Find code patterns and anti-patterns"
    }
  ]
}
EOF
Workflow 4: Large Codebase Optimization
codegraph config set performance.memory_limit_mb 8192
codegraph config set vector.index_type ivf
codegraph config set database.compression true
codegraph index /path/to/large/project \
--workers 16 \
--exclude "**/test/**,**/vendor/**"
codegraph search "class.*Controller" --search-type regex --limit 100
Integration Guide
Integrating with Claude Desktop
- Add to Claude Desktop configuration:
{
  "mcpServers": {
    "codegraph": {
      "command": "codegraph",
      "args": ["start", "stdio"],
      "env": {
        "CODEGRAPH_CONFIG": "~/.codegraph/config.toml"
      }
    }
  }
}
- Restart Claude Desktop to load the MCP server
Integrating with VS Code
- Install the MCP extension for VS Code
- Add to VS Code settings:
{
  "mcp.servers": {
    "codegraph": {
      "command": "codegraph",
      "args": ["start", "stdio"],
      "rootPath": "${workspaceFolder}"
    }
  }
}
API Integration
import requests

base_url = "http://localhost:3000"

# Index a project over the HTTP transport
response = requests.post(f"{base_url}/index", json={
    "path": "/path/to/project",
    "languages": ["python", "javascript"]
})

# Run a semantic search against the index
response = requests.post(f"{base_url}/search", json={
    "query": "async function",
    "limit": 10
})
results = response.json()
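The endpoints above assume codegraph start http --port 3000 is already running. A small retry wrapper avoids races while the server is still starting (the retry policy is illustrative):
import time
import requests

def post_with_retry(url, payload, attempts=5, delay=1.0):
    # Retry only on connection errors while the server is coming up
    for attempt in range(attempts):
        try:
            response = requests.post(url, json=payload, timeout=30)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.ConnectionError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)

results = post_with_retry("http://localhost:3000/search", {"query": "async function", "limit": 10})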
Using with CI/CD
name: CodeGraph Analysis
on: [push, pull_request]
jobs:
  analyze:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Install CodeGraph
        run: |
          cargo install codegraph-mcp
      - name: Index Codebase
        run: |
          codegraph init --non-interactive
          codegraph index . --languages rust,python
      - name: Run Analysis
        run: |
          codegraph stats --index --format json > analysis.json
      - name: Upload Results
        uses: actions/upload-artifact@v2
        with:
          name: codegraph-analysis
          path: analysis.json
Troubleshooting
Common Issues and Solutions
- Issue: Server fails to start
lsof -i :3000
codegraph stop --force
codegraph start http --port 3001
- Issue: Indexing is slow
codegraph index . --workers 16
codegraph index . --exclude "**/node_modules/**,**/dist/**"
codegraph config set indexing.incremental true
- Issue: Out of memory during indexing
codegraph config set embedding.batch_size 50
codegraph config set performance.memory_limit_mb 1024
codegraph index . --streaming
- Issue: Vector search returns poor results
codegraph search "query" --threshold 0.5
codegraph config set embedding.model openai
codegraph index . --force
codegraph search "query" --search-type fuzzy
- Issue: Hugging Face model fails to download
export CODEGRAPH_LOCAL_MODEL=sentence-transformers/all-MiniLM-L6-v2
export HF_TOKEN=your_hf_access_token
ls -lah ~/.cache/huggingface
- Issue: Local embeddings are slow
Debug Mode
Enable debug logging for troubleshooting:
export RUST_LOG=debug
codegraph --verbose index .
tail -f ~/.codegraph/logs/codegraph.log
Health Checks
codegraph status --detailed
codegraph config validate
codegraph test db
codegraph test embeddings
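These checks can also be scripted, for example as a pre-flight step in automation. A sketch that runs each documented check and stops at the first failure (it assumes the commands return a non-zero exit code on failure):
import subprocess
import sys

CHECKS = [
    ["codegraph", "status", "--detailed"],
    ["codegraph", "config", "validate"],
    ["codegraph", "test", "db"],
    ["codegraph", "test", "embeddings"],
]

for cmd in CHECKS:
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        print(f"FAILED: {' '.join(cmd)}\n{result.stderr}", file=sys.stderr)
        sys.exit(1)
    print(f"ok: {' '.join(cmd)}")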
🔧 Technical Details
Architecture
The CodeGraph system consists of a CLI interface, a core engine, and an MCP server layer. The core engine includes a parser, a graph store, and a vector search module. The MCP server layer supports STDIO, HTTP, or dual transport modes.
Performance Optimization
- Parallel processing and batched embeddings are used to optimize performance for large codebases.
- Smart caching is implemented to improve indexing speed.
- The system is optimized for multi-language support using Tree-sitter.
Embeddings
- Embeddings can be generated locally via ONNX Runtime or a Hugging Face (Candle) model; on macOS, ONNX Runtime supports the CPU and CoreML execution providers.
- FAISS is used for vector search, providing semantic code search capabilities.
📄 License
This project is dual-licensed under MIT and Apache 2.0 licenses. See LICENSE-MIT and LICENSE-APACHE for details.
🙏 Acknowledgments
Made with ❤️ by the CodeGraph Team