Engram
Engram is an event-sourcing memory system designed for AI agents. It keeps LLMs out of the write path entirely, delivering reliable, semantically searchable memory storage through local vector embeddings and DuckDB.

What is Engram?

Engram is a memory system designed specifically for AI assistants and agents. Unlike traditional memory systems, Engram decouples memory storage from semantic search, so write operations always succeed even when there is no network connection or external APIs are unavailable. The core idea is to first reliably store memory fragments (called "events") and only then handle semantic search. This design keeps potentially unstable external services out of the write path, ensuring that your AI assistant can always record important information.
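Conceptually, the write path looks something like the sketch below: a minimal illustration assuming a hypothetical events table with a nullable embedding column. Engram's actual schema and code may differ.

```typescript
// Illustrative only -- not Engram's actual code. The point is that the
// INSERT never waits on the network: no LLM or embedding service is
// involved at write time.
import duckdb from "duckdb";

const db = new duckdb.Database(process.env.DUCKDB_PATH ?? "engram.duckdb");

db.run(`CREATE TABLE IF NOT EXISTS events (
          id BIGINT, text VARCHAR, embedding FLOAT[768])`);

// Step 1 of the write path: persist the event immediately.
// The embedding column starts out NULL and is filled in later.
function storeEvent(id: number, text: string): void {
  db.run("INSERT INTO events VALUES (?, ?, NULL)", id, text);
}

storeEvent(1, "User prefers TypeScript examples over Python.");
```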

How to use Engram?

Using Engram is straightforward:
1. Install the Engram binary or run it via Docker.
2. Configure the connection to the local Ollama service (used to generate text embeddings).
3. Integrate it with Claude Desktop, Claude Code, or Cursor via the MCP protocol.
4. Your AI assistant can now store and retrieve memories.
Engram handles the technical details automatically; you only need to focus on the conversation with your assistant.

Use cases

Engram is particularly well suited to the following scenarios:
- **Long-term conversations**: Let the AI assistant remember important information across multiple sessions.
- **Project collaboration**: Store project requirements, decisions, and progress.
- **Personal assistant**: Remember your preferences, schedule, and important matters.
- **Research assistant**: Organize research materials, references, and notes.
- **Code development**: Remember code structures, API documentation, and development decisions.

Main features

Semantic search
Engram searches by vector similarity rather than simple keyword matching, so it can capture the meaning of a query and find the most relevant content even when no exact keywords match.
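Under the hood, this kind of matching typically reduces to comparing embedding vectors, for example by cosine similarity. A toy sketch follows; real vectors come from the embedding model, not by hand:

```typescript
// Toy illustration of vector similarity. Real embeddings come from a model
// such as nomic-embed-text and have hundreds of dimensions.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// "car" and "automobile" share no characters, but a good embedding model
// places them close together, so their similarity score is high.
```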
Graceful fallback
Engram keeps working even when the vector-generation service is unavailable: it stores the text content first and generates embeddings once the service is back, so write operations never fail.
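A minimal sketch of such a backfill, assuming the same hypothetical events table as above and Ollama's standard embeddings endpoint:

```typescript
// Illustrative backfill loop: find rows that were stored without an
// embedding and vectorize them now that Ollama is reachable again.
import duckdb from "duckdb";

const db = new duckdb.Database(process.env.DUCKDB_PATH ?? "engram.duckdb");
const OLLAMA_URL = process.env.OLLAMA_URL ?? "http://localhost:11434";

async function embed(text: string): Promise<number[]> {
  const res = await fetch(`${OLLAMA_URL}/api/embeddings`, {
    method: "POST",
    body: JSON.stringify({ model: "nomic-embed-text", prompt: text }),
  });
  const body = (await res.json()) as { embedding: number[] };
  return body.embedding;
}

function backfillPending(): void {
  db.all("SELECT id, text FROM events WHERE embedding IS NULL",
    async (err, rows) => {
      if (err) return;
      for (const row of rows as { id: number; text: string }[]) {
        const vec = await embed(row.text);
        // Inline the vector as an array literal for illustration.
        db.run(
          `UPDATE events SET embedding = [${vec.join(",")}]::FLOAT[768]
           WHERE id = ?`,
          row.id
        );
      }
    });
}
```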
Fast queries
DuckDB's HNSW indexing delivers millisecond-level vector search responses. Even with a large amount of stored memory, search stays fast.
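The kind of query involved might look like the sketch below, using DuckDB's vss extension; the table and index names are assumptions, not Engram's actual schema:

```typescript
// Illustrative HNSW setup and query with DuckDB's vss extension.
import duckdb from "duckdb";

const db = new duckdb.Database(process.env.DUCKDB_PATH ?? "engram.duckdb");

db.exec(`
  INSTALL vss;
  LOAD vss;
  -- required for HNSW indexes on an on-disk database (experimental flag)
  SET hnsw_enable_experimental_persistence = true;
  CREATE INDEX IF NOT EXISTS events_hnsw ON events USING HNSW (embedding);
`);

// k-nearest-neighbour search: the HNSW index turns this ORDER BY ... LIMIT
// pattern into an approximate index scan instead of a full table scan.
function search(queryVec: number[], k = 10): void {
  db.all(
    `SELECT text,
            array_distance(embedding, [${queryVec.join(",")}]::FLOAT[768]) AS dist
       FROM events
      ORDER BY dist
      LIMIT ${k}`,
    (err, rows) => { if (!err) console.log(rows); }
  );
}
```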
Local vector generation
All text embeddings are generated locally via Ollama, with no external API calls, which protects your privacy and reduces latency.
Single-file deployment
Engram is a standalone executable that requires no complex dependencies: download one file, set a few environment variables, and run it.
MCP native support
It integrates directly into Claude Desktop, Claude Code, and Cursor without additional configuration. Your AI assistant can use Engram like a built-in capability.
Advantages
100% reliable write operations: no dependence on external LLM APIs; writes succeed as long as the database is running.
Privacy protection: all data is processed locally and never sent to the cloud.
Fast response: local embedding generation and search keep latency extremely low.
Easy to use: single-file deployment with simple configuration.
Cost-effective: no API fees to pay.
Offline work: memories can be stored even without a network connection.
Limitations
Requires local resources: the Ollama service must run locally and consumes computing resources.
Initial setup: environment variables and client integration must be configured manually.
Relatively basic feature set: focuses on reliable storage and search, without advanced memory-organization features.
Depends on Ollama: if the Ollama service stops, new memories cannot be embedded (though they are still stored).

How to use

Install Engram
Download the pre-compiled binary for your operating system from the GitHub Releases page, or build it from source.
Install and configure Ollama
Install Ollama and pull an embedding model (the configuration below uses nomic-embed-text). Ollama is a locally run large language model service.
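A quick way to verify this step, assuming Ollama's standard HTTP API on its default port (the model name mirrors the configuration shown under Installation):

```typescript
// Check that Ollama is reachable and the embedding model has been pulled.
const OLLAMA_URL = process.env.OLLAMA_URL ?? "http://localhost:11434";
const MODEL = process.env.EMBEDDING_MODEL ?? "nomic-embed-text";

const res = await fetch(`${OLLAMA_URL}/api/tags`); // lists local models
const { models } = (await res.json()) as { models: { name: string }[] };

if (models.some((m) => m.name.startsWith(MODEL))) {
  console.log(`${MODEL} is available`);
} else {
  console.log(`model missing -- run: ollama pull ${MODEL}`);
}
```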
Configure environment variables
Set the environment variables required for Engram to run, including the database path and Ollama connection information.
Integrate into the AI assistant
Add Engram to the MCP configuration of Claude Desktop, Claude Code, or Cursor.
Start using
Restart your AI assistant. Now it can use Engram to store and retrieve memories.

Usage examples

Remember project requirements
When discussing project requirements with the AI assistant, let the assistant remember important functional requirements and design decisions.
Cross-session memory
Let the AI assistant remember your personal preferences and work habits mentioned in multiple sessions.
Research material organization
When researching a topic, let the AI assistant organize and remember important reference materials and key points.
Meeting minutes
During a meeting discussion, let the AI assistant record important decisions, to-do items, and the people responsible.

Frequently Asked Questions

What is the difference between Engram and an ordinary note-taking app?
Engram is built for AI agents: memories are written and retrieved through the MCP protocol, and search is semantic rather than keyword-based, so an assistant can recall relevant context even without exact wording.
Do I need to run Ollama all the time?
No. Writes succeed even while Ollama is down; new memories are stored without embeddings and vectorized once the service is available again.
Will Engram store my private conversations?
Everything Engram stores stays in a local DuckDB file; nothing is sent to the cloud.
Can I access my memories from other devices?
Memories live in a local database file, so by default they are tied to the machine where Engram runs.
Which AI assistants does Engram support?
Any MCP-compatible client, including Claude Desktop, Claude Code, and Cursor.
If I have too many memories, will the search slow down?
DuckDB's HNSW index keeps vector search at millisecond level even with a large number of stored memories.
Can I export or back up my memories?
The entire store is a single DuckDB file (the path set in DUCKDB_PATH), so backing it up is as simple as copying that file.
Is Engram free?
Binaries are freely available from GitHub Releases, it can be built from source, and it runs entirely locally with no API costs.

Related resources

Official documentation
Complete Engram technical documentation and usage guide
GitHub repository
Source code, issue tracking, and release versions
MCP integration guide
Detailed MCP client integration instructions
Deployment guide
Deployment guides for Docker, Kubernetes, and production environments
Ollama official website
A tool for running large language models locally
Model Context Protocol
Official specification of the MCP protocol

Installation

Add the following configuration to your MCP client:
{
  "mcpServers": {
    "engram-memory": {
      "command": "/absolute/path/to/engram",
      "args": [],
      "env": {
        "DUCKDB_PATH": "/absolute/path/to/engram.duckdb",
        "OLLAMA_URL": "http://localhost:11434",
        "EMBEDDING_MODEL": "nomic-embed-text"
      }
    }
  }
}

Alternatives

Airweave
Airweave is an open-source context retrieval layer for AI agents and RAG systems. It connects and synchronizes data from various applications, tools, and databases, and provides relevant, real-time, multi-source contextual information to AI agents through a unified search interface.
Python
16.2K
5 points
Vestige
Vestige is an AI memory engine based on cognitive science. By implementing 29 neuroscience-inspired modules such as prediction-error gating, FSRS-6 spaced repetition, and memory dreaming, it provides long-term memory capabilities for AI. It includes a 3D visualization dashboard and 21 MCP tools, runs completely locally, and requires no cloud.
Rust
10.5K
4.5 points
Moltbrain
MoltBrain is a long-term memory layer plugin designed for OpenClaw, MoltBook, and Claude Code, capable of automatically learning and recalling project context, providing intelligent search, observation recording, analysis statistics, and persistent storage functions.
TypeScript
10.1K
4.5 points
Bm.md
A feature-rich Markdown typesetting tool that supports multiple style themes and platform adaptation, providing real-time editing preview, image export, and API integration capabilities
TypeScript
14.8K
5 points
Security Detections MCP
Security Detections MCP is a server based on the Model Context Protocol that allows LLMs to query a unified security detection rule database covering Sigma, Splunk ESCU, Elastic, and KQL formats. The latest version 3.0 is upgraded to an autonomous detection engineering platform that can automatically extract TTPs from threat intelligence, analyze coverage gaps, generate SIEM-native format detection rules, run tests, and verify. The project includes over 71 tools, 11 pre-built workflow prompts, and a knowledge graph system, supporting multiple SIEM platforms.
TypeScript
6.7K
4 points
Paperbanana
Python
8.9K
5 points
Better Icons
An MCP server and CLI tool that provides search and retrieval of over 200,000 icons, supports more than 150 icon libraries, and helps AI assistants and developers quickly obtain and use icons.
TypeScript
9.7K
4.5 points
Assistant Ui
assistant-ui is an open-source TypeScript/React library for quickly building production-grade AI chat interfaces, providing composable UI components, streaming responses, accessibility features, and support for multiple AI backends and models.
TypeScript
10.0K
5 points
Markdownify MCP
Markdownify is a multi-functional file conversion service that supports converting multiple formats such as PDFs, images, audio, and web page content into Markdown format.
TypeScript
39.1K
5 points
Notion Api MCP
Certified
A Python-based MCP Server that provides advanced to-do list management and content organization functions through the Notion API, enabling seamless integration between AI models and Notion.
Python
24.8K
4.5 points
Duckduckgo MCP Server
Certified
The DuckDuckGo Search MCP Server provides web search and content scraping services for LLMs such as Claude.
Python
81.4K
4.3 points
Gitlab MCP Server
Certified
The GitLab MCP server is a project based on the Model Context Protocol that provides a comprehensive toolset for interacting with GitLab accounts, including code review, merge request management, CI/CD configuration, and other functions.
TypeScript
28.4K
4.3 points
Unity
Certified
UnityMCP is a Unity editor plugin that implements the Model Context Protocol (MCP), providing seamless integration between Unity and AI assistants, including real-time state monitoring, remote command execution, and logging.
C#
38.4K
5 points
Figma Context MCP
Framelink Figma MCP Server provides access to Figma design data for AI programming tools such as Cursor. By simplifying Figma API responses, it helps AI more accurately achieve one-click conversion from design to code.
TypeScript
69.4K
4.5 points
Gmail MCP Server
A Gmail automatic authentication MCP server designed for Claude Desktop, supporting Gmail management through natural language interaction, including complete functions such as sending emails, label management, and batch operations.
TypeScript
24.9K
4.5 points
Minimax MCP Server
The MiniMax Model Context Protocol (MCP) is an official server that supports interaction with powerful text-to-speech, video/image generation APIs, and is suitable for various client tools such as Claude Desktop and Cursor.
Python
55.3K
4.8 points