⭐ 14.2k stars · #1 Memory for AI Coding Agents

Your coding agent remembers everything. No more re-explaining.

Persistent memory for Claude Code, Cursor, Gemini CLI, Codex CLI, Hermes, OpenClaw, pi, OpenCode, and any MCP client. Built on the iii engine. 95.2% retrieval R@5. 92% fewer tokens. Zero external databases.

95.2% Retrieval R@5
92% Fewer Tokens
53 MCP Tools
12 Auto Hooks
0 External DBs
~1,900 Tokens / Session
$10 Cost / Year
950+ Tests Passing
118 Source Files
34 KV Scopes

Works with every agent

agentmemory works with any agent that supports hooks, MCP, or the REST API. All agents connect to one memory server, so memories are shared across them.

Claude Code
Codex CLI
Cursor
Gemini CLI
OpenClaw
Hermes
OpenCode
Cline
Goose
Aider
Claude Desktop
Windsurf
Roo Code
Kilo Code
pi
OpenHuman
+ any MCP client

Why agentmemory

Every coding agent forgets everything when the session ends. You waste the first 5 minutes of every session re-explaining your stack. agentmemory runs in the background and eliminates that entirely.

🧠

Automatic Capture

Every tool use recorded via hooks. Zero manual effort. Session 1 sets up JWT auth — Session 2 already knows your stack.

🔍

Semantic Search

BM25 + vector + knowledge graph with RRF fusion. Search "database performance" and find "N+1 query fix" — keyword matching can't do that.

🔒

Privacy First

API keys, secrets, and private tags stripped before storage. Self-hosted by default. Local embeddings are free and offline.
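As a rough illustration of stripping secrets before storage, here is a minimal sketch. The patterns, the `<private>` tag syntax, and the `[REDACTED]` placeholder are all assumptions made for this example; the actual filter rules used by agentmemory are not described on this page.

```python
import re

# Hypothetical rules for illustration only; not agentmemory's real filter.
PRIVATE_TAG = re.compile(r"<private>.*?</private>", re.DOTALL)
KEY_VALUE = re.compile(
    r"(?i)\b(password|secret|token|api[_-]?key)(\s*[=:]\s*)(\S+)"
)
TOKEN_SHAPES = [
    re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),   # API-key-like token
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),      # AWS access key ID shape
]


def redact(text: str) -> str:
    """Replace private blocks and secret-looking values with a placeholder."""
    text = PRIVATE_TAG.sub("[REDACTED]", text)
    # Keep the key name and separator, hide only the value.
    text = KEY_VALUE.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    for pattern in TOKEN_SHAPES:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Pattern-based redaction like this catches common shapes only, so a real filter would likely need a broader rule set.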

Key Features

Memory evolution, auto-forgetting, knowledge graphs, team memory, and more. Everything your agent needs to remember — without the overhead.

🔄

Automatic Capture

Every tool use recorded via hooks — zero manual effort. 12 hooks covering the full session lifecycle.

🎯

Semantic Search

BM25 + vector + knowledge graph with RRF fusion. Find what you need, even when keywords don't match.

📈

Memory Evolution

Versioning, supersession, and relationship graphs. Memories grow and adapt with your project.

Auto-Forgetting

TTL expiry, contradiction detection, importance eviction. Stale memories auto-evict. Frequently accessed ones strengthen.

🔐

Privacy First

API keys, secrets, and private tags stripped before storage. Self-hosted by default. With local embeddings, no data leaves your machine.

🩺

Self-Healing

Circuit breaker, provider fallback chain, and health monitoring. Your memory stays reliable.

🔗

Claude Bridge

Bi-directional sync with MEMORY.md. Built-in memory and agentmemory stay in sync automatically.

🌐

Knowledge Graph

Entity extraction with BFS traversal. Understand relationships between code, files, and concepts.

👥

Team Memory

Namespaced shared and private memory across team members. Collaborate without context loss.

📋

Citation Provenance

Trace any memory back to source observations. Know exactly where every fact came from.

📸

Git Snapshots

Version, rollback, and diff memory state. Never lose a memory you've captured.

📺

Real-Time Viewer

Live observation stream, session explorer, memory browser, and knowledge graph viz on port 3113.
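The Knowledge Graph feature above mentions BFS traversal over extracted entities. A minimal sketch of that idea follows, assuming the graph is a plain adjacency dict and that traversal is capped at a hop limit; `related_entities`, `max_hops`, and the sample entity names are hypothetical.

```python
from collections import deque


def related_entities(graph, start, max_hops=2):
    """Breadth-first search: entities within max_hops of start, nearest first."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append((neighbor, depth + 1))
                queue.append((neighbor, depth + 1))
    return found


# Illustrative graph linking files and concepts.
graph = {
    "auth.ts": ["JWT", "middleware.ts"],
    "JWT": ["jose"],
    "middleware.ts": ["JWT", "rate-limit"],
    "jose": ["crypto"],
}
```

Calling `related_entities(graph, "auth.ts")` returns the direct neighbors first, then the two-hop entities, and leaves out anything farther away.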

How It Works

Memory pipeline with 4-tier consolidation, inspired by how human brains process memory.

PostToolUse hook fires
  -> SHA-256 dedup (5min window)
  -> Privacy filter (strip secrets, API keys)
  -> Store raw observation
  -> LLM compress -> structured facts + concepts + narrative
  -> Vector embedding (6 providers + local)
  -> Index in BM25 + vector

Stop / SessionEnd hook fires
  -> Summarize session
  -> Knowledge graph extraction
  -> Slot reflection

SessionStart hook fires
  -> Load project profile (top concepts, files, patterns)
  -> Hybrid search (BM25 + vector + graph)
  -> Token budget (default: 2000 tokens)
  -> Inject into conversation
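The first pipeline step, SHA-256 dedup with a 5-minute window, could look roughly like the sketch below. The class name, the choice to hash a canonical JSON encoding of the observation, and the in-memory dict are assumptions for illustration, not the project's actual implementation.

```python
import hashlib
import json
import time
from typing import Optional


class DedupWindow:
    """Reject observations whose SHA-256 digest was seen within the window."""

    def __init__(self, window_seconds: float = 300.0):
        self.window_seconds = window_seconds
        self._first_seen = {}  # digest -> timestamp of first sighting

    def is_duplicate(self, observation: dict, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Forget digests that have aged out of the window.
        self._first_seen = {
            digest: ts
            for digest, ts in self._first_seen.items()
            if now - ts < self.window_seconds
        }
        # Canonical JSON so key order does not change the hash.
        payload = json.dumps(observation, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if digest in self._first_seen:
            return True
        self._first_seen[digest] = now
        return False
```

Passing `now` explicitly keeps the sketch deterministic in tests; in normal use it falls back to the system clock.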

4-Tier Memory Consolidation

Working

Short-Term Memory

Raw observations from tool use. Captured in real-time as your agent works.

Episodic

Session Summaries

Compressed session summaries. "What happened" during each session.

Semantic

Extracted Facts

Extracted facts and patterns. "What I know" about your project and stack.

Procedural

Decision Patterns

Workflows and decision patterns. "How to do it" — captured from your agent's behavior.

Memories decay over time (Ebbinghaus curve). Frequently accessed memories strengthen. Stale memories auto-evict. Contradictions are detected and resolved.
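One way to picture decay with strengthening is the classic forgetting-curve form R = exp(-t / S), where S grows each time a memory is accessed. The sketch below assumes that form; the base stability, boost factor, and eviction threshold are made-up parameters, not values taken from agentmemory.

```python
import math


def retention(age_days: float, access_count: int,
              base_stability_days: float = 7.0, boost: float = 1.5) -> float:
    """Forgetting-curve retention R = exp(-t / S); each access multiplies S."""
    stability = base_stability_days * (boost ** access_count)
    return math.exp(-age_days / stability)


def should_evict(age_days: float, access_count: int,
                 threshold: float = 0.05) -> bool:
    """Evict once estimated retention drops below the threshold."""
    return retention(age_days, access_count) < threshold
```

Under these assumed parameters, a 30-day-old memory that was never accessed falls below the threshold, while one accessed three times does not.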

What Gets Captured

12 lifecycle hooks covering every moment of your agent's session.

Hook | Captures
SessionStart | Project path, session ID
UserPromptSubmit | User prompts (privacy-filtered)
PreToolUse | File access patterns + enriched context
PostToolUse | Tool name, input, output
PostToolUseFailure | Error context
PreCompact | Re-injects memory before compaction
SubagentStart/Stop | Sub-agent lifecycle
Stop / SessionEnd | End-of-session summary and complete marker

Benchmarks

LongMemEval-S (ICLR 2025, 500 questions). 95.2% retrieval accuracy with local embeddings — free and offline.

Retrieval Accuracy

System | R@5 | R@10 | MRR
agentmemory | 95.2% | 98.6% | 88.2%
BM25-only fallback | 86.2% | 94.6% | 71.5%

Token Savings

Approach | Tokens/Yr | Cost/Yr
Paste full context | 19.5M+ | Impossible
LLM-summarized | ~650K | ~$500
agentmemory | ~170K | ~$10
agentmemory + local embeddings | ~170K | $0

Embedding model: all-MiniLM-L6-v2 (local, free, no API key). Full reports: LongMemEval, QUALITY, SCALE, and a competitor comparison vs mem0, Letta, Khoj, claude-mem, Hippo.
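For readers unfamiliar with the metrics, the sketch below shows one common way to compute recall at k and MRR from ranked results. It assumes a question counts as a hit when at least one relevant item appears in the top k; the benchmark's exact scoring rules may differ, and the function names are illustrative.

```python
def hit_at_k(ranked, relevant, k):
    """1.0 if any relevant item is among the top-k results, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs, k=5):
    """Average both metrics over (ranked_ids, relevant_ids) pairs."""
    n = len(runs)
    return {
        "recall_at_k": sum(hit_at_k(r, rel, k) for r, rel in runs) / n,
        "mrr": sum(reciprocal_rank(r, rel) for r, rel in runs) / n,
    }
```

A higher R@10 than R@5 simply means some relevant items are ranked between positions 6 and 10, while MRR rewards placing the first relevant item near the top.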

vs Competitors

agentmemory vs mem0, Letta, and built-in memory. 95.2% retrieval R@5, zero external deps, multi-agent support.

Feature | agentmemory | mem0 | Letta / MemGPT | Built-in (CLAUDE.md)
Type | Memory engine + MCP server | Memory layer API | Full agent runtime | Static file
Retrieval R@5 | 95.2% | 68.5% | 83.2% | N/A (grep)
Auto-capture | 12 hooks (zero manual effort) | Manual add() calls | Agent self-edits | Manual editing
Search | BM25 + Vector + Graph (RRF) | Vector + Graph | Vector (archival) | Loads everything
Multi-agent | MCP + REST + leases | API (no coordination) | Within Letta only | Per-agent files
External deps | None (SQLite + iii) | Qdrant / pgvector | Postgres + vector DB | None
Token efficiency | ~1,900 tokens/session ($10/yr) | Varies | Core memory in context | 22K+ at 240 obs
Real-time viewer | Yes (port 3113) | Cloud dashboard | Cloud dashboard | No
Self-hosted | Yes (default) | Optional | Optional | Yes

Triple-Stream Search

Three signals fused with Reciprocal Rank Fusion. BM25, vector embeddings, and knowledge graph traversal working together.

Stream 1

BM25

Stemmed keyword matching with synonym expansion. Always on. Supports Greek, Cyrillic, Hebrew, Arabic, and CJK.

Stream 2

Vector

Cosine similarity over dense embeddings. 6 providers including local (free, offline). Switch or combine providers.

Stream 3

Knowledge Graph

Graph traversal via entity matching. Understand relationships between concepts, code, and decisions.
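Reciprocal Rank Fusion combines the three streams using ranks only, so their raw scores never need to be on the same scale. Here is a minimal sketch of the standard formula, score(d) = sum of 1 / (k + rank). The constant k = 60 is the value commonly used in the literature and is an assumption here, as are the function name and the sample memory IDs.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to an item's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Illustrative results from the three streams.
bm25_hits = ["m1", "m2", "m3"]
vector_hits = ["m2", "m4", "m1"]
graph_hits = ["m2", "m1"]
fused = rrf_fuse([bm25_hits, vector_hits, graph_hits])
```

Items that rank well in several streams rise to the top, even if no single stream ranks them first.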

53 MCP Tools

The most comprehensive MCP memory toolkit for any agent. 53 tools, 6 resources, 3 prompts, and 4 skills.

memory_recall: Search past observations
memory_save: Save an insight, decision, or pattern
memory_smart_search: Hybrid semantic + keyword search
memory_sessions: List recent sessions
memory_profile: Project profile (concepts, files, patterns)
memory_export: Export all memory data
memory_relations: Query relationship graph
memory_consolidate: Run 4-tier consolidation
memory_audit: Audit trail of operations
memory_snapshot_create: Git-versioned snapshot
memory_team_share: Share with team members
memory_signal_send: Inter-agent messaging
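MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request for `memory_recall`; the envelope follows the MCP convention, but the `query` argument name is a guess, since the tool's input schema is not shown on this page.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call_request(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request envelope for an MCP tools/call."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "query" is a hypothetical argument name used for illustration.
request = tool_call_request("memory_recall", {"query": "database performance"})
wire_message = json.dumps(request)
```

In practice an MCP client library handles this framing, so agents rarely need to construct the message by hand.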

Quick Start

Get agentmemory running in 30 seconds. One command, zero dependencies.

1. Install globally

Install agentmemory once. The bare agentmemory command works everywhere.

npm install -g @agentmemory/agentmemory

2. Start the server

Starts the memory server on port 3111. Auto-starts the iii engine and real-time viewer.

agentmemory

3. Seed sample data

Seeds 3 realistic sessions and runs semantic searches. See it find "N+1 query fix" when you search "database performance".

agentmemory demo

4. Wire your agent

Connect agentmemory to Claude Code, Cursor, Gemini CLI, or any MCP client.

agentmemory connect claude-code

5. Open the viewer

Watch the memory build live. Session explorer, memory browser, knowledge graph visualization.

# Open in your browser
http://localhost:3113
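If the viewer does not load, a quick check is whether anything is listening on the two ports named in the steps above (3111 for the memory server, 3113 for the viewer). This sketch only tests TCP reachability on localhost and assumes nothing about agentmemory's REST endpoints; the helper names are hypothetical.

```python
import socket


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def status_report() -> dict:
    """Check the default ports mentioned in the quick start."""
    return {
        "memory server (3111)": is_listening(3111),
        "viewer (3113)": is_listening(3113),
    }
```

Running `print(status_report())` after starting the server should show both entries as `True` if the defaults are in use.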

Add to Any Agent

One MCP config block works across Cursor, Claude Desktop, Cline, Roo Code, Windsurf, Gemini CLI, and more.

// Add to your agent's mcpServers config
{
  "mcpServers": {
    "agentmemory": {
      "command": "npx",
      "args": ["-y", "@agentmemory/mcp"],
      "env": {
        "AGENTMEMORY_URL": "http://localhost:3111"
      }
    }
  }
}
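When an agent already has other MCP servers configured, the block above has to be merged in rather than pasted over the file. A minimal sketch of that merge follows, assuming the config is a JSON file with a top-level `mcpServers` object; the function names are hypothetical, and the config file location varies by agent.

```python
import copy
import json
from pathlib import Path

AGENTMEMORY_ENTRY = {
    "command": "npx",
    "args": ["-y", "@agentmemory/mcp"],
    "env": {"AGENTMEMORY_URL": "http://localhost:3111"},
}


def add_agentmemory(config: dict) -> dict:
    """Return a copy of config with the agentmemory server entry added."""
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["agentmemory"] = copy.deepcopy(AGENTMEMORY_ENTRY)
    return merged


def update_config_file(path: str) -> None:
    """Read an agent's JSON config (if present), merge, and write it back."""
    config_path = Path(path)
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config_path.write_text(json.dumps(add_agentmemory(config), indent=2) + "\n")
```

Existing server entries are preserved, and re-running the merge simply rewrites the same `agentmemory` entry.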

Frequently Asked Questions

The fastest answers to the questions people ask first.

What is agentmemory?

agentmemory is the #1 persistent memory for AI coding agents. It silently captures everything your agent does, compresses it into searchable memory, and injects the right context when the next session starts. No more re-explaining your stack.

Which agents does it work with?

Claude Code, Cursor, Gemini CLI, Codex CLI, Hermes, OpenClaw, pi, OpenCode, Cline, Goose, Aider, Claude Desktop, Windsurf, Roo Code, Kilo Code, OpenHuman, and any MCP client. All agents share the same memory server.

How is it different from built-in agent memory?

Built-in memory (CLAUDE.md, Cursor notepads) caps at 200 lines and loads everything into context. agentmemory has unlimited scale, uses BM25 + vector + graph search with top-K retrieval, costs ~1,900 tokens per session instead of 22K+, and works across agents.

Does it require external databases?

No. agentmemory uses SQLite + the iii engine. Zero external dependencies. No Postgres, no Redis, no Qdrant. Self-hosted by default.

How much does it cost to run?

With local embeddings (all-MiniLM-L6-v2), embedding is free and fully offline, so the total cost is $0/year. With cloud embeddings, the total cost is approximately $10/year.

How do I install it?

Run npm install -g @agentmemory/agentmemory, then agentmemory to start the server. Or use npx @agentmemory/agentmemory for a no-install option.