Persistent memory for Claude Code, Cursor, Gemini CLI, Codex CLI, Hermes, OpenClaw, pi, OpenCode, and any MCP client. Built on the iii engine. 95.2% retrieval R@5. 92% fewer tokens. Zero external databases.
agentmemory works with any agent that supports hooks, MCP, or a REST API. All agents share one memory server, so a memory captured by one agent is available to the rest.
Every coding agent forgets everything when the session ends. You waste the first 5 minutes of every session re-explaining your stack. agentmemory runs in the background and eliminates that entirely.
Every tool use recorded via hooks. Zero manual effort. Session 1 sets up JWT auth — Session 2 already knows your stack.
BM25 + vector + knowledge graph with RRF fusion. Search "database performance" and find "N+1 query fix" — keyword matching can't do that.
API keys, secrets, and private tags stripped before storage. Self-hosted by default. Local embeddings are free and offline.
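As a rough illustration of stripping secrets before storage, the sketch below masks a few common secret shapes and removes privately tagged spans. The regex patterns, the `<private>` tag syntax, and the `redact` function name are assumptions made for this example; they are not agentmemory's actual filter.

```python
import re

# Hypothetical patterns: illustrative shapes only, not agentmemory's real rule set.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{16,}"),  # API keys with an "sk-" prefix
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key IDs
    re.compile(r"(?i)\b(password|secret|token|api_key)\s*[=:]\s*\S+"),  # key=value pairs
]

# Assumed tag syntax for spans the user marks as private.
PRIVATE_TAG = re.compile(r"<private>.*?</private>", re.DOTALL)


def redact(text: str) -> str:
    """Return text with private spans removed and secret-looking values masked."""
    text = PRIVATE_TAG.sub("", text)
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```

Running the filter before anything is written means the stored observation never contains the raw value, which is simpler to reason about than scrubbing at read time.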
Memory evolution, auto-forgetting, knowledge graphs, team memory, and more. Everything your agent needs to remember — without the overhead.
Every tool use recorded via hooks — zero manual effort. 12 hooks covering the full session lifecycle.
BM25 + vector + knowledge graph with RRF fusion. Find what you need, even when keywords don't match.
Versioning, supersession, and relationship graphs. Memories grow and adapt with your project.
TTL expiry, contradiction detection, importance eviction. Stale memories auto-evict. Frequently accessed ones strengthen.
API keys, secrets, and private tags stripped before storage. Self-hosted by default. No data leaves your machine.
Circuit breaker, provider fallback chain, and health monitoring. Your memory stays reliable.
Bi-directional sync with MEMORY.md. Built-in memory and agentmemory stay in sync automatically.
Entity extraction with BFS traversal. Understand relationships between code, files, and concepts.
Namespaced shared and private memory across team members. Collaborate without context loss.
Trace any memory back to source observations. Know exactly where every fact came from.
Version, rollback, and diff memory state. Never lose a memory you've captured.
Live observation stream, session explorer, memory browser, and knowledge graph viz on port 3113.
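The auto-forgetting behavior listed above (TTL expiry, importance eviction, strengthening on access) could look roughly like the sketch below. The `Memory` fields, the `touch` and `evict` names, and the fixed boost value are hypothetical; contradiction detection is left out, and agentmemory's real scoring is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    key: str
    importance: float  # higher values survive eviction longer
    expires_at: float  # absolute TTL deadline, in seconds


def touch(memory: Memory, boost: float = 0.1) -> None:
    """Strengthen a memory each time it is accessed (assumed additive boost)."""
    memory.importance += boost


def evict(memories: list[Memory], now: float, capacity: int) -> list[Memory]:
    """Drop expired memories, then keep only the `capacity` most important ones."""
    alive = [m for m in memories if m.expires_at > now]
    alive.sort(key=lambda m: m.importance, reverse=True)
    return alive[:capacity]


store = [
    Memory("jwt-auth", importance=0.5, expires_at=100.0),
    Memory("old-branch", importance=0.9, expires_at=10.0),  # already past its TTL
    Memory("n+1-fix", importance=0.2, expires_at=100.0),
    Memory("lint-rule", importance=0.4, expires_at=100.0),
]
touch(store[2], boost=0.5)  # frequent access lifts "n+1-fix" above the others
kept = evict(store, now=50.0, capacity=2)
```

Here `old-branch` is dropped by TTL despite its high importance, and `lint-rule` is evicted because it ranks lowest once capacity is reached.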
Memory pipeline with 4-tier consolidation, inspired by how human brains process memory.
Raw observations from tool use. Captured in real-time as your agent works.
Compressed session summaries. "What happened" during each session.
Extracted facts and patterns. "What I know" about your project and stack.
Workflows and decision patterns. "How to do it" — captured from your agent's behavior.
12 lifecycle hooks covering every moment of your agent's session.
| Hook | Captures |
|---|---|
| SessionStart | Project path, session ID |
| UserPromptSubmit | User prompts (privacy-filtered) |
| PreToolUse | File access patterns + enriched context |
| PostToolUse | Tool name, input, output |
| PostToolUseFailure | Error context |
| PreCompact | Re-injects memory before compaction |
| SubagentStart/Stop | Sub-agent lifecycle |
| Stop / SessionEnd | End-of-session summary and completion marker |
LongMemEval-S (ICLR 2025, 500 questions). 95.2% retrieval accuracy with local embeddings — free and offline.
| Retrieval Accuracy | R@5 | R@10 | MRR |
|---|---|---|---|
| agentmemory | 95.2% | 98.6% | 88.2% |
| BM25-only fallback | 86.2% | 94.6% | 71.5% |
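For readers unfamiliar with the columns above, here is a minimal sketch of how R@k and MRR are commonly computed. It assumes the "at least one relevant item in the top k" definition of recall; the benchmark's exact scoring rules may differ, and the function names are illustrative.

```python
def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    """1.0 if any relevant item appears in the top k results, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0


def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none was retrieved."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(runs: list[tuple[list[str], set[str]]], k: int) -> tuple[float, float]:
    """Average both metrics over (ranked results, relevant set) pairs."""
    n = len(runs)
    mean_recall = sum(recall_at_k(ranked, rel, k) for ranked, rel in runs) / n
    mrr = sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / n
    return mean_recall, mrr
```

R@5 rewards any hit in the first five results, while MRR also rewards placing that hit near the top, which is why the two columns can diverge.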
| Token Savings: Approach | Tokens/Yr | Cost/Yr |
|---|---|---|
| Paste full context | 19.5M+ | Impossible |
| LLM-summarized | ~650K | ~$500 |
| agentmemory | ~170K | ~$10 |
| agentmemory + local embeddings | ~170K | $0 |
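The savings implied by these figures can be checked with simple arithmetic. The sketch below uses only numbers quoted on this page (including the ~1,900 and 22K+ per-session figures from the comparison section); the `reduction` helper is illustrative, and the page does not state which baseline the headline 92% figure uses.

```python
def reduction(baseline: float, actual: float) -> float:
    """Fractional savings of `actual` relative to `baseline`."""
    return 1.0 - actual / baseline


# Approximate figures quoted on this page.
AGENTMEMORY_PER_YEAR = 170_000
LLM_SUMMARIZED_PER_YEAR = 650_000
FULL_CONTEXT_PER_YEAR = 19_500_000
AGENTMEMORY_PER_SESSION = 1_900
BUILT_IN_PER_SESSION = 22_000  # quoted as "22K+", so treat this as a lower bound

print(f"vs full context:    {reduction(FULL_CONTEXT_PER_YEAR, AGENTMEMORY_PER_YEAR):.1%}")
print(f"vs LLM-summarized:  {reduction(LLM_SUMMARIZED_PER_YEAR, AGENTMEMORY_PER_YEAR):.1%}")
print(f"vs built-in memory: {reduction(BUILT_IN_PER_SESSION, AGENTMEMORY_PER_SESSION):.1%}")
print(f"implied sessions/yr: {AGENTMEMORY_PER_YEAR / AGENTMEMORY_PER_SESSION:.0f}")
```

Because the built-in figure is a lower bound, the per-session saving against built-in memory is at least the value printed.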
agentmemory vs mem0, Letta, and built-in memory. 95.2% retrieval R@5, zero external deps, multi-agent support.
| Feature | agentmemory | mem0 | Letta / MemGPT | Built-in (CLAUDE.md) |
|---|---|---|---|---|
| Type | Memory engine + MCP server | Memory layer API | Full agent runtime | Static file |
| Retrieval R@5 | 95.2% | 68.5% | 83.2% | N/A (grep) |
| Auto-capture | 12 hooks (zero manual effort) | Manual add() calls | Agent self-edits | Manual editing |
| Search | BM25 + Vector + Graph (RRF) | Vector + Graph | Vector (archival) | Loads everything |
| Multi-agent | MCP + REST + leases | API (no coordination) | Within Letta only | Per-agent files |
| External deps | None (SQLite + iii) | Qdrant / pgvector | Postgres + vector DB | None |
| Token efficiency | ~1,900 tokens/session ($10/yr) | Varies | Core memory in context | 22K+ at 240 obs |
| Real-time viewer | Yes (port 3113) | Cloud dashboard | Cloud dashboard | No |
| Self-hosted | Yes (default) | Optional | Optional | Yes |
Three signals fused with Reciprocal Rank Fusion. BM25, vector embeddings, and knowledge graph traversal working together.
Stemmed keyword matching with synonym expansion. Always on. Supports Greek, Cyrillic, Hebrew, Arabic, and CJK.
Cosine similarity over dense embeddings. 6 providers including local (free, offline). Switch or combine providers.
Graph traversal via entity matching. Understand relationships between concepts, code, and decisions.
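Fusing the three signals with Reciprocal Rank Fusion can be sketched as follows. This is the standard RRF formula with the conventional constant k = 60; the value agentmemory uses, the `rrf_fuse` name, and the sample memory IDs are assumptions for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical per-signal results for the query "database performance".
bm25 = ["jwt-setup", "n+1-fix", "index-added"]
vector = ["n+1-fix", "index-added", "jwt-setup"]
graph = ["n+1-fix", "orm-config"]

fused = rrf_fuse([bm25, vector, graph])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

RRF only looks at ranks, not raw scores, so BM25 scores, cosine similarities, and graph hits can be combined without normalizing them onto a common scale. In this example `n+1-fix` comes out on top because all three signals rank it highly, even though BM25 alone preferred a different result.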
The most comprehensive MCP memory toolkit for any agent. 53 tools, 6 resources, 3 prompts, and 4 skills.
Get agentmemory running in 30 seconds. One command, zero dependencies.
Install agentmemory once. The bare agentmemory command works everywhere.
Starts the memory server on port 3111. Auto-starts the iii engine and real-time viewer.
Seeds 3 realistic sessions and runs semantic searches. See it find "N+1 query fix" when you search "database performance".
Connect agentmemory to Claude Code, Cursor, Gemini CLI, or any MCP client.
Watch the memory build live. Session explorer, memory browser, knowledge graph visualization.
One MCP config block works across Cursor, Claude Desktop, Cline, Roo Code, Windsurf, Gemini CLI, and more.
The fastest answers to the questions people ask first.
agentmemory is persistent memory for AI coding agents. It captures what your agent does in the background, compresses it into searchable memory, and injects the right context when the next session starts, so you no longer re-explain your stack.
Claude Code, Cursor, Gemini CLI, Codex CLI, Hermes, OpenClaw, pi, OpenCode, Cline, Goose, Aider, Claude Desktop, Windsurf, Roo Code, Kilo Code, OpenHuman, and any MCP client. All agents share the same memory server.
Built-in memory (CLAUDE.md, Cursor notepads) caps at 200 lines and loads everything into context. agentmemory has unlimited scale, uses BM25 + vector + graph search with top-K retrieval, costs ~1,900 tokens per session instead of 22K+, and works across agents.
No. agentmemory uses SQLite + the iii engine. Zero external dependencies. No Postgres, no Redis, no Qdrant. Self-hosted by default.
With local embeddings (all-MiniLM-L6-v2), embedding is free and fully offline, so the total cost is $0/year. With cloud embeddings, the total is approximately $10/year.
Run `npm install -g @agentmemory/agentmemory`, then `agentmemory` to start the server. Or use `npx @agentmemory/agentmemory` for a no-install option.