
mcp memory service

Built by doobidoo β€’ 1,558 stars

What is mcp memory service?

Open-source persistent memory for AI agent pipelines (LangGraph, CrewAI, AutoGen) and Claude. REST API + knowledge graph + autonomous consolidation.

How to use mcp memory service?

1. Install a compatible MCP client (like Claude Desktop).
2. Open your configuration settings.
3. Add mcp memory service with the following command: npx @modelcontextprotocol/mcp-memory-service
4. Restart the client and verify the new tools are active.
πŸ›‘οΈ Scoped (Restricted)
npx @modelcontextprotocol/mcp-memory-service --scope restricted
πŸ”“ Unrestricted Access
npx @modelcontextprotocol/mcp-memory-service

Key Features

Native MCP Protocol Support
Real-time Tool Activation & Execution
Verified High-performance Implementation
Secure Resource & Context Handling

Optimized Use Cases

Extending AI models with custom local capabilities
Automating system workflows via natural language
Connecting external data sources to LLM context windows

mcp memory service FAQ

Q: Is mcp memory service safe?

Yes, mcp memory service follows the standardized Model Context Protocol security patterns and only executes tools with explicit user-granted permissions.

Q: Is mcp memory service up to date?

mcp memory service is actively listed in the registry and has 1,558 stars on GitHub, which reflects ongoing maintenance and community support.

Q: Are there any limits for mcp memory service?

Usage limits depend on the specific implementation of the MCP server and your system resources. Refer to the official documentation below for technical details.

Official Documentation

View on GitHub

mcp-memory-service

Persistent Shared Memory for AI Agent Pipelines

Open-source memory backend for AI agents β€” REST API, MCP, OAuth, CLI, dashboard. One self-hosted service, every transport. Agents store decisions, share causal knowledge graphs, and retrieve context in 5ms β€” without cloud lock-in or API costs.

Works with LangGraph Β· CrewAI Β· AutoGen Β· any HTTP client Β· Claude Desktop Β· OpenCode




🎬 See It in Action

Watch the Dashboard Walkthrough

Watch the Web Dashboard Walkthrough on YouTube β€” Semantic search, tag browser, document ingestion, analytics, quality scoring, and API docs in under 2 minutes.


🌐 Works with claude.ai (Browser)

Unlike desktop-only MCP servers, mcp-memory-service supports Remote MCP for native claude.ai integration.

What this means:

  • βœ… Use persistent memory directly in your browser (no Claude Desktop required)
  • βœ… Works on any device (laptop, tablet, phone)
  • βœ… Enterprise-ready (OAuth 2.0 + HTTPS + CORS)
  • βœ… Self-hosted OR cloud-hosted (your choice)

5-Minute Setup:

# 1. Start server with Remote MCP enabled
MCP_STREAMABLE_HTTP_MODE=1 \
MCP_SSE_HOST=0.0.0.0 \
MCP_SSE_PORT=8765 \
MCP_OAUTH_ENABLED=true \
python -m mcp_memory_service.server

# 2. Expose via Cloudflare Tunnel (or your own HTTPS setup)
cloudflared tunnel --url http://localhost:8765
# β†’ Outputs: https://random-name.trycloudflare.com

# 3. In claude.ai: Settings β†’ Connectors β†’ Add Connector
# Paste the URL: https://random-name.trycloudflare.com/mcp
# OAuth flow will handle authentication automatically

Production Setup: See the Remote MCP Setup Guide for Let's Encrypt, nginx, and firewall configuration.
Step-by-Step Tutorial: Blog: 5-Minute claude.ai Setup | Wiki Guide


Why Agents Need This

| Without mcp-memory-service | With mcp-memory-service |
| --- | --- |
| Each agent run starts from zero | Agents retrieve prior decisions in 5ms |
| Memory is local to one graph/run | Memory is shared across all agents and runs |
| You manage Redis + Pinecone + glue code | One self-hosted service, zero cloud cost |
| No causal relationships between facts | Knowledge graph with typed edges (causes, fixes, contradicts) |
| Context window limits create amnesia | Autonomous consolidation compresses old memories |

Key capabilities for agent pipelines:

  • Framework-agnostic REST API β€” 76 endpoints, no MCP client library needed
  • Knowledge graph β€” agents share causal chains, not just facts
  • X-Agent-ID header β€” auto-tag memories by agent identity for scoped retrieval
  • conversation_id β€” bypass deduplication for incremental conversation storage
  • SSE events β€” real-time notifications when any agent stores or deletes a memory
  • Embeddings run locally via ONNX β€” memory never leaves your infrastructure

Agent Quick Start

pip install mcp-memory-service
MCP_ALLOW_ANONYMOUS_ACCESS=true memory server --http
# REST API running at http://localhost:8000
import httpx

BASE_URL = "http://localhost:8000"

# Store β€” auto-tag with X-Agent-ID header
async with httpx.AsyncClient() as client:
    await client.post(f"{BASE_URL}/api/memories", json={
        "content": "API rate limit is 100 req/min",
        "tags": ["api", "limits"],
    }, headers={"X-Agent-ID": "researcher"})
    # Stored with tags: ["api", "limits", "agent:researcher"]

    # Search β€” scope to a specific agent
    results = await client.post(f"{BASE_URL}/api/memories/search", json={
        "query": "API rate limits",
        "tags": ["agent:researcher"],
    })
    print(results.json()["memories"])
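Building on the quick start above, a minimal sketch of incremental conversation storage via conversation_id (the field name follows the capability list; this is a sketch, so confirm the exact request shape against the API docs):

```python
def turn_store_request(turn_text, conversation_id, agent_id):
    """Build the JSON body and headers for storing one conversation turn.

    Supplying conversation_id asks the server to skip semantic
    deduplication within that conversation, so consecutive
    near-identical turns are all kept.
    """
    body = {
        "content": turn_text,
        "tags": ["conversation"],
        "conversation_id": conversation_id,
    }
    headers = {"X-Agent-ID": agent_id}
    return body, headers

# Usage with the httpx client from the quick start:
#   body, headers = turn_store_request("user: what's our stack?", "conv-42", "planner")
#   await client.post(f"{BASE_URL}/api/memories", json=body, headers=headers)
```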

Framework-specific guides: docs/agents/

Real-World: Multi-Agent Cluster with Shared Memory

"After I work with one of the cluster agents on something I want my local agent to know about, the cluster agent adds a special tag to the memory entry that my local agent recognizes as a message from a cluster agent. So they end up using it as a comms bridge β€” and it's pretty delightful." β€” @jeremykoerber, issue #591

A 5-agent openclaw cluster uses mcp-memory-service as shared state and as an inter-agent messaging bus β€” without any custom protocol. Cluster agents tag memories with a sentinel like msg:cluster, and the local agent filters on that tag to receive cross-cluster signals. The memory service becomes the coordination layer with zero additional infrastructure.

# Cluster agent stores a learning and flags it for the local agent
await client.post(f"{BASE_URL}/api/memories", json={
    "content": "Rate limit on provider X is 50 RPM β€” switch to provider Y after 40",
    "tags": ["api", "limits", "msg:cluster"],       # sentinel tag
}, headers={"X-Agent-ID": "cluster-agent-3"})

# Local agent polls for cluster messages
results = await client.post(f"{BASE_URL}/api/memories/search", json={
    "query": "messages from cluster",
    "tags": ["msg:cluster"],
})

This pattern β€” tags as inter-agent signals β€” emerges naturally from the tagging system and requires no additional infrastructure.

Real-World: Self-Hosted Docker Stack with Cloudflare Tunnel

"The quality of life that session-independent memory adds to AI workflows is immense. File-based memory demands constant discipline. Semantic recall from a live database doesn't. Storing data on my own hardware while making it remotely accessible across platforms turned out to be a feature I didn't know I needed." β€” @PL-Peter, discussion #602

A production-tested self-hosted deployment using Docker containers behind a Cloudflare tunnel, with AuthMCP Gateway handling authentication:

| Layer | Role |
| --- | --- |
| Cloudflare Tunnel | Name-based routing, subnet-based access control, authentication before hitting self-hosted resources |
| AuthMCP Gateway | Auth/aggregation with locally managed users, admin UI, per-user MCP server access control, bearer token auth |
| mcp-memory-service | Two Docker containers sharing one SQLite backend β€” one for MCP, one for the web UI (document ingestion) |

Security best practices for this setup:

  • Use Cloudflare ZeroTrust with subnet-based access control (e.g., allow Anthropic subnets + your own IPs)
  • Add Client IP Address Filtering to all Cloudflare API tokens (Dashboard β†’ My Profile β†’ API Tokens β†’ Edit β†’ Client IP Address Filtering) to limit abuse if a token leaks
  • If using IPv6, include your IPv6 /64 network in the allowlist (Python prefers IPv6 by default)
  • For long-running browser sessions, request the offline_access scope during authorization to receive a rotating refresh_token (lifetime via MCP_OAUTH_REFRESH_TOKEN_EXPIRE_DAYS, default 30 days). Without this scope, access tokens are the only credential β€” extend MCP_OAUTH_ACCESS_TOKEN_EXPIRE_MINUTES up to 1440 (24h) if you need longer single-shot sessions.
  • Consider an auth proxy like AuthMCP or mcp-auth-proxy for robust session management

Comparison with Alternatives

vs. Commercial Memory APIs

| | Mem0 | Zep | DIY Redis+Pinecone | mcp-memory-service |
| --- | --- | --- | --- | --- |
| License | Proprietary | Enterprise | β€” | Apache 2.0 |
| Cost | Per-call API | Enterprise | Infra costs | $0 |
| 🌐 claude.ai Browser | ❌ Desktop only | ❌ Desktop only | ❌ | βœ… Remote MCP |
| OAuth 2.0 + DCR | ❓ Unknown | ❓ Unknown | ❌ | βœ… Enterprise-ready |
| Streamable HTTP | ❌ | ❌ | ❌ | βœ… (SSE also supported) |
| Framework integration | SDK | SDK | Manual | REST API (any HTTP client) |
| Knowledge graph | No | Limited | No | Yes (typed edges) |
| Auto consolidation | No | No | No | Yes (decay + compression) |
| On-premise embeddings | No | No | Manual | Yes (ONNX, local) |
| Privacy | Cloud | Cloud | Partial | 100% local |
| Hybrid search | No | Yes | Manual | Yes (BM25 + vector) |
| MCP protocol | No | No | No | Yes |
| REST API | Yes | Yes | Manual | Yes (76 endpoints) |

vs. MCP-Native Alternatives

MemPalace is an MCP-native alternative that went viral in April 2026 with strong LongMemEval claims. A community code review (Issue #27) subsequently showed that the headline numbers reflect the underlying vector store rather than the advertised Palace architecture, and the maintainers acknowledged most points. We keep the comparison here for transparency, but readers should interpret the scores with that context in mind.

| | MemPalace | mcp-memory-service |
| --- | --- | --- |
| LongMemEval R@5 (raw ChromaDB, zero LLM) | 96.6%ΒΉ | 86.0% (session) / 80.4% (turn) |
| LongMemEval R@5 (with reranking) | 100%Β² | β€” |
| Storage granularity | Session-level | Turn-level + session-level |
| Team / multi-device sync | ❌ Local only | βœ… Cloudflare sync |
| REST API / Web dashboard | ❌ | βœ… |
| OAuth 2.1 + multi-user | ❌ | βœ… |
| Knowledge graph | ❌ | βœ… (typed edges) |
| Auto consolidation | ❌ | βœ… (decay + compression) |
| Compatible AI tools | Claude-focused | 25+ tools |
| License | MIT | Apache 2.0 |

Why the benchmark gap? Two independent factors:

  1. Ingestion granularity. MemPalace stores each conversation as a single unit (session-level). LongMemEval asks "which session contains the answer?" β€” a question that session-level storage answers structurally. mcp-memory-service defaults to turn-level storage (one entry per message), which enables fine-grained retrieval ("what exactly did the user say about X?") but spreads a session's signal across many entries. Using memory_store_session (added in v10.35.0) brings our score to 86.0% R@5.
  2. What the 96.6% actually measures. Per Issue #27, MemPalace's headline number is produced in "raw mode" β€” plain text stored in ChromaDB with default embeddings. The Palace architecture (Wings, Rooms, Halls) is not active in that configuration; "Halls" exist only as metadata strings with no effect on ranking. The 96.6% is therefore a ChromaDB + default-embedding baseline, not a measurement of MemPalace's structural retrieval features. A direct "apples-to-apples" architectural comparison is not possible with the published numbers.

ΒΉ Measured in MemPalace "raw mode" (plain text in ChromaDB with default embeddings). Per Issue #27, the Palace structural features are bypassed in this configuration.

Β² 100% result uses optional LLM reranking (~500 API calls) on a partially tuned test set. Clean held-out score (as reported by the maintainers): 98.4% R@5.
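The granularity difference in point 1 can be illustrated with a small sketch (illustrative only; this is not the project's actual ingestion code):

```python
def session_to_entries(session, granularity="turn"):
    """Show the two ingestion granularities discussed above.

    session: list of {"role": ..., "text": ...} turns.
    """
    if granularity == "session":
        # One entry per session: answers "which session contains the
        # answer?" structurally, but blurs what exactly was said.
        return ["\n".join(f'{t["role"]}: {t["text"]}' for t in session)]
    # One entry per turn (this project's default): fine-grained
    # retrieval, but a session's signal is spread across many entries.
    return [f'{t["role"]}: {t["text"]}' for t in session]
```

Storing both representations, as memory_store_session enables, lets a retriever answer both kinds of question.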


Stop Re-Explaining Your Project to AI Every Session

<p align="center"> <img width="240" alt="MCP Memory Service" src="https://github.com/user-attachments/assets/eab1f341-ca54-445c-905e-273cd9e89555" /> </p>

Your AI assistant forgets everything when you start a new chat. After 50 tool uses, context explodes to 500k+ tokensβ€”Claude slows down, you restart, and now it remembers nothing. You spend 10 minutes re-explaining your architecture. Again.

MCP Memory Service solves this.

It automatically captures your project context, architecture decisions, and code patterns. When you start fresh sessions, your AI already knows everythingβ€”no re-explaining, no context loss, no wasted time.

πŸŽ₯ 2-Minute Video Demo

<div align="center"> <a href="https://www.youtube.com/watch?v=veJME5qVu-A"> <img src="https://img.youtube.com/vi/veJME5qVu-A/maxresdefault.jpg" alt="MCP Memory Service Demo" width="700"> </a> <p><em>Technical showcase: Performance, Architecture, AI/ML Intelligence & Developer Experience</em></p> </div>

⚑ Works With Your Favorite AI Tools

πŸ€– Agent Frameworks (REST API)

LangGraph Β· CrewAI Β· AutoGen Β· Any HTTP Client Β· OpenClaw/Nanobot Β· Custom Pipelines

πŸ–₯️ CLI & Terminal AI (MCP)

Claude Code Β· Gemini CLI Β· Gemini Code Assist Β· OpenCode Β· Codex CLI Β· Goose Β· Aider Β· GitHub Copilot CLI Β· Amp Β· Continue Β· Zed Β· Cody

🎨 Desktop & IDE (MCP)

Claude Desktop Β· VS Code Β· Cursor Β· Windsurf Β· Kilo Code Β· Raycast Β· JetBrains Β· Replit Β· Sourcegraph Β· Qodo

πŸ’¬ Chat Interfaces (MCP)

ChatGPT (Developer Mode) Β· claude.ai (Remote MCP via HTTPS)

Works seamlessly with any MCP-compatible client or HTTP client, whether you're building agent pipelines or coding in the terminal, an IDE, or the browser.

πŸ’‘ NEW: ChatGPT now supports MCP! Enable Developer Mode to connect your memory service directly. See setup guide β†’


πŸš€ Get Started in 60 Seconds

Not sure which setup fits your needs? See the Setup Guide β€” a decision tree walks you to the right path in under a minute.

1. Install:

pip install mcp-memory-service

2. Configure your AI client:

<details open> <summary><strong>Claude Desktop</strong></summary>

Add to your config file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "memory": {
      "command": "memory",
      "args": ["server"]
    }
  }
}

Restart Claude Desktop. Your AI now remembers everything across sessions.

</details> <details> <summary><strong>Claude Code</strong></summary>
claude mcp add memory -- memory server

Restart Claude Code. Memory tools will appear automatically.

</details> <details> <summary><strong>OpenCode</strong></summary>

Start the HTTP API:

MCP_ALLOW_ANONYMOUS_ACCESS=true memory server --http

Install the local plugin:

git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service
mkdir -p ~/.config/opencode/plugins
cp opencode/memory-plugin.js ~/.config/opencode/plugins/
cp opencode/memory-plugin.config.example.json ~/.config/opencode/memory-plugin.json

OpenCode automatically loads local plugins from ~/.config/opencode/plugins/ and .opencode/plugins/.

See OpenCode integration guide for configuration, project-local installs, and current limitations.

The OpenCode integration currently ships as files in this repository that you copy into the local plugin directory. If you installed only the PyPI package, clone the repository once to copy the plugin files.

The plugin defaults to http://127.0.0.1:8000, but memoryService.endpoint and OPENCODE_MEMORY_ENDPOINT let you target any reachable HTTP deployment.
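A minimal sketch of what that override might look like in ~/.config/opencode/memory-plugin.json (only the memoryService.endpoint key is named above; see the shipped memory-plugin.config.example.json for the full schema):

```json
{
  "memoryService": {
    "endpoint": "http://127.0.0.1:8000"
  }
}
```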

</details> <details> <summary><strong>🌐 claude.ai (Browser β€” Remote MCP)</strong></summary>

No local installation required on the client β€” works directly in your browser:

# 1. Start server with Remote MCP
MCP_STREAMABLE_HTTP_MODE=1 \
MCP_SSE_HOST=0.0.0.0 \
MCP_SSE_PORT=8765 \
MCP_OAUTH_ENABLED=true \
python -m mcp_memory_service.server

# 2. Expose publicly (Cloudflare Tunnel)
cloudflared tunnel --url http://localhost:8765

# 3. Add connector in claude.ai Settings β†’ Connectors with the tunnel URL

See Remote MCP Setup Guide for production deployment with Let's Encrypt, nginx, and Docker.

</details> <details> <summary><strong>πŸ”§ Advanced: Custom Backends & Team Setup</strong></summary>

For production deployments, team collaboration, or cloud sync:

git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service
python scripts/installation/install.py

Choose from:

  • SQLite (local, fast, single-user)
  • Cloudflare (cloud, multi-device sync)
  • Hybrid (best of both: 5ms local + background cloud sync)
  • Milvus (dedicated vector DB β€” Milvus Lite file, self-hosted, or Zilliz Cloud)

ℹ️ For long-lived services (MCP servers, web backends, notebook sessions), prefer Docker Milvus or Zilliz Cloud over Milvus Lite. See docs/milvus-backend.md for why.

</details>

πŸ› οΈ CLI Server Lifecycle Commands

In addition to memory server --http (foreground mode), the CLI now includes server lifecycle commands for background HTTP management:

# Start HTTP server in background (default host=127.0.0.1, port=8000)
memory launch

# Start on a custom port
memory launch --port 8192

# Check status and health
memory info --port 8192
memory health --port 8192

# View recent logs and stop server
memory logs --lines 50
memory stop --port 8192

These commands are optimized for fast startup and avoid loading heavy ML dependencies unless needed.

⚠️ Security Note: By default, the server binds to 127.0.0.1 (localhost only). To allow remote access, use --host 0.0.0.0 or set MCP_HTTP_HOST=0.0.0.0. This exposes the API to your entire network, so do it only in trusted environments, and pair it with proper authentication and firewall rules in production. On untrusted networks, put the server behind TLS termination (a reverse proxy with HTTPS) or a VPN overlay.
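For the TLS-termination route, a minimal nginx sketch (the hostname and certificate paths are placeholders; adapt them to your deployment):

```nginx
# TLS-terminating reverse proxy in front of the memory API.
server {
    listen 443 ssl;
    server_name memory.example.com;

    ssl_certificate     /etc/letsencrypt/live/memory.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/memory.example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        # SSE needs buffering off so events stream through immediately.
        proxy_buffering off;
        proxy_read_timeout 3600s;
    }
}
```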


πŸ’‘ Why You Need This

The Problem

| Session 1 | Session 2 (Fresh Start) |
| --- | --- |
| You: "We're building a Next.js app with Prisma and tRPC" | AI: "What's your tech stack?" ❌ |
| AI: "Got it, I see you're using App Router" | You: Explains architecture again for 10 minutes 😀 |
| You: "Add authentication with NextAuth" | AI: "Should I use Pages Router or App Router?" ❌ |

The Solution

| Session 1 | Session 2 (Fresh Start) |
| --- | --- |
| You: "We're building a Next.js app with Prisma and tRPC" | AI: "I rememberβ€”Next.js App Router with Prisma and tRPC. What should we build?" βœ… |
| AI: "Got it, I see you're using App Router" | You: "Add OAuth login" |
| You: "Add authentication with NextAuth" | AI: "I'll integrate NextAuth with your existing Prisma setup." βœ… |

Result: Zero re-explaining. Zero context loss. Just continuous, intelligent collaboration.


🌐 SHODH Ecosystem Compatibility

MCP Memory Service is fully compatible with the SHODH Unified Memory API Specification v1.0.0, enabling seamless interoperability across the SHODH ecosystem.

Compatible Implementations

| Implementation | Backend | Embeddings | Use Case |
| --- | --- | --- | --- |
| shodh-memory | RocksDB | MiniLM-L6-v2 (ONNX) | Reference implementation |
| shodh-cloudflare | Cloudflare Workers + Vectorize | Workers AI (bge-small) | Edge deployment, multi-device sync |
| mcp-memory-service (this) | SQLite-vec / Hybrid | MiniLM-L6-v2 (ONNX) | Desktop AI assistants (MCP) |

Unified Schema Support

All SHODH implementations share the same memory schema:

  • βœ… Emotional Metadata: emotion, emotional_valence, emotional_arousal
  • βœ… Episodic Memory: episode_id, sequence_number, preceding_memory_id
  • βœ… Source Tracking: source_type, credibility
  • βœ… Quality Scoring: quality_score, access_count, last_accessed_at

Interoperability Example: Export memories from mcp-memory-service β†’ Import to shodh-cloudflare β†’ Sync across devices β†’ Full fidelity preservation of emotional_valence, episode_id, and all spec fields.
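A hedged sketch of the export half of that flow over the REST API (the listing endpoint, pagination parameter, and response shape are assumptions; confirm them against the API docs):

```python
import json
import urllib.request

def to_shodh_record(memory):
    """Project a memory onto the shared SHODH v1.0.0 schema fields.

    Optional spec fields default to None (and tags to []) so the
    export stays robust when a memory omits them.
    """
    keep = ["emotion", "emotional_valence", "emotional_arousal",
            "episode_id", "sequence_number", "preceding_memory_id",
            "source_type", "credibility", "quality_score"]
    record = {"content": memory["content"], "tags": memory.get("tags", [])}
    record.update({k: memory.get(k) for k in keep})
    return record

def export_memories(base_url="http://localhost:8000"):
    # Listing endpoint and page_size parameter are assumptions;
    # confirm against the REST API docs of your deployment.
    with urllib.request.urlopen(f"{base_url}/api/memories?page_size=100") as resp:
        payload = json.load(resp)
    return [to_shodh_record(m) for m in payload["memories"]]
```

The resulting records carry the spec fields unchanged, so an importer on the shodh-cloudflare side can ingest them with full fidelity.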


✨ Quick Start Features

  β€’ 🧠 Persistent Memory – Context survives across sessions with semantic search
  β€’ πŸ” Smart Retrieval – Finds relevant context automatically using AI embeddings
  β€’ ⚑ 5ms Speed – Instant context injection, no latency
  β€’ πŸ”„ Multi-Client – Works across 25+ AI applications
  β€’ ☁️ Cloud Sync – Optional Cloudflare backend for team collaboration
  β€’ πŸ”’ Privacy-First – Local-first, you control your data
  β€’ πŸ“Š Web Dashboard – Visualize and manage memories at http://localhost:8000
  β€’ 🧬 Knowledge Graph – Interactive D3.js visualization of memory relationships
  β€’ 🏠 Homelab Quality Scoring – Point scoring at any OpenAI-compatible endpoint (Ollama, LiteLLM, vLLM)
  β€’ πŸ”— Entity Extraction – Auto-links @mentions, #tags, URLs, and file paths from memory content to a queryable entity graph
  β€’ πŸ’‘ Insight Cards – Consolidation detects patterns, trends, and knowledge gaps across your memory corpus and surfaces them as structured insights
  β€’ 🏷️ Tag Match Filtering – tag_match=AND/OR on memory_search for precise multi-tag queries

Homelab / self-hosted quality scoring (v10.45.0+): set MCP_QUALITY_AI_PROVIDER=openai-compatible to score memories with your local LLM instead of ONNX or a cloud API:

MCP_QUALITY_AI_PROVIDER=openai-compatible
MCP_QUALITY_AI_BASE_URL=http://localhost:11434/v1   # Ollama
MCP_QUALITY_AI_MODEL=qwen2.5:7b-instruct
# MCP_QUALITY_AI_API_KEY=ollama                     # optional

Recommended models: qwen2.5:7b-instruct (Ollama), mlx-community/Qwen2.5-7B-Instruct-4bit (MLX), or any instruct model via LiteLLM proxy. On endpoint failure, scoring falls back to implicit signals automatically.

Docker :quality-cpu tag β€” for users who want the built-in local ONNX quality scoring (ms-marco-MiniLM-L-6-v2 and nvidia-quality-classifier-deberta) without managing the one-time ONNX export themselves, and without shipping torch/transformers in their container:

docker pull doobidoo/mcp-memory-service:quality-cpu

The :quality-cpu image pre-exports both models at build time and ships only onnxruntime at runtime β€” no PyTorch dependency at deploy time. See tools/docker/README.md for details.

πŸ–₯️ Dashboard Preview

<p align="center"> <img src="https://raw.githubusercontent.com/wiki/doobidoo/mcp-memory-service/images/dashboard/mcp-memory-dashboard-v9.3.0-tour.gif" alt="MCP Memory Dashboard Tour" width="800"/> </p>

8 Dashboard Tabs: Dashboard β€’ Search β€’ Browse β€’ Documents β€’ Manage β€’ Analytics β€’ Quality β€’ API Docs

πŸ“– See Web Dashboard Guide for complete documentation.


Latest Release: v10.59.2 (May 17, 2026)

OAuth redirect_uri AnyUrl fix β€” IDE schemes now actually work

What's New:

  • fix(oauth): redirect_uri fields in AuthorizationRequest and TokenRequest changed from Optional[HttpUrl] to Optional[AnyUrl] β€” HttpUrl silently rejected cursor://, vscode://, vscode-insiders:// before reaching the scheme whitelist, making the v10.59.0 IDE scheme feature a no-op in practice. 8 regression tests added (#942, reported by @tkislan).

Previous Releases:

  • v10.59.1 - fix(oauth): reflect state parameter verbatim per RFC 6749 Β§4.1.2, fixes Cursor OAuth (#944, @tkislan)
  • v10.59.0 - feat(oauth): PEM key files + IDE redirect URI schemes; fix(hooks): symmetric project-affinity (PRs #926, #942, #941)
  • v10.58.0 - feat(insights): configurable exclusion, automated-type heuristic, acknowledgement flow (PR #939); feat(harvest): locale YAML plugins (PR #935, @filhocf); feat(plugin): smart-tagger example (PR #932, @filhocf)
  • v10.57.3 - feat(milvus): last_accessed tracking via _access side-collection (PR #925, @henry201605)
  • v10.57.2 - fix(deps): pin pymilvus<3.0.0 to restore Milvus Docker CI (PR #921)
  • v10.57.1 - fix(sqlite): LIKE ESCAPE tag matching + fix(milvus): preserve_timestamps value comparison (PRs #916, #918)
  • v10.57.0 - feat(memory_list): tag_match AND/OR filtering + feat(session): automatic chunking at turn boundaries (PRs #904, #912, @filhocf)
  • v10.56.3 - feat(milvus): get_memory_connections() via graph collection + fix(quality): MAINTAIN_SCAN_LIMIT fallback hardening
  • v10.56.2 - fix(milvus): missing stale_days param in count_all_memories + fix(quality): graceful MAINTAIN_SCAN_LIMIT fallback
  • v10.56.1 - fix(session): pass session_id as conversation_id to bypass semantic dedup
  • v10.56.0 - feat(consolidation): configurable maintain scan limit + InsightGenerator gap filter
  • v10.55.2 - fix(insights): handle None memory_type and tags in InsightGenerator sort
  • v10.55.1 - fix(entities): entity links always 0 in maintain Step 5 due to wrong graph accessor (PR #895)
  • v10.55.0 - feat(reasoning+consolidation): entity extraction, memory-entity linking, and Insight Cards (PRs #868, #869, @filhocf)
  • v10.54.0 - feat(search): tag_match parameter for memory_search AND/OR tag filtering (PR #890, @filhocf)
  • v10.53.0 - feat(milvus): activate consolidation embedding hydration end-to-end; security: GitPython 3.1.50 (PRs #885, #886, @henry201605)
  • v10.52.0 - feat(search): cascading fallback when semantic results are sparse; refactor(storage): include_embeddings on bulk-read ABC methods (PRs #883, #881, @filhocf, @henry201605)
  • v10.51.3 - feat(memory_update): versioned flag; feat(memory_graph): infer_transitive and suggest_relationships (PRs #865, #866, @filhocf)
  • v10.51.2 - fix(oauth): CORS preflight failures and missing resource_metadata; refactor(milvus): opt-in embedding hydration on read paths (PRs #877, #878)
  • v10.51.1 - fix(milvus): add delete_memory proxy for consolidation protocol (PR #872, @henry201605)
  • v10.51.0 - feat(plugins): live plugin hooks + dynamic type dropdowns + audit-log example (PRs #863, #864, #867, @filhocf)
  • v10.50.0 - feat(plugins): plugin hook scaffolding β€” on_store, on_delete, on_retrieve, on_consolidate (PR #856, @filhocf)
  • v10.49.4 - fix(consolidation): protect high-value mistake notes from decay/forgetting (PR #854, @filhocf)
  • v10.49.3 - fix(opencode): correct API path, payload field, and client-side tag filter (PRs #849, #850)
  • v10.49.2 - fix(ontology): register custom base types with empty subtype lists (PR #846)
  • v10.49.1 - fix: surface memory_type ontology coercion warnings + uvx CI flake fix (PR #844)
  • v10.49.0 - feat(cli): lazy lifecycle commands and faster startup (PR #841, @creativelaides)
  • v10.48.0 - feat: include_superseded retrieval filter + auto-mark on contradiction (PR #814, @filhocf)
  • v10.47.2 - fix(consolidation): disable-by-default schedule prevents unintended automatic consolidation (PR #821, closes #808)
  • v10.47.1 - fix(web): surface /server/update failures end-to-end (PR #807, closes #729)
  • v10.47.0 - feat: memory_quality maintain orchestrator + Docker DeBERTa quantization (PRs #802, #803, @filhocf, closes #799, #793)
  • v10.46.0 - feat: stale_days filter for memory_list β€” dormant memory detection (PR #796, @filhocf, closes #784)
  • v10.45.1 - fix: CodeQL redundant import cleanup + soft-delete regression tests (PRs #794, #795, @filhocf)
  • v10.45.0 - feat(quality): OpenAI-compatible provider for LiteLLM/Ollama/MLX + soft-delete UPDATE guards (PRs #790, #783, @filhocf)
  • v10.44.0 - feat: Mistake Notes β€” structured error replay (mistake_note_add, mistake_note_search, PR #786, @filhocf)
  • v10.43.0 - feat(search): Reciprocal Rank Fusion (RRF) for SQLite-vec hybrid search (PR #773, @filhocf)
  • v10.42.1 - fix(milvus): add missing anns_field to search calls for BM25-enabled collections (PR #775, @henry201605)
  • v10.42.0 - feat(milvus): MilvusGraphStorage, BM25 hybrid search, and consolidation integration (PR #762, @henry201605)
  • v10.41.0 - feat(oauth): OAuth 2.1 refresh_token grant with rotation, memory_graph on streamable-http (PRs #766, #759)
  • v10.40.4 - fix(quality): handle shape (1, 1) cross-encoder logits in ONNX ranker (PR #765)
  • v10.40.3 - fix(claude-hooks): eliminate socket hang-up and raise hook timeout (PR #761)
  • v10.40.2 - fix(docker): correct invalid Python one-liner in ONNX pre-download (PR #757)
  • v10.40.1 - fix(sync): CF hybrid sync reliability + reporting accuracy (PRs #751, #753)
  • v10.40.0 - feat: Milvus storage backend (Lite / self-hosted / Zilliz Cloud), OAuth XSS hardening, plugin shape validation (PRs #721, #745, #740)
  • v10.39.1 - hotfix: plugin.json author field object format β€” unblocks /plugin install mcp-memory-service (#738, #739)
  • v10.39.0 - feat: Claude Code plugin install (/plugin marketplace add doobidoo/mcp-memory-service) + MemoryClient.storeMemory() protocol-native writes (PRs #736, #735)
  • v10.38.4 - fix(mcp): return HTTP 202 for JSON-RPC notifications β€” fixes Codex/strict-client handshake (PR #733)
  • v10.38.3 - fix: Server tab auto-check, list_memories total_pages, knowledge graph edge rendering (PRs #728, #731, #730)
  • v10.38.2 - fix(windows): PS 7+ cert bypass, per-call SkipCertificateCheck, chicken-egg lib sourcing (PR #723)
  • v10.38.1 - fix: OAuth loopback ports (RFC 8252), CLI ingestion NameError, SSE CLI flags, Docker CI bumps (PRs #697, #704, #705, #707-709)
  • v10.38.0 - feat: opt-in Claude Code SessionEnd auto-harvest hook β€” safe-by-default, zero npm deps, 5s timeout, TLS opt-in (PR #711, 1,547 tests)
  • v10.37.0 - feat: POST /api/harvest HTTP endpoint for Session Harvest + CodeQL path-injection hardening (PR #710, 1,547 tests)
  • v10.36.8 - fix: event-loop blocking paths in SqliteVecMemoryStorage.initialize() β€” pragma application and hash-embedding fallback now run in worker thread under _conn_lock (PR #700, 1,537 tests)
  • v10.36.7 - security: bump pygments to 2.20.0 (CVE-2026-4539/GHSA-5239-wwwm-4pmq) β€” ReDoS fix via rich transitive dep (PR #698, 1,537 tests)
  • v10.36.6 - security: bump cryptography to 46.0.7 (CVE-2026-39892) β€” buffer overflow fix in non-contiguous buffer handling (PR #690, 1,537 tests)
  • v10.36.5 - fix: Cloudflare Vectorize API v1 to v2 + test script fixes β€” fixed error 1010 "incorrect_api_version", content_hash arg, sys.path correction (PR #689, @mychaelgo, 1,537 tests)
  • v10.36.4 - fix(windows): hotfix for Get-McpApiKey returning first char instead of full API key β€” PowerShell array-enumeration trap fixed (PR #687, 1,537 tests)

Full version history: CHANGELOG.md | Older versions (v10.36.3 and earlier) | All Releases


πŸ“Š Retrieval Benchmarks

Three benchmarks measure retrieval quality (all-MiniLM-L6-v2, 384d embeddings, zero LLM API calls):

LongMemEval (500 questions, ~45–62 distractor sessions per question):

| Question Type | R@5 | R@10 | NDCG@10 | MRR |
| --- | --- | --- | --- | --- |
| Overall | 80.4% | 90.4% | 82.2% | 89.1% |
| single-session-assistant | 100.0% | 100.0% | 99.3% | 99.1% |
| knowledge-update | 84.6% | 96.8% | 86.2% | 95.5% |
| single-session-user | 91.4% | 92.9% | 86.0% | 83.8% |
| temporal-reasoning | 72.0% | 84.1% | 75.1% | 85.7% |
| multi-session | 70.7% | 86.0% | 77.6% | 89.4% |

DevBench (practical developer workflow queries):

| Category | Recall@5 | MRR |
| --- | --- | --- |
| Overall | 91.1% | 0.861 |
| exact | 100% | 1.000 |
| semantic | 80.0% | 0.700 |
| cross-type | 90.0% | 0.867 |

LoCoMo (ACL 2024 long-term conversational memory):

| Category | Recall@5 | MRR |
| --- | --- | --- |
| Overall | 49.7% | 0.414 |
| multi-hop | 72.0% | 0.600 |
| temporal | 33.5% | 0.274 |

Run benchmarks: python scripts/benchmarks/benchmark_longmemeval.py, python scripts/benchmarks/benchmark_devbench.py, python scripts/benchmarks/benchmark_locomo.py


<details> <summary><strong>Migration to v9.0.0</strong> (upgrading from v8.x)</summary>

⚑ TL;DR: No manual migration needed - upgrades happen automatically!

Breaking Changes:

  • Memory Type Ontology: Legacy types auto-migrate to new taxonomy (taskβ†’observation, noteβ†’observation)
  • Asymmetric Relationships: Directed edges only (no longer bidirectional)

Migration Process:

  1. Stop your MCP server
  2. Update to latest version (git pull or pip install --upgrade mcp-memory-service)
  3. Restart server - automatic migrations run on startup:
    • Database schema migrations (009, 010)
    • Memory type soft-validation (legacy types β†’ observation)
    • No tag migration needed (backward compatible)

Safety: Migrations are idempotent and safe to re-run

Breaking Change 1: Memory Type Ontology

  • Legacy memory types (task, note, standard) are deprecated
  • New formal taxonomy: 5 base types (observation, decision, learning, error, pattern) with 21 subtypes
  • Migration is automatic on server restart β€” no manual action required

Breaking Change 2: Asymmetric Relationships

  • Asymmetric relationships (causes, fixes, supports, follows) now store only directed edges
  • Symmetric relationships (related, contradicts) continue storing bidirectional edges
  • Database migration (010) runs automatically on startup

If your code expects bidirectional storage for asymmetric relationships:

# OLD behavior (no longer applies):
result = storage.find_connected(memory_id, relationship_type="causes")

# NEW: use direction parameter explicitly
result = storage.find_connected(
    memory_id,
    relationship_type="causes",
    direction="both"
)

If you encounter issues: Troubleshooting Guide Β· CHANGELOG.md Β· Open an issue

</details>

πŸ“š Documentation & Resources


🀝 Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Quick Development Setup:

git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service
pip install -e .  # Editable install
pytest tests/      # Run test suite


Manual Config

{ "mcpServers": { "mcp-memory-service": { "command": "npx", "args": ["mcp-memory-service"] } } }