GitHub - thedotmack/claude-mem: A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future sessions.

thedotmack
2026.02.02
· GitHub · by web-ghost
#AI #Claude #Plugin #Memory #Context

Key Points

  • Claude-Mem is a Claude Code plugin that provides persistent memory by automatically capturing, compressing, and injecting relevant coding session context into future interactions.
  • This system ensures continuity of knowledge across sessions, offering features like token-efficient skill-based search, a web viewer for real-time memory streams, and fine-grained privacy controls.
  • It functions through lifecycle hooks, a worker service, SQLite and Chroma databases for hybrid search, and a 3-layer workflow to efficiently retrieve past observations and summaries.

claude-mem is a Claude Code plugin designed to provide persistent memory and context management for AI coding sessions, addressing the challenge of context loss across sessions. It automatically captures all activities performed by Claude during coding, compresses this information using AI (specifically Claude's agent-sdk to generate semantic summaries), and then injects relevant context back into future sessions. This allows Claude to maintain a continuous understanding of ongoing projects, even after sessions terminate or reconnect.

The architecture comprises three main components:

  1. Lifecycle Hooks: Six critical JavaScript hook scripts orchestrate the memory capture process: SessionStart, UserPromptSubmit, PostToolUse, Stop, SessionEnd, and a Smart Install pre-hook for dependency checking. These hooks intercept Claude's actions at specific points, triggering data capture and processing.
  2. Worker Service: An HTTP API server, managed by Bun and running on port 37777, acts as the central processing unit. It hosts a web viewer UI for real-time memory stream visualization and provides 10 dedicated search endpoints. This service is responsible for processing captured observations, generating summaries, and handling search queries.
  3. Data Storage:
    • SQLite Database: Serves as the primary persistent storage for structured data, including session metadata, raw observations, and compressed semantic summaries. It also utilizes FTS5 for full-text search capabilities.
    • Chroma Vector Database: Employed for vector embeddings, enabling hybrid semantic and keyword search. This allows for intelligent context retrieval based on conceptual similarity, complementing keyword-based searches.
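
The keyword half of this storage layer can be sketched with Python's built-in sqlite3 module. This is only an illustration of FTS5 full-text search: the table name, column names, and sample rows are hypothetical and are not claude-mem's actual schema, and the sketch assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

# Hypothetical schema: claude-mem's real tables are not described here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE observations USING fts5(session_id UNINDEXED, summary)"
)
conn.executemany(
    "INSERT INTO observations (session_id, summary) VALUES (?, ?)",
    [
        ("s1", "Added a database migration for the users table"),
        ("s2", "Fixed flaky retry logic in the HTTP client"),
        ("s3", "Refactored logging configuration"),
    ],
)


def keyword_search(conn, query, limit=5):
    """Return (rowid, session_id, summary) rows matching an FTS5 query, best match first."""
    return conn.execute(
        "SELECT rowid, session_id, summary FROM observations "
        "WHERE observations MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


hits = keyword_search(conn, "migration")
print(hits)
```

A vector store such as Chroma would cover the semantic half of hybrid search; it is left out here to keep the sketch free of third-party dependencies.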

The system's context retrieval is largely powered by the mem-search skill, which uses a token-efficient, 3-layer workflow built on three MCP (Model Context Protocol) tools:

  1. search: The initial step involves querying the memory index to obtain a compact list of relevant IDs. This operation is designed to be highly token-efficient, typically returning results with minimal overhead (e.g., ≈50-100 tokens per result).
  2. timeline: This tool allows Claude to retrieve chronological context surrounding specific observations or queries identified in the search phase, providing a temporal understanding of events.
  3. get_observations: Crucially, full details of observations are only fetched *after* filtering relevant IDs from the initial search. This targeted retrieval ensures significant token savings, as full observation details can be substantial (e.g., ≈500-1000 tokens per result). This workflow achieves approximately 10x token savings by avoiding unnecessary transmission of large data payloads.
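
The three layers above can be sketched in plain Python. The function names mirror the tool names, but the bodies, the Observation class, the in-memory STORE, and the per-result token costs (75 and 750, midpoints of the ranges quoted above) are illustrative assumptions, not claude-mem's implementation.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    id: int
    timestamp: int  # illustrative ordering key
    title: str      # compact index entry
    detail: str     # full observation text


# Hypothetical in-memory store standing in for the real memory index.
STORE = [
    Observation(1, 100, "Set up SQLite schema", "Created sessions and observations tables."),
    Observation(2, 200, "Fix auth token refresh", "Refresh loop retried forever on 401 responses."),
    Observation(3, 300, "Add auth middleware tests", "Covered expired and malformed tokens."),
    Observation(4, 400, "Tune embedding batch size", "Smaller batches reduced memory spikes."),
]


def search(query):
    """Layer 1: return only compact (id, title) pairs."""
    q = query.lower()
    return [(o.id, o.title) for o in STORE if q in o.title.lower()]


def timeline(anchor_id, window=1):
    """Layer 2: return (id, title) pairs chronologically around one observation."""
    ordered = sorted(STORE, key=lambda o: o.timestamp)
    pos = next(i for i, o in enumerate(ordered) if o.id == anchor_id)
    return [(o.id, o.title) for o in ordered[max(0, pos - window): pos + window + 1]]


def get_observations(ids):
    """Layer 3: fetch full details only for the IDs that survived filtering."""
    wanted = set(ids)
    return [o for o in STORE if o.id in wanted]


def estimated_tokens(n_results, n_selected, index_cost=75, detail_cost=750):
    """Compare layered retrieval against fetching full details for every result."""
    layered = n_results * index_cost + n_selected * detail_cost
    fetch_all = n_results * detail_cost
    return layered, fetch_all


hits = search("auth")                      # compact index entries only
context = timeline(hits[0][0])             # neighbours of the first hit
full = get_observations([hits[0][0]])      # full text for one chosen ID
print(hits, context, full[0].detail)
print(estimated_tokens(n_results=20, n_selected=2))
```

With these illustrative costs, 20 search results of which 2 are expanded come to about 3,000 tokens instead of 15,000. The ratio approaches the 10x per-result difference as fewer results need their full details.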

claude-mem injects context using a progressive disclosure strategy. It initially provides high-level summaries, and as Claude requires more detail, it can retrieve progressively deeper layers of information, with token cost visibility for transparency. Users can control context injection and exclude sensitive content using <private> tags. The system operates automatically, minimizing manual intervention. Citations for past observations are generated, accessible via specific API endpoints or the web viewer. An experimental beta channel offers features like "Endless Mode," which implements a biomimetic memory architecture for extended session continuity.
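
One way to picture the privacy control is a filter that drops tagged spans before anything is stored. The sketch below is an assumption about how such filtering could work, not claude-mem's actual code: it assumes an opening <private> tag paired with a closing </private> tag, and strip_private is a hypothetical helper name.

```python
import re

# Assumed syntax: an opening <private> tag paired with a closing </private> tag.
PRIVATE_BLOCK = re.compile(r"<private>.*?</private>", re.DOTALL)


def strip_private(text):
    """Remove every <private>...</private> span, including the tags themselves."""
    return PRIVATE_BLOCK.sub("", text)


prompt = "Deploy to staging. <private>API_KEY=abc123</private> Then run the smoke tests."
cleaned = strip_private(prompt)
print(cleaned)
```

The non-greedy match keeps two separate private spans from swallowing the ordinary text between them.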

The project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0), with the ragtime/ directory licensed separately under the PolyForm Noncommercial License 1.0.0. System requirements include Node.js (18.0.0+), Claude Code (latest), Bun (auto-installed), and uv (Python package manager for vector search, auto-installed). Configuration settings are managed in ~/.claude-mem/settings.json, allowing fine-grained control over AI model, worker port, data directory, and context injection parameters.
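
As a rough illustration of how a settings file like this is typically consumed, the sketch below merges a JSON file over a set of defaults. The key names (worker_port, model, data_dir) are hypothetical placeholders, since the actual keys are not listed here; only the default port 37777 and the file path come from the description above.

```python
import json
from pathlib import Path

# Hypothetical key names; only the port number 37777 comes from the text above.
DEFAULTS = {
    "worker_port": 37777,
    "model": None,
    "data_dir": None,
}


def load_settings(path="~/.claude-mem/settings.json"):
    """Merge values from a JSON settings file over the defaults."""
    settings = dict(DEFAULTS)
    file = Path(path).expanduser()
    if file.exists():
        settings.update(json.loads(file.read_text(encoding="utf-8")))
    return settings


# A path that does not exist falls back to the defaults.
print(load_settings("missing-settings.json")["worker_port"])
```

Reading the file on top of defaults means a partial settings file only needs to list the values that differ.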