Grep is Dead: How Claude Code Makes It Remember | GeekNews
Key Points
- To solve Claude Code's session context loss and inefficient traditional search, a new memory system combines QMD, a local search engine for Obsidian vaults, with a `/recall` skill.
- QMD offers BM25, semantic, and hybrid search modes, while the `/recall` skill enhances context restoration with temporal, topic, and graph visualization options for past sessions.
- This system maintains persistent memory across AI agents through an automated pipeline that parses and embeds 700 JSONL session histories into QMD, enabling a context-centric workflow.
This article presents a memory system designed to address the critical issue of context loss in AI agents like Claude Code, which typically operate in a stateless, session-isolated manner. The proposed solution integrates a local search engine, QMD, with a custom Claude Code skill, `/recall`, underpinned by an automated data pipeline.
The core problem identified is that AI agents like Claude Code reset their context with each new session. Over 700 sessions in three weeks, this made past decisions and project context hard to track, with substantial information loss (cited at 60%) when hitting context limits or attempting to resume work on subsequent days. Traditional grep-based search proved inefficient and slow (roughly 3 minutes for 300 files, with poor-quality results from Claude Code's default Haiku sub-agent) and did not scale.
The solution leverages QMD (Query-driven Markdown), a local search engine specifically designed for indexing Obsidian vaults. QMD maps different vault folders (e.g., notes, daily logs, sessions, transcripts) to dedicated QMD collections, enabling focused searches and returning results within one second. QMD offers three distinct search modes:
- BM25 (`qmd search`): A deterministic full-text search method, analogous to `grep` but with sophisticated ranking. It assigns relevance scores to files based on term frequency (how often a term appears in a document) and inverse document frequency (the rarity of a term across the entire corpus), normalized by document length. It relies purely on mathematical computation, with no AI or embeddings. For instance, a term appearing five times in a short note receives a higher score than a single occurrence in a much longer document.
- Term Frequency (TF): how often a term $t$ appears in document $d$, written $f(t, d)$.
- Inverse Document Frequency (IDF): a measure of how much information the term provides, usually calculated as $\mathrm{IDF}(t) = \log\frac{N}{n_t}$, where $N$ is the total number of documents and $n_t$ is the number of documents containing term $t$.
- Document Length Normalization: adjusts for document length, typically using parameters $k_1$ and $b$.

Combining these, the BM25 score of document $d$ for query $q$ is

$$\mathrm{score}(d, q) = \sum_{t \in q} \mathrm{IDF}(t) \cdot \frac{f(t, d)\,(k_1 + 1)}{f(t, d) + k_1\left(1 - b + b\,\frac{|d|}{\mathrm{avgdl}}\right)}$$

where $f(t, d)$ is the frequency of term $t$ in document $d$, $|d|$ is the length of document $d$, and $\mathrm{avgdl}$ is the average document length in the collection.
- Semantic (`qmd vsearch`): This mode is embedding-based, allowing searches by meaning rather than exact keywords. Documents and queries are converted into high-dimensional vector embeddings, and similarity is determined by calculating the cosine similarity between these vectors. This enables discovery of relevant information even if the precise search terms are not present in the document (e.g., searching "couldn't sleep, bad night" could find notes on "insomnia" or "sleep habits").
- Hybrid (`qmd query`): This mode combines BM25 and semantic search to produce an optimal ranking, leveraging the strengths of both deterministic keyword matching and conceptual understanding.
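The BM25 ranking described above can be sketched in a few lines. This is a minimal, self-contained illustration of the formula (not QMD's actual implementation), using the common defaults $k_1 = 1.5$ and $b = 0.75$:

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query with classic BM25."""
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency: how many documents contain each query term.
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log(N / df[t])  # simple IDF variant
            denom = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[t] * (k1 + 1) / denom
        scores.append(score)
    return scores

docs = [
    "qmd indexes the obsidian vault qmd".split(),      # short doc, term appears twice
    ("qmd " + "filler " * 40).split(),                 # long doc, term appears once
    "notes about daily logs and transcripts".split(),  # no match
]
print(bm25_scores(["qmd"], docs))
```

Running this shows the short note with two occurrences outranking the long document with one, matching the length-normalization behavior described above.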
The article recommends BM25 for roughly 80% of searches, especially structured notes, supplemented with semantic search for transcripts and unstructured memos. Semantic search shines on complex, unstructured queries: asked to "find the days when I was happy and what was the reason," Claude automatically combines multiple semantic searches (e.g., `qmd vsearch "happy, grateful, excited"`) and surfaces deep, meaningful patterns (e.g., "the happiest days involved launching something, followed by good recovery with a sauna or 9 hours of sleep").
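The cosine-similarity ranking behind semantic mode, and the trick of combining several query phrasings, can be sketched with toy vectors. Real QMD uses learned embeddings; the hand-made 3-d vectors and document names below are purely illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank_docs(query_vecs, doc_vecs):
    """Rank documents by their best similarity to any query phrasing."""
    return sorted(
        doc_vecs,
        key=lambda name: max(cosine(v, doc_vecs[name]) for v in query_vecs),
        reverse=True,
    )

# Toy "embeddings"; imagine the axes roughly encode (sleep, mood, work).
docs = {
    "insomnia-note": [0.9, 0.2, 0.1],
    "launch-recap":  [0.1, 0.3, 0.9],
}
# Two phrasings of the same intent, like "couldn't sleep" / "bad night".
queries = [[0.8, 0.3, 0.0], [0.7, 0.4, 0.1]]
print(rank_docs(queries, docs))
```

Neither query contains the word "insomnia," yet the insomnia note ranks first because its vector points in the same direction, which is exactly the keyword-free matching described above.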
The /recall skill operates on top of QMD within Claude Code, automating context loading before task initiation. It supports three modes:
- Temporal: scans session history by date (`/recall yesterday`, `/recall last week`).
- Topic: performs a BM25 search on QMD collections (`/recall topic "QMD video"`).
- Graph: provides an interactive visual representation of sessions and files, color-coding sessions by recency and clustering files by type (goals, research, docs, etc.). Users can visually identify a specific past session, even weeks later, and retrieve it by copying the file path directly into Claude Code.
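A hedged sketch of how the temporal mode might resolve phrases like `yesterday` or `last week` into a date range before scanning session files. The actual skill's logic is not published; the function name and phrase handling here are illustrative assumptions:

```python
from datetime import date, timedelta

def recall_range(phrase, today=None):
    """Map a /recall temporal phrase to an inclusive (start, end) date range."""
    today = today or date.today()
    if phrase == "yesterday":
        d = today - timedelta(days=1)
        return d, d
    if phrase == "last week":
        # Monday through Sunday of the previous calendar week.
        start = today - timedelta(days=today.weekday() + 7)
        return start, start + timedelta(days=6)
    raise ValueError(f"unsupported phrase: {phrase}")

# 2025-01-15 is a Wednesday, so "last week" spans Jan 6 (Mon) to Jan 12 (Sun).
print(recall_range("last week", today=date(2025, 1, 15)))
```

The resulting range would then be matched against session file timestamps to decide which histories to load.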
An automated indexing pipeline is crucial for keeping the memory up to date. Claude Code saves every conversation locally as a JSONL file (700 sessions accumulated in 3 weeks). The pipeline automatically parses these raw files, extracts clean markdown (user messages, system signals), and embeds the result into the QMD index. A hook fires on terminal exit, exporting and embedding the session into QMD without manual intervention, so the index is always current.
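The parsing step of the pipeline can be sketched as follows. Note that the JSONL field names (`type`, `message.content`) are assumptions for illustration, not a documented Claude Code schema:

```python
import json

def session_to_markdown(jsonl_text):
    """Extract user messages from a session JSONL dump as clean markdown bullets."""
    lines = []
    for raw in jsonl_text.splitlines():
        if not raw.strip():
            continue
        record = json.loads(raw)
        # Field names assumed for illustration; adapt to the real session schema.
        if record.get("type") == "user":
            lines.append(f"- {record['message']['content']}")
    return "\n".join(lines)

sample = "\n".join([
    json.dumps({"type": "user", "message": {"content": "Fix the QMD indexer"}}),
    json.dumps({"type": "assistant", "message": {"content": "Done."}}),
])
print(session_to_markdown(sample))
```

In the real pipeline the output markdown would then be written into the vault's sessions folder and embedded into the matching QMD collection.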
This system facilitates a context-centric workflow, where the focus shifts from the AI tool itself to the persistent context. Tools (AI models, agents) may change, but the continuity of context, stored and retrievable via QMD and `/recall`, ensures workflow consistency across different agents like Claude Code, Codex, or Gemini CLI. All embeddings are stored locally, emphasizing privacy and control. The entire stack consists of the Obsidian Vault at the base, QMD Search in the middle, and Claude Code/OpenClaw at the top, with context flowing upwards. This setup, including Obsidian Sync for cross-device vault synchronization and a 24/7 Mac Mini running OpenClaw, ensures ubiquitous access to the same vault, QMD index, and skills.
The system's utility is exemplified by its ability to rediscover "unacted upon ideas," finding notes and plans from months ago that had been completely forgotten. This demonstrates its capacity to transform static notes into a dynamic, useful context that actively aids in achieving goals.