
QMD: local semantic search over markdown knowledge bases

reference

Keyword search over personal markdown vaults misses conceptual matches and requires remembering exact terminology.

Tags: obsidian, markdown, qmd, search, semantic-search, local-ai

Problem

You have a growing collection of markdown notes (an Obsidian vault, a Zettelkasten, or a directory of project docs) and you need to find relevant content by meaning, not just keywords. Standard text search (grep, rg, Obsidian's built-in search) only finds exact or fuzzy string matches. Searching for "how to handle team disagreements" won't find your note titled "conflict resolution framework" unless the query words themselves appear in it. As your knowledge base grows, keyword search becomes less reliable for retrieval, because you have to remember the exact terminology each note used.

Solution

Use QMD (Quick Markdown Search) by Tobias Lütke, which combines three search methods into a single pipeline that runs entirely locally.

Install and run:

# Clone the repository
git clone https://github.com/tobi/qmd.git
cd qmd

# Install dependencies
npm install

# Index a markdown directory
npx qmd index /path/to/your/vault

# Search semantically
npx qmd search "how to handle team disagreements"

How the search pipeline works:

| Stage | Method | What it does |
|-------|--------|--------------|
| 1 | BM25 full-text search | Fast keyword matching, ranked with a TF-IDF-style score |
| 2 | Vector semantic search | Embeds the query and documents, then finds conceptual matches |
| 3 | LLM re-ranking | Uses a language model to score and reorder results by relevance |

All three stages run locally. The models are GGUF files loaded through node-llama-cpp, so there are no API calls and no data leaves your machine.
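
To make stage 1 concrete, here is a minimal sketch of BM25 scoring in Python. It is an illustration of the general technique, not QMD's implementation: the tokenizer, the `k1` and `b` defaults, and the sample notes are all assumptions chosen for the example.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Naive whitespace tokenizer (an assumption; real indexes also stem and strip punctuation).
    return text.lower().split()


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with the standard BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # terms absent from the document contribute nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical notes, for illustration only.
notes = [
    "conflict resolution framework for engineering teams",
    "how to handle team disagreements in planning meetings",
    "weekly grocery list",
]
scores = bm25_scores("team disagreements", notes)
```

The first note scores zero because neither query term appears in it literally ("teams" is not "team"), which is exactly the gap the vector stage is meant to close.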

Connecting QMD to Claude Code via a skill:

---
name: search-vault
description: Search the knowledge vault for relevant context
---

Run the following command to search the vault:

```bash
cd /path/to/qmd && npx qmd search "$ARGUMENTS"
```

Return the top results to inform your response.

Recommended search stack for full coverage:

QMD               --> semantic search over your Obsidian vault / markdown notes
claude-code-search --> search over Claude Code session logs and conversations
Obsidian           --> storage, editing, and wikilink graph navigation

Why It Works

Each search method compensates for the others' weaknesses. BM25 excels at precise keyword matches and is extremely fast, but misses synonyms and paraphrases. Vector search catches conceptual similarity ("team disagreements" matches "conflict resolution") but can surface tangentially related results. LLM re-ranking applies reasoning to the combined candidate set, promoting the most genuinely useful results to the top. Running everything locally via GGUF models means your personal notes never leave your machine, and search works offline.
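
The way the three stages fit together can be sketched in a few lines of Python. Everything here is a toy stand-in under stated assumptions: `CONCEPT_AXES` is a hypothetical hand-written "embedding", `keyword_score` replaces BM25 with simple term overlap, and `rerank_score` is a placeholder where a local language model would judge relevance. None of these names come from QMD.

```python
import math

# Hypothetical toy "embedding": each known word maps to a concept axis.
# A real setup would get vectors from a GGUF embedding model instead.
CONCEPT_AXES = {
    "team": 0, "teams": 0,
    "disagreements": 1, "conflict": 1, "resolution": 1,
    "grocery": 2, "list": 2,
}


def embed(text: str) -> list[float]:
    vec = [0.0, 0.0, 0.0]
    for word in text.lower().split():
        if word in CONCEPT_AXES:
            vec[CONCEPT_AXES[word]] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, doc: str) -> float:
    # Stand-in for BM25: the number of literal terms shared by query and document.
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def rerank_score(query: str, doc: str) -> float:
    # Placeholder for the LLM re-ranker; here it simply reuses vector similarity.
    return cosine(embed(query), embed(doc))


def search(query: str, docs: list[str], top_k: int = 2) -> list[str]:
    q_vec = embed(query)
    sims = [cosine(q_vec, embed(d)) for d in docs]

    # Stage 1: candidates that contain at least one literal query term.
    keyword_hits = {i for i, d in enumerate(docs) if keyword_score(query, d) > 0}
    # Stage 2: candidates that are conceptually closest to the query.
    by_similarity = sorted(range(len(docs)), key=lambda i: sims[i], reverse=True)
    vector_hits = {i for i in by_similarity[:top_k] if sims[i] > 0}
    # Stage 3: re-rank the union of both candidate sets.
    candidates = keyword_hits | vector_hits
    ranked = sorted(candidates, key=lambda i: rerank_score(query, docs[i]), reverse=True)
    return [docs[i] for i in ranked[:top_k]]


# Hypothetical notes, for illustration only.
notes = [
    "conflict resolution framework for engineering teams",
    "weekly grocery list",
    "notes on team offsite logistics",
]
results = search("team disagreements", notes)
# The conflict-resolution note ranks first even though it shares no literal query term.
```

Keeping candidate generation separate from re-ranking is the design point: the cheap stages cast a wide net, and the expensive judgment is applied only to the small merged set.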

Context

  • QMD is open source at github.com/tobi/qmd, created by Tobias Lütke (CEO of Shopify)
  • Runs on node-llama-cpp, so it works on macOS (Apple Silicon), Linux, and Windows with no GPU required
  • Pairs naturally with an Obsidian vault structured for AI agent access (numbered folders, per-folder CLAUDE.md files)
  • For Claude Code session log search specifically, claude-code-search parses the .jsonl files in ~/.claude/projects/
  • The GGUF model format means you can swap in different embedding or re-ranking models as better ones are released
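
For a sense of what scanning those session logs involves, here is a rough Python sketch that walks a directory of .jsonl files and reports which lines mention a term. It is not how claude-code-search works internally. The function name `search_session_logs` is hypothetical, and because the record schema is not described here, the sketch matches against the whole serialized record rather than assuming any field names.

```python
import json
from pathlib import Path


def search_session_logs(root: Path, needle: str) -> list[tuple[Path, int]]:
    """Return (file, line number) pairs for JSONL records that mention `needle`."""
    needle = needle.lower()
    hits = []
    for path in sorted(root.rglob("*.jsonl")):
        with path.open(encoding="utf-8") as handle:
            for line_no, line in enumerate(handle, start=1):
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip truncated or malformed lines
                # No schema is assumed: search the entire serialized record.
                if needle in json.dumps(record, ensure_ascii=False).lower():
                    hits.append((path, line_no))
    return hits
```

A call such as `search_session_logs(Path.home() / ".claude" / "projects", "bm25")` would list every log line that mentions the term. This is plain keyword matching, so it has the same blind spots described in the Problem section.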

About this share

Contributor: mblode
Repository: mblode/shares
Created: Feb 10, 2026