Local semantic memory with PGlite + pgvector - budget Qdrant for AI agents
Local semantic memory with PGlite + pgvector. Budget Qdrant that runs anywhere Bun runs.
You want semantic search for your AI agents but don't want to run a vector database server. This gives you:
- Zero infrastructure - PGlite is Postgres compiled to WASM, runs in-process
- Real vector search - pgvector with HNSW indexes, not some janky cosine similarity loop
- Collection-based organization - Different collections for different contexts (codebase, research, notes)
- Configurable tool descriptions - The Qdrant MCP pattern: same tool, different behaviors via env vars
- Effect-TS - Proper error handling, resource management, composable services

```bash
# npm/bun/pnpm
npm install semantic-memory
```

CLI

```bash
# Via npx
npx semantic-memory store "The auth flow uses JWT tokens stored in httpOnly cookies"
npx semantic-memory find "how does authentication work"

# Or install globally
npm install -g semantic-memory
semantic-memory store "React component patterns" --collection code
semantic-memory find "components" --collection code

# Full-text search (no embeddings)
semantic-memory find "JWT" --fts

# Add metadata
semantic-memory store "API rate limits are 100 req/min" --metadata '{"source":"docs","priority":"high"}'

# List, get, delete
semantic-memory list
semantic-memory get <id>
semantic-memory delete <id>

# Validate a memory (refresh its relevance timestamp)
semantic-memory validate <id>

# Stats
semantic-memory stats
```

Collections for Context

Collections let you organize memories by purpose. The collection name carries semantic meaning:
```bash
# Codebase analysis - store patterns, architecture notes, API quirks
semantic-memory store "Auth uses httpOnly JWT cookies with 7-day refresh" --collection codebase
semantic-memory store "The useOptimistic hook requires a reducer pattern" --collection codebase
semantic-memory find "authentication" --collection codebase

# Research/learning - concepts, connections, questions
semantic-memory store "Effect-TS uses generators for async, not Promises" --collection research
semantic-memory find "effect async patterns" --collection research

# Project onboarding - gotchas, tribal knowledge, "why is it like this"
semantic-memory store "Don't use React.memo on components with children - causes stale closures" --collection gotchas
semantic-memory find "performance issues" --collection gotchas

# Personal knowledge - decisions, preferences, breakthroughs
semantic-memory store "Prefer composition over inheritance for React components" --collection decisions
semantic-memory find "react patterns" --collection decisions
```

Search across all collections or within one:

```bash
# Search everything
semantic-memory find "authentication"

# Search specific collection
semantic-memory find "authentication" --collection codebase
```

The Qdrant Pattern

The killer feature: tool descriptions are configurable.
Same semantic memory, different agent behaviors:
```bash
# Codebase assistant - searches before generating, stores patterns found
TOOL_STORE_DESCRIPTION="Store code patterns, architecture decisions, and API quirks discovered while analyzing the codebase. Include file paths and context." \
TOOL_FIND_DESCRIPTION="Search codebase knowledge. Query BEFORE making changes to understand existing patterns." \
semantic-memory find "auth patterns"

# Research assistant - accumulates and connects ideas
TOOL_STORE_DESCRIPTION="Store concepts, insights, and connections between ideas. Include source references." \
TOOL_FIND_DESCRIPTION="Search research notes. Use to find related concepts and prior findings." \
semantic-memory find "async patterns"

# Onboarding assistant - captures tribal knowledge
TOOL_STORE_DESCRIPTION="Store gotchas, workarounds, and 'why is it like this' explanations. Future devs will thank you." \
TOOL_FIND_DESCRIPTION="Search for known issues and gotchas. Check BEFORE debugging to avoid known pitfalls." \
semantic-memory find "common mistakes"
```

The description tells the LLM _when_ and _how_ to use the tool. Change the description, change the behavior. No code changes.

OpenCode Integration
Drop this in `~/.config/opencode/tool/semantic-memory.ts`:

```typescript
import { tool } from "@opencode-ai/plugin";
import { $ } from "bun";

// Rich descriptions that shape agent behavior
// Override via env vars for different contexts
const STORE_DESC =
  process.env.TOOL_STORE_DESCRIPTION ||
  "Persist important discoveries, decisions, and learnings for future sessions. Use for: architectural decisions, debugging breakthroughs, user preferences, project-specific patterns. Include context about WHY something matters.";
const FIND_DESC =
  process.env.TOOL_FIND_DESCRIPTION ||
  "Search your persistent memory for relevant context. Query BEFORE making architectural decisions, when hitting familiar-feeling bugs, or when you need project history. Returns semantically similar memories ranked by relevance.";

// Run the CLI with the given arguments and return its trimmed stdout
async function run(args: string[]): Promise<string> {
  const result = await $`npx semantic-memory ${args}`.text();
  return result.trim();
}

export const store = tool({
  description: STORE_DESC,
  args: {
    information: tool.schema.string().describe("The information to store"),
    collection: tool.schema
      .string()
      .optional()
      .describe("Collection name (e.g., 'codebase', 'research', 'gotchas')"),
    metadata: tool.schema
      .string()
      .optional()
      .describe("Optional JSON metadata"),
  },
  async execute({ information, collection, metadata }) {
    const args = ["store", information];
    if (collection) args.push("--collection", collection);
    if (metadata) args.push("--metadata", metadata);
    return run(args);
  },
});

export const find = tool({
  description: FIND_DESC,
  args: {
    query: tool.schema.string().describe("Natural language search query"),
    collection: tool.schema
      .string()
      .optional()
      .describe("Collection to search (omit for all)"),
    limit: tool.schema
      .number()
      .optional()
      .describe("Max results (default: 10)"),
  },
  async execute({ query, collection, limit }) {
    const args = ["find", query];
    if (collection) args.push("--collection", collection);
    if (limit) args.push("--limit", String(limit));
    return run(args);
  },
});
```

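The argument-building logic inside the two `execute` functions can also be pulled into pure functions, which makes it testable without spawning the CLI. The sketch below is one possible way to do that, assuming only the `--collection`, `--metadata`, and `--limit` flags shown above. The names `buildStoreArgs` and `buildFindArgs` are hypothetical, and the JSON check on `metadata` is an added safeguard, not something the CLI is known to require.

```typescript
// Hypothetical helpers (not part of semantic-memory): build CLI argument arrays.
interface StoreInput {
  information: string;
  collection?: string;
  metadata?: string; // JSON-encoded string, as passed to --metadata
}

interface FindInput {
  query: string;
  collection?: string;
  limit?: number;
}

function buildStoreArgs({ information, collection, metadata }: StoreInput): string[] {
  const args = ["store", information];
  if (collection) args.push("--collection", collection);
  if (metadata) {
    // Fail early (SyntaxError) on malformed JSON instead of handing it to the CLI.
    JSON.parse(metadata);
    args.push("--metadata", metadata);
  }
  return args;
}

function buildFindArgs({ query, collection, limit }: FindInput): string[] {
  const args = ["find", query];
  if (collection) args.push("--collection", collection);
  // Compare against undefined so an explicitly passed limit is never dropped.
  if (limit !== undefined) args.push("--limit", String(limit));
  return args;
}

// Example: arguments for storing a note in the "codebase" collection.
const exampleArgs = buildStoreArgs({
  information: "Auth uses httpOnly JWT cookies",
  collection: "codebase",
});
console.log(exampleArgs.join(" "));
```

Each `execute` body would then reduce to `return run(buildStoreArgs(input))` or `return run(buildFindArgs(input))`.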
For project-specific behavior, create a wrapper script or use direnv:
```bash
# .envrc (with direnv)
export TOOL_STORE_DESCRIPTION="Store patterns found in this Next.js codebase. Include file paths."
export TOOL_FIND_DESCRIPTION="Search codebase patterns. Check before implementing new features."
```

Or create project-specific OpenCode tools that hardcode the collection:

```typescript
// .opencode/tool/codebase-memory.ts
import { tool } from "@opencode-ai/plugin";
import { $ } from "bun";

export const remember = tool({
  description: "Store a pattern or insight about this codebase",
  args: { info: tool.schema.string() },
  async execute({ info }) {
    // The current working directory doubles as the collection name
    return $`npx semantic-memory store ${info} --collection ${process.cwd()}`.text();
  },
});
```

Configuration

All via environment variables:
| Variable                      | Default                  | Description                         |
| ----------------------------- | ------------------------ | ----------------------------------- |
| `SEMANTIC_MEMORY_PATH`        | `~/.semantic-memory`     | Where to store the database         |
| `OLLAMA_HOST`                 | `http://localhost:11434` | Ollama API endpoint                 |
| `OLLAMA_MODEL`                | `mxbai-embed-large`      | Embedding model (1024 dims)         |
| `COLLECTION_NAME`             | `default`                | Default collection                  |
| `MEMORY_DECAY_HALF_LIFE_DAYS` | `90`                     | Days for confidence decay half-life |
| `TOOL_STORE_DESCRIPTION`      | (see code)               | MCP tool description for store      |
| `TOOL_FIND_DESCRIPTION`       | (see code)               | MCP tool description for find       |

Confidence Decay

Memories decay in relevance over time unless validated. This helps surface fresh, actively-used knowledge over stale information.
How it works:
- Uses a half-life algorithm: `decay = 0.5 ^ (age_in_days / half_life)`
- Default half-life is 90 days (configurable via `MEMORY_DECAY_HALF_LIFE_DAYS`)
- Search scores are multiplied by the decay factor
- Stale memories (>90 days) show a warning

Decay examples:

| Age | Decay Factor | Effect |
| -------- | ------------ | -------------- |
| Today | 1.0 | Full weight |
| 90 days | 0.5 | Half weight |
| 180 days | 0.25 | Quarter weight |
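
The half-life formula and the table values can be reproduced with a few lines of TypeScript. This is a minimal sketch, not the package's actual implementation: the function names `decayFactor`, `decayedScore`, and `isStale` are hypothetical, and the stale threshold is assumed to be the 90 days mentioned in the list above.

```typescript
// Hypothetical helpers illustrating the half-life decay; names are assumptions.
const DEFAULT_HALF_LIFE_DAYS = 90;

/** decay = 0.5 ^ (age_in_days / half_life) */
function decayFactor(
  ageInDays: number,
  halfLifeDays: number = DEFAULT_HALF_LIFE_DAYS,
): number {
  if (halfLifeDays <= 0) throw new RangeError("halfLifeDays must be positive");
  // Clamp negative ages (e.g. clock skew) so the factor never exceeds 1.
  return Math.pow(0.5, Math.max(0, ageInDays) / halfLifeDays);
}

/** A search score multiplied by the decay factor. */
function decayedScore(
  similarity: number,
  ageInDays: number,
  halfLifeDays: number = DEFAULT_HALF_LIFE_DAYS,
): number {
  return similarity * decayFactor(ageInDays, halfLifeDays);
}

/** True when a memory is older than the stale threshold (assumed: 90 days). */
function isStale(ageInDays: number, thresholdDays: number = 90): boolean {
  return ageInDays > thresholdDays;
}

console.log(decayFactor(0)); // 1
console.log(decayFactor(90)); // 0.5
console.log(decayFactor(180)); // 0.25
console.log(decayFactor(120).toFixed(2)); // "0.40"
console.log(isStale(120)); // true
```

Because the decay is exponential, a memory never reaches zero weight; it only keeps halving every `halfLifeDays` until it is validated again.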
Example search output:
```
Results (decay half-life: 90 days):
1. [score: 0.82, age: 3d, decay: 0.98] JWT tokens should use httpOnly cookies
   Collection: codebase | ID: mem_abc123
2. [score: 0.45, age: 120d, decay: 0.40] Use localStorage for auth tokens
   Collection: codebase | ID: mem_ghi789
   ⚠️ Stale (120 days) - consider validating or removing
```

Refreshing memories:

```bash
# Validate a memory to reset its decay (marks it as still relevant)
semantic-memory validate <id>
```

Use `validate` when you confirm a memory is still accurate and useful. This resets the decay clock.

How It Works

```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Ollama    │────▶│   PGlite    │────▶│  pgvector   │
│ (embeddings)│     │  (WASM PG)  │     │ (HNSW idx)  │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       │            memories table      memory_embeddings
       │            - id                - memory_id (FK)
       │            - content           - embedding vector(1024)
       │            - metadata (JSONB)
       │            - collection
       └──────────────────────────────────────────┘
                    cosine similarity search
```

- Ollama generates embeddings locally with `mxbai-embed-large` (1024 dimensions)

Store patterns, architecture decisions, and API quirks as you explore a new codebase. Query before making changes.

Remember facts across AI sessions. No more re-explaining context every conversation.

Pre-load docs into a collection, search before hallucinating answers.

Accumulate findings, connect ideas across sources, build up domain knowledge.

Capture the "why" behind decisions, known gotchas, and tribal knowledge for future team members.

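
To make the "cosine similarity search" step of the How It Works diagram concrete, here is a brute-force TypeScript sketch of collection-scoped ranking. It is a conceptual illustration only: the package delegates this work to pgvector's HNSW index inside PGlite, the `MemoryRecord` type and `search` function are hypothetical names, and tiny 3-dimensional vectors stand in for the 1024-dimensional embeddings.

```typescript
// Conceptual sketch: pgvector performs this ranking inside Postgres with an HNSW index.
interface MemoryRecord {
  id: string;
  content: string;
  collection: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function search(
  records: MemoryRecord[],
  queryEmbedding: number[],
  options: { collection?: string; limit?: number } = {},
): { record: MemoryRecord; score: number }[] {
  const { collection, limit = 10 } = options;
  return records
    // Omitting the collection searches everything.
    .filter((r) => collection === undefined || r.collection === collection)
    .map((record) => ({
      record,
      score: cosineSimilarity(record.embedding, queryEmbedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

const memories: MemoryRecord[] = [
  { id: "m1", content: "Auth uses JWT cookies", collection: "codebase", embedding: [1, 0, 0] },
  { id: "m2", content: "Session refresh flow", collection: "codebase", embedding: [0.6, 0.8, 0] },
  { id: "m3", content: "JWT background reading", collection: "research", embedding: [1, 0, 0] },
];

const hits = search(memories, [1, 0, 0], { collection: "codebase" });
console.log(hits.map((h) => h.record.id)); // m1 first, then m2
```

A loop like this scans every record on each query, which is the cost an HNSW index avoids by searching an approximate nearest-neighbor graph instead.
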
MIT