semantic-memory

Local semantic memory with PGlite + pgvector. Budget Qdrant that runs anywhere Bun runs.

Why

You want semantic search for your AI agents but don't want to run a vector database server. This gives you:

- Zero infrastructure - PGlite is Postgres compiled to WASM, runs in-process
- Real vector search - pgvector with HNSW indexes, not some janky cosine similarity loop
- Collection-based organization - Different collections for different contexts (codebase, research, notes)
- Configurable tool descriptions - The Qdrant MCP pattern: same tool, different behaviors via env vars
- Effect-TS - Proper error handling, resource management, composable services

Install

``bash

`npm/bun/pnpm`


npm install semantic-memory
Need Ollama for embeddings

brew install ollama
ollama pull mxbai-embed-large

CLI

`bash

`Via npx`


npx semantic-memory store "The auth flow uses JWT tokens stored in httpOnly cookies"
npx semantic-memory find "how does authentication work"
Or install globally

npm install -g semantic-memory
semantic-memory store "React component patterns" --collection code
semantic-memory find "components" --collection code
Full-text search (no embeddings)

semantic-memory find "JWT" --fts
Add metadata

semantic-memory store "API rate limits are 100 req/min" --metadata '{"source":"docs","priority":"high"}'
List, get, delete

semantic-memory list
semantic-memory get 
semantic-memory delete 
Validate a memory (refresh its relevance timestamp)

semantic-memory validate 
Stats

semantic-memory stats


Collections for Context
Collections let you organize memories by purpose. The collection name carries semantic meaning:

`bash

`Codebase analysis - store patterns, architecture notes, API quirks`


semantic-memory store "Auth uses httpOnly JWT cookies with 7-day refresh" --collection codebase
semantic-memory store "The useOptimistic hook requires a reducer pattern" --collection codebase
semantic-memory find "authentication" --collection codebase
Research/learning - concepts, connections, questions

semantic-memory store "Effect-TS uses generators for async, not Promises" --collection research
semantic-memory find "effect async patterns" --collection research
Project onboarding - gotchas, tribal knowledge, "why is it like this"

semantic-memory store "Don't use React.memo on components with children - causes stale closures" --collection gotchas
semantic-memory find "performance issues" --collection gotchas
Personal knowledge - decisions, preferences, breakthroughs

semantic-memory store "Prefer composition over inheritance for React components" --collection decisions
semantic-memory find "react patterns" --collection decisions


Search across all collections or within one:

`bash

`Search everything`


semantic-memory find "authentication"
Search specific collection

semantic-memory find "authentication" --collection codebase


The Qdrant Pattern
The killer feature: tool descriptions are configurable.
Same semantic memory, different agent behaviors:

`bash

`Codebase assistant - searches before generating, stores patterns found`


TOOL_STORE_DESCRIPTION="Store code patterns, architecture decisions, and API quirks discovered while analyzing the codebase. Include file paths and context." \
TOOL_FIND_DESCRIPTION="Search codebase knowledge. Query BEFORE making changes to understand existing patterns." \
semantic-memory find "auth patterns"
Research assistant - accumulates and connects ideas

TOOL_STORE_DESCRIPTION="Store concepts, insights, and connections between ideas. Include source references." \
TOOL_FIND_DESCRIPTION="Search research notes. Use to find related concepts and prior findings." \
semantic-memory find "async patterns"
Onboarding assistant - captures tribal knowledge

TOOL_STORE_DESCRIPTION="Store gotchas, workarounds, and 'why is it like this' explanations. Future devs will thank you." \
TOOL_FIND_DESCRIPTION="Search for known issues and gotchas. Check BEFORE debugging to avoid known pitfalls." \
semantic-memory find "common mistakes"


The description tells the LLM _when_ and _how_ to use the tool. Change the description, change the behavior. No code changes.
OpenCode Integration

Drop this in ~/.config/opencode/tool/semantic-memory.ts:

`typescript import { tool } from "@opencode-ai/plugin"; import { $ } from "bun";

// Rich descriptions that shape agent behavior // Override via env vars for different contexts const STORE_DESC = process.env.TOOL_STORE_DESCRIPTION || "Persist important discoveries, decisions, and learnings for future sessions. Use for: architectural decisions, debugging breakthroughs, user preferences, project-specific patterns. Include context about WHY something matters."; const FIND_DESC = process.env.TOOL_FIND_DESCRIPTION || "Search your persistent memory for relevant context. Query BEFORE making architectural decisions, when hitting familiar-feeling bugs, or when you need project history. Returns semantically similar memories ranked by relevance.";

async function run(args: string[]): Promise { const result = await $npx semantic-memory ${args}.text(); return result.trim(); }

export const store = tool({ description: STORE_DESC, args: { information: tool.schema.string().describe("The information to store"), collection: tool.schema .string() .optional() .describe("Collection name (e.g., 'codebase', 'research', 'gotchas')"), metadata: tool.schema .string() .optional() .describe("Optional JSON metadata"), }, async execute({ information, collection, metadata }) { const args = ["store", information]; if (collection) args.push("--collection", collection); if (metadata) args.push("--metadata", metadata); return run(args); }, });

export const find = tool({ description: FIND_DESC, args: { query: tool.schema.string().describe("Natural language search query"), collection: tool.schema .string() .optional() .describe("Collection to search (omit for all)"), limit: tool.schema .number() .optional() .describe("Max results (default: 10)"), }, async execute({ query, collection, limit }) { const args = ["find", query]; if (collection) args.push("--collection", collection); if (limit) args.push("--limit", String(limit)); return run(args); }, });`

`$3`

For project-specific behavior, create a wrapper script or use direnv:

`bash

`.envrc (with direnv)`


export TOOL_STORE_DESCRIPTION="Store patterns found in this Next.js codebase. Include file paths."
export TOOL_FIND_DESCRIPTION="Search codebase patterns. Check before implementing new features."


Or create project-specific OpenCode tools that hardcode the collection:

`typescript // .opencode/tool/codebase-memory.ts export const remember = tool({ description: "Store a pattern or insight about this codebase", args: { info: tool.schema.string() }, async execute({ info }) { return $npx semantic-memory store ${info} --collection ${process.cwd()}.text(); }, });`

`Configuration`

All via environment variables:

| Variable | Default | Description | | ----------------------------- | ------------------------ | ----------------------------------- | |SEMANTIC_MEMORY_PATH | ~/.semantic-memory| Where to store the database | |OLLAMA_HOST | http://localhost:11434| Ollama API endpoint | |OLLAMA_MODEL | mxbai-embed-large| Embedding model (1024 dims) | |COLLECTION_NAME | default| Default collection | |MEMORY_DECAY_HALF_LIFE_DAYS | 90| Days for confidence decay half-life | |TOOL_STORE_DESCRIPTION| (see code) | MCP tool description for store | |TOOL_FIND_DESCRIPTION | (see code) | MCP tool description for find |

`Confidence Decay`

Memories decay in relevance over time unless validated. This helps surface fresh, actively-used knowledge over stale information.

How it works:

- Uses a half-life algorithm: decay = 0.5 ^ (age_in_days / half_life)- Default half-life is 90 days (configurable viaMEMORY_DECAY_HALF_LIFE_DAYS) - Search scores are multiplied by the decay factor - Stale memories (>90 days) show a warning

Decay examples:

| Age | Decay Factor | Effect | | -------- | ------------ | -------------- | | Today | 1.0 | Full weight | | 90 days | 0.5 | Half weight | | 180 days | 0.25 | Quarter weight |

Example search output:

`Results (decay half-life: 90 days): 1. [score: 0.82, age: 3d, decay: 0.98] JWT tokens should use httpOnly cookies Collection: codebase | ID: mem_abc123 2. [score: 0.45, age: 120d, decay: 0.40] Use localStorage for auth tokens Collection: codebase | ID: mem_ghi789 ⚠️ Stale (120 days) - consider validating or removing`

Refreshing memories:

`bash

`Validate a memory to reset its decay (marks it as still relevant)`


semantic-memory validate

Use validate when you confirm a memory is still accurate and useful. This resets the decay clock.

`How It Works`

`┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ Ollama │────▶│ PGlite │────▶│ pgvector │ │ (embeddings)│ │ (WASM PG) │ │ (HNSW idx) │ └─────────────┘ └─────────────┘ └─────────────┘ │ │ │ │ memories table memory_embeddings │ - id - memory_id (FK) │ - content - embedding vector(1024) │ - metadata (JSONB) │ - collection └──────────────────────────────────────────┘ cosine similarity search`

- Ollama generates embeddings locally with mxbai-embed-large` (1024 dimensions)
- PGlite is Postgres compiled to WASM - no server, runs in your process
- pgvector provides real vector operations with HNSW indexes for fast approximate nearest neighbor search
- Effect-TS handles errors, retries, and resource cleanup properly

Use Cases

$3

Store patterns, architecture decisions, and API quirks as you explore a new codebase. Query before making changes.

$3

Remember facts across AI sessions. No more re-explaining context every conversation.

$3

Pre-load docs into a collection, search before hallucinating answers.

$3

Accumulate findings, connect ideas across sources, build up domain knowledge.

$3

Capture the "why" behind decisions, known gotchas, and tribal knowledge for future team members.

License

MIT

semantic-memory

Local semantic memory with PGlite + pgvector. Budget Qdrant that runs anywhere Bun runs.

Why

You want semantic search for your AI agents but don't want to run a vector database server. This gives you:

Install

``bash

`npm/bun/pnpm`


npm install semantic-memory
Need Ollama for embeddings

brew install ollama
ollama pull mxbai-embed-large

CLI

`bash

`Via npx`


npx semantic-memory store "The auth flow uses JWT tokens stored in httpOnly cookies"
npx semantic-memory find "how does authentication work"
Or install globally

npm install -g semantic-memory
semantic-memory store "React component patterns" --collection code
semantic-memory find "components" --collection code
Full-text search (no embeddings)

semantic-memory find "JWT" --fts
Add metadata

semantic-memory store "API rate limits are 100 req/min" --metadata '{"source":"docs","priority":"high"}'
List, get, delete

semantic-memory list
semantic-memory get 
semantic-memory delete 
Validate a memory (refresh its relevance timestamp)

semantic-memory validate 
Stats

semantic-memory stats


Collections for Context
Collections let you organize memories by purpose. The collection name carries semantic meaning:

`bash

`Codebase analysis - store patterns, architecture notes, API quirks`


semantic-memory store "Auth uses httpOnly JWT cookies with 7-day refresh" --collection codebase
semantic-memory store "The useOptimistic hook requires a reducer pattern" --collection codebase
semantic-memory find "authentication" --collection codebase
Research/learning - concepts, connections, questions

semantic-memory store "Effect-TS uses generators for async, not Promises" --collection research
semantic-memory find "effect async patterns" --collection research
Project onboarding - gotchas, tribal knowledge, "why is it like this"

semantic-memory store "Don't use React.memo on components with children - causes stale closures" --collection gotchas
semantic-memory find "performance issues" --collection gotchas
Personal knowledge - decisions, preferences, breakthroughs

semantic-memory store "Prefer composition over inheritance for React components" --collection decisions
semantic-memory find "react patterns" --collection decisions


Search across all collections or within one:

`bash

`Search everything`


semantic-memory find "authentication"
Search specific collection

semantic-memory find "authentication" --collection codebase


The Qdrant Pattern
The killer feature: tool descriptions are configurable.
Same semantic memory, different agent behaviors:

`bash

`Codebase assistant - searches before generating, stores patterns found`


TOOL_STORE_DESCRIPTION="Store code patterns, architecture decisions, and API quirks discovered while analyzing the codebase. Include file paths and context." \
TOOL_FIND_DESCRIPTION="Search codebase knowledge. Query BEFORE making changes to understand existing patterns." \
semantic-memory find "auth patterns"
Research assistant - accumulates and connects ideas

TOOL_STORE_DESCRIPTION="Store concepts, insights, and connections between ideas. Include source references." \
TOOL_FIND_DESCRIPTION="Search research notes. Use to find related concepts and prior findings." \
semantic-memory find "async patterns"
Onboarding assistant - captures tribal knowledge

TOOL_STORE_DESCRIPTION="Store gotchas, workarounds, and 'why is it like this' explanations. Future devs will thank you." \
TOOL_FIND_DESCRIPTION="Search for known issues and gotchas. Check BEFORE debugging to avoid known pitfalls." \
semantic-memory find "common mistakes"


The description tells the LLM _when_ and _how_ to use the tool. Change the description, change the behavior. No code changes.
OpenCode Integration

Drop this in ~/.config/opencode/tool/semantic-memory.ts:

`typescript import { tool } from "@opencode-ai/plugin"; import { $ } from "bun";

async function run(args: string[]): Promise { const result = await $npx semantic-memory ${args}.text(); return result.trim(); }

`$3`

For project-specific behavior, create a wrapper script or use direnv:

`bash

`.envrc (with direnv)`


export TOOL_STORE_DESCRIPTION="Store patterns found in this Next.js codebase. Include file paths."
export TOOL_FIND_DESCRIPTION="Search codebase patterns. Check before implementing new features."


Or create project-specific OpenCode tools that hardcode the collection:

`Configuration`

All via environment variables:

`Confidence Decay`

Memories decay in relevance over time unless validated. This helps surface fresh, actively-used knowledge over stale information.

How it works:

Decay examples:

| Age | Decay Factor | Effect | | -------- | ------------ | -------------- | | Today | 1.0 | Full weight | | 90 days | 0.5 | Half weight | | 180 days | 0.25 | Quarter weight |

Example search output:

Refreshing memories:

`bash

`Validate a memory to reset its decay (marks it as still relevant)`


semantic-memory validate

Use validate when you confirm a memory is still accurate and useful. This resets the decay clock.

`How It Works`

Use Cases

$3

Store patterns, architecture decisions, and API quirks as you explore a new codebase. Query before making changes.

$3

Remember facts across AI sessions. No more re-explaining context every conversation.

$3

Pre-load docs into a collection, search before hallucinating answers.

$3

Accumulate findings, connect ideas across sources, build up domain knowledge.

$3

Capture the "why" behind decisions, known gotchas, and tribal knowledge for future team members.

License

MIT