Advanced memory management system for LLM agents with Letta-inspired features
Memedge is a sophisticated memory system designed for building stateful LLM agents on Cloudflare Workers. Inspired by Letta (formerly MemGPT), it provides structured memory blocks, semantic search, recursive summarization, and privacy-aware memory management.
- Structured Memory Blocks: Organize information into core blocks (human, persona, context) and custom blocks
- Semantic Search: Built-in semantic search using Cloudflare AI embeddings (no external vector DB needed!)
- Archival Memory: Long-term storage with searchable history
- Recursive Summarization: Hierarchical conversation summarization for managing long-term context
- Privacy-Aware: Built-in privacy markers ([PRIVATE], [CONFIDENTIAL], [DO NOT SHARE])
- Edge-Native: Optimized for Cloudflare Workers with Durable Objects
- LLM Tool Integration: Ready-to-use tool definitions for function calling
- SQL-Based: Uses Cloudflare Durable Objects SQL for persistence
- Effect-Based: Leverages Effect for type-safe error handling
```bash
npm install memedge
# or
yarn add memedge
# or
pnpm add memedge
```
```typescript
import { Effect, Layer } from 'effect';
import {
  MemoryManagerLive,
  MemoryManagerService,
  SqlStorageContext
} from 'memedge/memory';

// Setup SQL storage context
const sqlContext = SqlStorageContext.of({ sql: durableObjectSQL });

// Create and use memory manager
const program = Effect.gen(function* () {
  const memoryManager = yield* MemoryManagerService;

  // Initialize database
  yield* memoryManager.initializeDatabase();

  // Write memory
  yield* memoryManager.writeMemory('user_profile', 'Name: Alice, Role: Engineer');

  // Read memory
  const entry = yield* memoryManager.readMemory('user_profile');
  console.log(entry?.text);
});

// Run with context
Effect.runPromise(
  program.pipe(
    Effect.provide(MemoryManagerLive),
    Effect.provide(Layer.succeed(SqlStorageContext, sqlContext))
  )
);
```
```typescript
import { Effect } from 'effect';
import {
  MemoryBlockManagerLive,
  MemoryBlockManagerService
} from 'memedge/memory';

const program = Effect.gen(function* () {
  const manager = yield* MemoryBlockManagerService;

  // Create a memory block
  yield* manager.createBlock(
    'human',
    'Human',
    'Name: Alice\nRole: Software Engineer\nPrefers: Concise responses',
    'core'
  );

  // Insert content
  yield* manager.insertContent(
    'human',
    'Company: TechCorp',
    'end'
  );

  // Replace content
  yield* manager.replaceContent(
    'human',
    'Concise responses',
    'Detailed explanations'
  );

  // Get block
  const block = yield* manager.getBlock('human');
  console.log(block?.content);
});
```
```typescript
import { Effect, Layer } from 'effect';
import {
  searchMemoryBlocks,
  generateEmbedding,
  AiBindingContext,
  MemoryBlockManagerLive,
  MemoryBlockManagerService
} from 'memedge/memory';

const program = Effect.gen(function* () {
  const manager = yield* MemoryBlockManagerService;
  const blocks = yield* manager.getAllBlocks();

  // Search memory blocks semantically
  const results = yield* searchMemoryBlocks(
    'health information',
    blocks,
    5,   // limit
    0.5  // threshold
  );

  results.forEach(r => {
    console.log(`${r.block.label}: ${r.score}`);
    console.log(r.block.content);
  });
});

// Provide AI binding for embeddings
Effect.runPromise(
  program.pipe(
    Effect.provide(MemoryBlockManagerLive),
    Effect.provide(Layer.succeed(AiBindingContext, { ai: env.AI }))
  )
);
```
```typescript
import { Effect } from 'effect';
import {
  createBaseSummary,
  checkRecursiveSummarizationNeeded,
  createRecursiveSummary
} from 'memedge/summaries';

const program = Effect.gen(function* () {
  // Create base summary from messages
  const summaryId = yield* createBaseSummary(messages, persona);

  // Check if recursive summarization is needed
  const check = yield* checkRecursiveSummarizationNeeded();

  if (check.needed && check.summaries) {
    // Create recursive summary
    const recursiveId = yield* createRecursiveSummary(
      check.summaries,
      check.level!,
      persona
    );
    console.log(`Created level ${check.level} summary: ${recursiveId}`);
  }
});
```
Memedge provides ready-to-use tool definitions for LLM function calling:
```typescript
import { generateText } from 'ai';        // e.g. the Vercel AI SDK
import { openai } from '@ai-sdk/openai';
import {
  getMemoryTools,
  getEnhancedMemoryTools,
  getAllMemoryTools
} from 'memedge/tools';

// Basic tools
const basicTools = getMemoryTools();
// { memory_read, memory_write }

// Enhanced Letta-style tools
const enhancedTools = getEnhancedMemoryTools();
// {
//   memory_get_block, memory_insert, memory_replace,
//   memory_rethink, memory_create_block, memory_list_blocks,
//   archival_insert, archival_search, memory_search
// }

// All tools (enhanced + legacy)
const allTools = getAllMemoryTools();

// Use with your LLM provider
const response = await generateText({
  model: openai('gpt-4'),
  tools: allTools,
  // ...
});
```
```typescript
import { Effect } from 'effect';
import {
  executeMemoryGetBlock,
  executeMemoryInsert,
  executeMemorySearch
} from 'memedge/tools';

// Execute a tool based on the LLM's tool call (inside an Effect.gen handler)
const handleToolCall = (toolCall: { name: string; args: any }) =>
  Effect.gen(function* () {
    if (toolCall.name === 'memory_get_block') {
      const result = yield* executeMemoryGetBlock(toolCall.args);
      // { block_id, label, content, updated_at }
    }

    if (toolCall.name === 'memory_insert') {
      const result = yield* executeMemoryInsert(toolCall.args);
      // { success, message }
    }

    if (toolCall.name === 'memory_search') {
      const result = yield* executeMemorySearch({
        ...toolCall.args,
        useSemanticSearch: true
      });
      // { results: [{ block_id, label, content, score }] }
    }
  });
```
Memory blocks are structured containers for different types of information:
- Core Blocks: Always loaded into context (human, persona, context, custom)
- Archival Blocks: Searchable long-term storage, loaded on-demand
- Operations: insert, replace, and rethink (a complete rewrite of a block); see the sketch below
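
A minimal sketch of the distinction, reusing the `createBlock` API shown earlier. Two assumptions to note: the `'archival'` block type value and the `rethinkContent` method name are illustrative guesses rather than confirmed signatures (the rethink operation is otherwise exposed through the `memory_rethink` tool).

```typescript
import { Effect } from 'effect';
import { MemoryBlockManagerService } from 'memedge/memory';

const program = Effect.gen(function* () {
  const manager = yield* MemoryBlockManagerService;

  // Core block: always loaded into the agent's context window
  yield* manager.createBlock('persona', 'Persona', 'Helpful, concise assistant', 'core');

  // Archival block: long-term storage, retrieved on demand via search
  // (assumes 'archival' is a valid block type for createBlock)
  yield* manager.createBlock('notes_q1', 'Q1 notes', 'Discussed roadmap and hiring...', 'archival');

  // Rethink: complete rewrite of a block's content
  // (hypothetical method name; the operation is exposed as the memory_rethink tool)
  yield* manager.rethinkContent('persona', 'Helpful assistant who gives detailed, sourced answers');
});
```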
Memedge supports privacy-aware memory with built-in markers:
```typescript
// Store private information
yield* memoryManager.writeMemory(
  'health_info',
  '[PRIVATE] Allergic to penicillin. [CONFIDENTIAL] Therapy on Tuesdays.'
);
// The system respects these markers when sharing information
```
Supported markers (a filtering sketch follows this list):
- [PRIVATE] - Personal information
- [CONFIDENTIAL] - Confidential data
- [DO NOT SHARE] - Explicitly not shareable
- [PERSONAL] - Personal notes
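
Memedge applies these markers internally when deciding what can be shared; the snippet below is only an illustrative sketch of how marker-based filtering can work, not the library's actual implementation.

```typescript
// Illustrative only: a simple marker-aware filter, not Memedge's internal logic.
const PRIVACY_MARKERS = ['[PRIVATE]', '[CONFIDENTIAL]', '[DO NOT SHARE]', '[PERSONAL]'];

// Keeps only the sentences that carry no privacy marker,
// so marked details are never surfaced to other parties.
function redactPrivate(text: string): string {
  return text
    .split(/(?<=\.)\s+/)
    .filter(sentence => !PRIVACY_MARKERS.some(marker => sentence.includes(marker)))
    .join(' ');
}

redactPrivate('[PRIVATE] Allergic to penicillin. Lives in Berlin.');
// => 'Lives in Berlin.'
```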
Memedge uses a simple but effective approach to semantic search; a sketch follows the list below:
1. Embeddings Generation: Uses Cloudflare AI (@cf/baai/bge-base-en-v1.5, 768 dimensions)
2. Storage: Embeddings stored as JSON in SQL (no separate vector DB!)
3. Search: Cosine similarity computed in-worker
4. Performance: Sub-50ms search latency for typical queries
5. Cost: Included in Cloudflare Workers costs
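
The search path is small enough to sketch. The snippet below illustrates the steps above rather than Memedge's internal code: it generates a 768-dimensional embedding through a Workers AI binding (`env.AI` is assumed to be configured in your Worker) and ranks stored vectors with in-worker cosine similarity.

```typescript
// Sketch of the approach described above, not the library's internal implementation.
// `Ai` is the Workers AI binding type from @cloudflare/workers-types.

async function embed(env: { AI: Ai }, text: string): Promise<number[]> {
  // @cf/baai/bge-base-en-v1.5 returns 768-dimensional vectors
  const res = await env.AI.run('@cf/baai/bge-base-en-v1.5', { text: [text] });
  return (res as { data: number[][] }).data[0];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored embeddings (e.g. parsed from their JSON column) against a query, entirely in the Worker
function rank(
  query: number[],
  stored: { id: string; embedding: number[] }[],
  limit = 5,
  threshold = 0.5
) {
  return stored
    .map(e => ({ id: e.id, score: cosineSimilarity(query, e.embedding) }))
    .filter(r => r.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```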
Hierarchical conversation summarization for managing long-term context:

```
Level 0: Base Summaries   (20 messages each)
Level 1: Meta-Summaries   (10 x L0)
Level 2: Super-Summaries  (10 x L1)
Level 3: Ultra-Summaries  (10 x L2)
```

This logarithmic approach keeps context manageable even with thousands of messages, as the rough calculation below shows.
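
A minimal illustration of that claim, using the default thresholds from the configuration section further down (20 messages per base summary, 10 summaries per higher level); this approximates the behaviour rather than reproducing the library's exact bookkeeping.

```typescript
// Rough illustration of the hierarchy above using the default thresholds
// (20 messages per L0 summary, 10 summaries per higher level).
function summaryCounts(messageCount: number, baseThreshold = 20, recursiveThreshold = 10): number[] {
  const counts = [Math.floor(messageCount / baseThreshold)]; // level 0
  while (counts[counts.length - 1] >= recursiveThreshold) {
    counts.push(Math.floor(counts[counts.length - 1] / recursiveThreshold));
  }
  return counts;
}

summaryCounts(2000);
// => [100, 10, 1]: 2,000 messages collapse to 100 base summaries,
//    10 meta-summaries, and a single super-summary.
```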
```
                  Memedge System

┌──────────────────┐  ┌──────────────────────────────┐
│  Memory Manager  │  │     Memory Block Manager     │
│   (Legacy KV)    │  │        (Letta-style)         │
│                  │  │                              │
│ • purpose/text   │  │ • Structured blocks          │
│ • Privacy        │  │ • Core + Archival            │
│   markers        │  │ • insert/replace/rethink     │
└──────────────────┘  └──────────────────────────────┘
         │                            │
         └───────────────┬────────────┘
                         ▼
┌─────────────────────────────────────────────────┐
│         Semantic Search (Cloudflare AI)         │
│                                                 │
│ • Generate embeddings (768D)                    │
│ • Store in SQL as JSON                          │
│ • Cosine similarity search                      │
│ • No external vector DB                         │
└─────────────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────┐
│             Recursive Summarization             │
│                                                 │
│ • Base summaries (L0)                           │
│ • Recursive meta-summaries (L1, L2, L3)         │
│ • Hierarchical context compression              │
└─────────────────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────┐
│           Durable Objects SQL Storage           │
│                                                 │
│ • agent_memory (legacy)                         │
│ • memory_blocks (structured)                    │
│ • archival_memory (long-term)                   │
│ • memory_embeddings (vectors)                   │
│ • conversation_summaries_v2 (recursive)         │
└─────────────────────────────────────────────────┘
```
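
The bottom layer of the diagram lists the SQL tables Memedge persists to; they are created for you by `initializeDatabase()`. The snippet below is only a hypothetical sketch of what one such table could look like with the Durable Objects SQL API (column names borrowed from the `memory_get_block` result shown earlier), not the actual schema.

```typescript
import { DurableObject } from 'cloudflare:workers';

// Hypothetical sketch only: the real schema is created by memoryManager.initializeDatabase()
// and may differ. Column names follow the memory_get_block result shown earlier.
export class AgentDurableObject extends DurableObject {
  createMemoryBlocksTable() {
    this.ctx.storage.sql.exec(`
      CREATE TABLE IF NOT EXISTS memory_blocks (
        block_id   TEXT PRIMARY KEY,
        label      TEXT NOT NULL,
        content    TEXT NOT NULL,
        block_type TEXT NOT NULL DEFAULT 'core',  -- 'core' or 'archival'
        updated_at INTEGER NOT NULL
      )
    `);
  }
}
```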
```typescript
const config: SummarizationConfig = {
  baseSummaryThreshold: 20,   // Messages before L0 summary
  recursiveThreshold: 10,     // Summaries before next level
  maxLevel: 3,                // Maximum recursion depth
  recentSummaryCount: 3       // Recent summaries to load
};
```
```typescript
// Search with custom threshold and limit
const results = yield* searchMemoryBlocks(
  query,
  blocks,
  10,   // limit: max results
  0.7   // threshold: minimum similarity score
);
```
See the API documentation for a detailed reference.
```bash
# Run tests
npm test
```
Contributions are welcome! Please read our Contributing Guide for details.
MIT License - see LICENSE file for details.
- Inspired by Letta (MemGPT) - Thank you to the Letta team for pioneering advanced memory systems for LLM agents
- Built for Cloudflare Workers
- Powered by Effect
- Documentation
- Examples
- Changelog
- Roadmap
| Feature | Memedge | Letta |
|---------|---------|-------|
| Architecture | Cloudflare Workers + Durable Objects | Python + PostgreSQL + Vector DB |
| Memory Blocks | ✅ Core + Archival | ✅ Core + Archival |
| Semantic Search | ✅ Built-in (Cloudflare AI) | ✅ External Vector DB |
| Embeddings | 768D, stored in SQL | Configurable, separate DB |
| Latency | ~30-50ms (edge) | ~100-200ms (server) |
| Scalability | Edge-native, globally distributed | Server-based |
| Privacy Markers | ✅ Built-in | ❌ Not included |
| Recursive Summarization | ✅ Hierarchical | ❌ Simple |
| Tool Integration | ✅ Zod schemas | ✅ Pydantic |
| Cost | Included in Workers | Separate services |
| Visual Tools | ❌ Code-first | ✅ Agent Dev Environment |
---
Made with ❤️ for the LLM agent community