A complete Retrieval-Augmented Generation (RAG) system using pgvector, LangChain, and LangGraph for Node.js applications, with dynamic embedding and model providers, structured data queries, and chat history. Supports OpenAI, Anthropic, HuggingFace, Azure, and Google AI.
## 📦 Installation

`bash
npm install rag-system-pgvector
`
Choose your AI provider (one or more):

`bash
npm install @langchain/openai # For OpenAI
npm install @langchain/anthropic # For Anthropic Claude
npm install @langchain/azure-openai # For Azure OpenAI
npm install @langchain/google-genai # For Google AI
npm install @langchain/community # For HuggingFace, Ollama, etc.
`
## 🚀 Quick Start
### Basic Usage with OpenAI
`javascript
import { RAGSystem } from 'rag-system-pgvector';
import { OpenAIEmbeddings, ChatOpenAI } from '@langchain/openai';
// Create provider instances
const embeddings = new OpenAIEmbeddings({
openAIApiKey: 'your-openai-api-key',
modelName: 'text-embedding-ada-002',
});
const llm = new ChatOpenAI({
openAIApiKey: 'your-openai-api-key',
modelName: 'gpt-4',
temperature: 0.7,
});
// Initialize RAG system
const rag = new RAGSystem({
database: {
host: 'localhost',
database: 'your_db',
username: 'postgres',
password: 'your_password'
},
embeddings: embeddings,
llm: llm,
embeddingDimensions: 1536,
});
await rag.initialize();
// Add documents and query
await rag.addDocuments(['./docs/file1.pdf', './docs/file2.txt']);
// Simple query
const result = await rag.query("What is the main topic?");
console.log(result.answer);
// Query with structured data for precise responses
const structuredResult = await rag.query("Tell me about iPhone features", {
structuredData: {
intent: "product_information",
entities: { product: "iPhone", category: "smartphone" },
constraints: ["Focus on latest features", "Include specifications"],
responseFormat: "structured_list"
}
});
console.log(structuredResult.answer);
`
### Mixing Providers (OpenAI Embeddings + Anthropic Chat)
`javascript
import { RAGSystem } from 'rag-system-pgvector';
import { OpenAIEmbeddings } from '@langchain/openai';
import { ChatAnthropic } from '@langchain/anthropic';
// Use OpenAI for embeddings, Anthropic for chat
const embeddings = new OpenAIEmbeddings({
openAIApiKey: 'your-openai-api-key',
modelName: 'text-embedding-ada-002',
});
const llm = new ChatAnthropic({
anthropicApiKey: 'your-anthropic-api-key',
modelName: 'claude-3-haiku-20240307',
temperature: 0.7,
});
const rag = new RAGSystem({
database: { /* your config */ },
embeddings: embeddings,
llm: llm,
embeddingDimensions: 1536,
});
`
### Local Models (HuggingFace Embeddings + Ollama)
`javascript
import { RAGSystem } from 'rag-system-pgvector';
import { HuggingFaceTransformersEmbeddings } from '@langchain/community/embeddings/hf_transformers';
import { Ollama } from '@langchain/community/llms/ollama';
// Use local models (no API keys required)
const embeddings = new HuggingFaceTransformersEmbeddings({
modelName: 'sentence-transformers/all-MiniLM-L6-v2',
});
const llm = new Ollama({
baseUrl: 'http://localhost:11434',
model: 'llama2',
});
const rag = new RAGSystem({
database: { /* your config */ },
embeddings: embeddings,
llm: llm,
embeddingDimensions: 384, // all-MiniLM-L6-v2 dimensions
});
`
### Processing Documents from Buffers
`javascript
import fs from 'fs';
import { DocumentProcessor } from 'rag-system-pgvector/utils';
const processor = new DocumentProcessor();
// Process document from Buffer
const buffer = fs.readFileSync('document.pdf');
const result = await processor.processDocumentFromBuffer(
buffer,
'document.pdf',
'pdf',
{ source: 'api-upload', category: 'research' }
);
console.log(result.chunks); // Processed chunks with embeddings
`
### Processing Documents from URLs
`javascript
import { DocumentProcessor } from 'rag-system-pgvector/utils';
const processor = new DocumentProcessor();
// Process single URL
const result = await processor.processDocumentFromUrl(
'https://example.com/document.pdf',
{ source: 'web-crawl', priority: 'high' }
);
// Process multiple URLs
const urls = [
'https://example.com/doc1.pdf',
'https://example.com/doc2.html',
'https://example.com/doc3.md'
];
const results = await processor.processDocumentsFromUrls(urls, {
source: 'batch-import',
maxConcurrent: 3
});
console.log(`Processed ${results.successful.length} documents`);
`
## 🎯 Structured Data Queries (New in v2.2.0)
The RAG system now supports structured JSON data alongside natural language queries for more precise and contextual responses.
### Product Information Query
`javascript
const result = await rag.query("Tell me about iPhone features", {
structuredData: {
intent: "product_information",
entities: {
product: "iPhone",
category: "smartphone",
brand: "Apple"
},
constraints: [
"Focus on latest model features",
"Include technical specifications"
],
context: {
userType: "potential_buyer",
priceRange: "premium"
},
responseFormat: "structured_list"
}
});
`
### Troubleshooting Query
`javascript
const result = await rag.query("My device won't connect to WiFi", {
structuredData: {
intent: "troubleshooting",
entities: {
issue_type: "connectivity",
device_category: "mobile",
problem_area: "wifi"
},
constraints: [
"Provide step-by-step solution",
"Include alternative methods"
],
responseFormat: "step_by_step_guide"
}
});
`
### Comparison Query
`javascript
const result = await rag.query("Compare iPhone vs Samsung Galaxy", {
structuredData: {
intent: "comparison",
entities: {
item1: "iPhone",
item2: "Samsung Galaxy"
},
constraints: [
"Compare key specifications",
"Highlight main differences"
],
responseFormat: "comparison_table"
}
});
`
### Follow-up Questions with Chat History
`javascript
const result = await rag.query("What about the camera quality?", {
chatHistory: [
{ role: 'user', content: 'Tell me about iPhone features' },
{ role: 'assistant', content: 'The iPhone offers excellent features...' }
],
structuredData: {
intent: "follow_up_question",
entities: {
topic: "camera",
context_reference: "previous_iphone_discussion"
},
responseFormat: "detailed_explanation"
}
});
`
### StructuredData Interface
`typescript
interface StructuredData {
intent: string; // Query intent/category (required)
entities?: { // Named entities and values
[key: string]: string | number;
};
constraints?: string[]; // Requirements/constraints
context?: { // Additional context
[key: string]: string | number | boolean;
};
responseFormat?: string; // Desired response format
}
`
### Common Intent Types
- `product_information`: Product details and specifications
- `troubleshooting`: Problem-solving and technical support
- `comparison`: Comparing multiple items
- `how_to_guide`: Step-by-step instructions (see the example below)
- `explanation`: Detailed explanations
- `follow_up_question`: Context-aware follow-ups
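A `how_to_guide` query follows the same pattern as the examples above; here is a minimal sketch (the query text and constraint are illustrative):

`javascript
// Illustrative how_to_guide query; the intent and responseFormat
// values come from the lists in this section.
const guide = await rag.query("How do I set up the device?", {
  structuredData: {
    intent: "how_to_guide",
    constraints: ["Keep steps short"],
    responseFormat: "step_by_step_guide"
  }
});
console.log(guide.answer);
`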
### Supported Response Formats
- `structured_list`: Organized bullet points
- `step_by_step_guide`: Numbered instructions
- `comparison_table`: Side-by-side comparison
- `detailed_explanation`: Comprehensive explanation
- `bullet_points`: Simple bullet format
- `json_format`: Structured JSON response
### Document Filtering by User and Knowledgebot
`javascript
import { RAGSystem } from 'rag-system-pgvector';
import { DocumentProcessor } from 'rag-system-pgvector/utils';
const rag = new RAGSystem(config);
const processor = new DocumentProcessor();
await rag.initialize();
// Add documents with user/knowledgebot metadata
const documentData = await processor.processDocumentFromBuffer(
buffer,
'user-manual.pdf',
'pdf',
{
userId: 'user_123',
knowledgebotId: 'tech_support_bot',
department: 'engineering',
priority: 'high'
}
);
await rag.documentStore.saveDocument(documentData);
// Query with user filtering
const userResults = await rag.query('What technical info is available?', {
userId: 'user_123',
limit: 5
});
// Query with knowledgebot filtering
const botResults = await rag.query('Help with technical issues', {
knowledgebotId: 'tech_support_bot'
});
// Query with multiple filters
const filteredResults = await rag.query('Show important documents', {
userId: 'user_123',
filter: {
priority: 'high',
department: 'engineering'
}
});
// Direct search with filtering
const searchResults = await rag.searchDocumentsByUserId(
'documentation',
'user_123'
);
// Get all documents for a specific user
const userDocs = await rag.getDocumentsByUserId('user_123');
`
### Chat History
Enable multi-turn conversations with persistent chat history stored in PostgreSQL.
#### Basic Chat History
`javascript
// First query
const result1 = await rag.query('What is machine learning?');
// Follow-up with context
const result2 = await rag.query('Can you give me examples?', {
chatHistory: result1.chatHistory
});
// Another follow-up
const result3 = await rag.query('Which one is most popular?', {
chatHistory: result2.chatHistory
});
`
#### Session Persistence
`javascript
const sessionId = 'user_conversation_123';
// Query with automatic session save/load
const result = await rag.query('What is machine learning?', {
sessionId: sessionId,
persistSession: true, // Auto-save after query
userId: 'user_456',
knowledgebotId: 'tech_bot'
});
// Continue conversation (automatically loads history)
const result2 = await rag.query('Tell me more', {
sessionId: sessionId,
persistSession: true
});
// Load session manually
const session = await rag.loadSession(sessionId);
console.log(`Session has ${session.messageCount} messages`);
// Get all user sessions
const userSessions = await rag.getUserSessions('user_456');
console.log(`User has ${userSessions.length} sessions`);
// Get session statistics
const stats = await rag.getSessionStats({ userId: 'user_456' });
console.log(`Total messages: ${stats.totalMessages}`);
`
#### History Summarization
`javascript
// Long conversations are automatically managed
const result = await rag.query('Complex question', {
sessionId: sessionId,
persistSession: true,
maxHistoryLength: 20 // Keeps recent 20 messages
});
`
#### Testing Chat Features
`bash
# Basic chat history
npm run test:chat:basic

# Session management
npm run test:chat:session

# History summarization
npm run test:chat:summarization

# Session persistence
npm run test:chat:persistence
`
Documentation:
- 📖 Chat History Implementation Guide
- 📖 Session Persistence Guide
- 📖 Chat History Summarization
## 📖 API Documentation
### DocumentProcessor
The DocumentProcessor class provides powerful document processing capabilities for files, buffers, and URLs.
#### Buffer Processing Methods
##### processDocumentFromBuffer(buffer, fileName, fileType, metadata = {})
Process a document directly from a memory buffer.
`javascript
import { DocumentProcessor } from 'rag-system-pgvector/utils';
const processor = new DocumentProcessor();
const buffer = Buffer.from('This is a test document', 'utf8');
const result = await processor.processDocumentFromBuffer(
buffer,
'test.txt',
'txt',
{ source: 'api', category: 'test' }
);
// Returns:
// {
// title: 'Test Document',
// content: 'This is a test document',
// chunks: [...], // Array of processed chunks with embeddings
// metadata: { ... },
// fileType: 'txt',
// filePath: 'test.txt'
// }
`
Parameters:
- buffer (Buffer): The document content as a Buffer object
- fileName (string): Name of the file (used for metadata)
- fileType (string): File type ('pdf', 'docx', 'txt', 'html', 'md', 'json')
- metadata (object): Additional metadata to attach to the document
Supported Buffer Types:
- TXT: Plain text files
- HTML: HTML documents (extracts text content)
- Markdown: Markdown files
- JSON: JSON files (converts to readable text)
##### extractTextFromBuffer(buffer, fileType)
Extract raw text from a buffer without processing into chunks.
`javascript
const text = await processor.extractTextFromBuffer(buffer, 'html');
console.log(text); // Extracted plain text
`
#### URL Processing Methods
##### processDocumentFromUrl(url, metadata = {})
Download and process a document from a URL.
`javascript
const result = await processor.processDocumentFromUrl(
'https://example.com/document.pdf',
{
source: 'web-crawl',
priority: 'high',
category: 'research'
}
);
// Automatically detects file type from URL and content headers
// Downloads to temp directory and processes
`
Parameters:
- url (string): HTTP/HTTPS URL to download from
- metadata (object): Additional metadata for the document
Features:
- Automatic file type detection from URL extension and Content-Type headers
- Temporary file handling (auto-cleanup)
- Support for redirects and various HTTP response types
- Comprehensive error handling (see the sketch below)
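For example, a failed download surfaces as a normal exception (a minimal sketch; the exact error messages are implementation-specific):

`javascript
try {
  await processor.processDocumentFromUrl('https://example.com/missing.pdf');
} catch (error) {
  // Network failures, HTTP errors, and unsupported types all land here
  console.error('URL processing failed:', error.message);
}
`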
##### processDocumentsFromUrls(urls, options = {})
Process multiple URLs in parallel with concurrency control.
`javascript
const urls = [
'https://site1.com/doc1.pdf',
'https://site2.com/doc2.html',
'https://site3.com/doc3.md'
];
const results = await processor.processDocumentsFromUrls(urls, {
maxConcurrent: 3, // Process up to 3 URLs simultaneously
metadata: { batch: 'import-2024' },
timeout: 30000, // 30 second timeout per URL
retries: 2 // Retry failed downloads
});
// Returns:
// {
// successful: [...], // Array of successfully processed documents
// failed: [...], // Array of failed URLs with error details
// total: 3,
// successCount: 2,
// failureCount: 1
// }
`
Options:
- maxConcurrent (number): Maximum concurrent downloads (default: 5)
- metadata (object): Metadata applied to all documents
- timeout (number): Timeout per URL in milliseconds
- retries (number): Number of retry attempts for failed downloads (see the sketch below for inspecting failures)
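Based on the return shape shown above, a batch run can be summarized and its failures inspected (a sketch; the exact shape of each `failed` entry is assumed to carry the URL and error details):

`javascript
const { successful, failed } = await processor.processDocumentsFromUrls(urls, {
  maxConcurrent: 3,
  retries: 2
});
console.log(`Processed ${successful.length} of ${urls.length} documents`);
for (const failure of failed) {
  // Log each failure for a later retry or manual review
  console.warn('Failed URL:', failure);
}
`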
#### Error Handling
All methods include comprehensive error handling:
`javascript
try {
const result = await processor.processDocumentFromBuffer(buffer, 'test.pdf', 'pdf');
} catch (error) {
if (error.message.includes('Buffer is empty')) {
console.log('Empty buffer provided');
} else if (error.message.includes('Unsupported file type')) {
console.log('File type not supported for buffer processing');
} else {
console.log('Processing error:', error.message);
}
}
`
#### Integration with RAG System
Use processed documents with the RAG system:
`javascript
import fs from 'fs';
import { RAGSystem } from 'rag-system-pgvector';
import { DocumentProcessor } from 'rag-system-pgvector/utils';
const rag = new RAGSystem(config);
const processor = new DocumentProcessor();
await rag.initialize();
// Process from buffer
const buffer = fs.readFileSync('document.pdf');
const processed = await processor.processDocumentFromBuffer(buffer, 'doc.pdf', 'pdf');
// Add to RAG system
await rag.documentStore.saveDocument(processed);
// Process from URL and add to RAG
const urlProcessed = await processor.processDocumentFromUrl('https://example.com/doc.html');
await rag.documentStore.saveDocument(urlProcessed);
// Now query across all documents
const answer = await rag.query('What information is available?');
`
## 🌐 With Web Interface
`javascript
const rag = new RAGSystem({
// ... configuration
server: { port: 3000, enableWebUI: true }
});
await rag.initialize();
await rag.startServer();
// Visit http://localhost:3000
`
## 📚 Documentation
- 📖 Complete Package Documentation - Full API reference and examples
- 🔧 Integration Guide - Step-by-step integration examples
- 🎯 Examples - Ready-to-run examples
## ⚡ Quick Examples
Run the included examples:
`bash
# Basic usage example
npm run example:basic

# Web server example
npm run example:server

# Advanced integration example
npm run example:advanced

# Usage patterns overview
npm run example:patterns
`
## 🛠️ Development & Contributing
For local development and contributions:
### Prerequisites
- Node.js v18+
- PostgreSQL v12+ with pgvector extension
- An API key for your chosen AI provider (e.g., OpenAI)
### Local Setup
`bash
# Clone and install
git clone https://github.com/yourusername/rag-system-pgvector.git
cd rag-system-pgvector
npm install

# Configure environment
cp .env.example .env
# Edit .env with your credentials

# Initialize database
npm run setup

# Start development
npm run dev
`
### Running Examples
`bash
# Run examples
npm run example:basic

# Run with web interface
npm run example:server
`
### REST API Endpoints
#### Upload Document
`bash
curl -X POST http://localhost:3000/documents/upload \
-F "document=@path/to/your/document.pdf" \
-F "title=My Document"
`
#### Process Document from File Path
`bash
curl -X POST http://localhost:3000/documents/process \
-H "Content-Type: application/json" \
-d '{
"filePath": "/path/to/document.pdf",
"title": "My Document"
}'
`
#### Search/Query
`bash
curl -X POST http://localhost:3000/search \
-H "Content-Type: application/json" \
-d '{
"query": "What is the main topic of the document?",
"sessionId": "optional-session-id"
}'
`
#### Get All Documents
`bash
curl http://localhost:3000/documents
`
#### Get Specific Document
`bash
curl http://localhost:3000/documents/{document-id}
`
#### Delete Document
`bash
curl -X DELETE http://localhost:3000/documents/{document-id}
`
### CLI Commands
#### Process Documents from Directory
`bash
npm run process-docs /path/to/documents/folder
`
#### Interactive Search
`bash
npm run search
`
#### Single Query Search
`bash
npm run search "Your question here"
`
## 🏗️ Architecture
### Core Components
1. Document Processor (src/utils/documentProcessor.js)
- Extracts text from various file formats
- Splits documents into chunks with configurable overlap
- Generates embeddings using OpenAI
2. Document Store (src/services/documentStore.js)
- Manages document and chunk storage in PostgreSQL
- Performs vector similarity search using pgvector
- Handles CRUD operations
3. RAG Workflow (src/workflows/ragWorkflow.js)
- LangGraph-based workflow orchestration
- Three-step process: Retrieve → Rerank → Generate
- Supports conversational context
4. API Server (src/index.js)
- Express.js REST API
- File upload handling
- Conversation session management
### Database Schema
`sql
-- Documents table
CREATE TABLE documents (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
title VARCHAR(255) NOT NULL,
content TEXT NOT NULL,
file_path VARCHAR(500),
file_type VARCHAR(50),
metadata JSONB DEFAULT '{}',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Document chunks with embeddings
CREATE TABLE document_chunks (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
document_id UUID REFERENCES documents(id) ON DELETE CASCADE,
chunk_index INTEGER NOT NULL,
content TEXT NOT NULL,
embedding vector(1536),
metadata JSONB DEFAULT '{}',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Search sessions for tracking
CREATE TABLE search_sessions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
query TEXT NOT NULL,
results JSONB,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Chat Sessions for conversation persistence (NEW)
CREATE TABLE chat_sessions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
session_id VARCHAR(255) UNIQUE NOT NULL,
user_id VARCHAR(255),
knowledgebot_id VARCHAR(255),
history JSONB DEFAULT '[]'::jsonb,
metadata JSONB DEFAULT '{}'::jsonb,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
last_activity TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
message_count INTEGER DEFAULT 0
);
-- Indexes for chat sessions
CREATE INDEX idx_chat_sessions_session_id ON chat_sessions(session_id);
CREATE INDEX idx_chat_sessions_user_id ON chat_sessions(user_id);
CREATE INDEX idx_chat_sessions_knowledgebot_id ON chat_sessions(knowledgebot_id);
CREATE INDEX idx_chat_sessions_last_activity ON chat_sessions(last_activity);
`
### RAG Workflow
`mermaid
graph TD
A[Query Input] --> B[Retrieve Node]
B --> C[Rerank Node]
C --> D[Generate Node]
D --> E[Response Output]
B --> F[Vector Search]
F --> G[Similar Chunks]
C --> H[Score Ranking]
H --> I[Top Chunks]
D --> J[LLM Generation]
J --> K[Contextual Response]
`
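For intuition, a three-node graph like this can be wired with `@langchain/langgraph` roughly as follows. This is a hypothetical sketch, not the package's internal code; `vectorSearch`, `rankByScore`, and `generateAnswer` are placeholder functions:

`javascript
import { StateGraph, Annotation, START, END } from '@langchain/langgraph';

// Shared state passed between nodes
const RagState = Annotation.Root({
  query: Annotation(),
  chunks: Annotation(),
  answer: Annotation(),
});

const workflow = new StateGraph(RagState)
  // Each node returns a partial state update
  .addNode('retrieve', async (s) => ({ chunks: await vectorSearch(s.query) }))
  .addNode('rerank', async (s) => ({ chunks: rankByScore(s.chunks) }))
  .addNode('generate', async (s) => ({ answer: await generateAnswer(s) }))
  .addEdge(START, 'retrieve')
  .addEdge('retrieve', 'rerank')
  .addEdge('rerank', 'generate')
  .addEdge('generate', END)
  .compile();

const { answer } = await workflow.invoke({ query: 'What is the main topic?' });
`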
## 🔧 Configuration
The RAG system is highly configurable. You can customize every aspect of its behavior through the constructor configuration object.
### Complete Configuration Example
`javascript
import { RAGSystem } from 'rag-system-pgvector';
import { OpenAIEmbeddings, ChatOpenAI } from '@langchain/openai';
const rag = new RAGSystem({
// ========================================
// 1. Database Configuration (Required)
// ========================================
database: {
host: 'localhost', // Database host
port: 5432, // Database port
database: 'rag_db', // Database name
username: 'postgres', // Database user
password: 'your_password', // Database password
// Connection Pool Settings
max: 10, // Max connections in pool
min: 0, // Min connections in pool
maxUses: Infinity, // Max uses per connection
allowExitOnIdle: false, // Allow pool to close when idle
maxLifetimeSeconds: 0, // Max connection lifetime (0 = unlimited)
idleTimeoutMillis: 10000 // Idle timeout (10 seconds)
},
// ========================================
// 2. AI Provider Configuration (Required)
// ========================================
embeddings: new OpenAIEmbeddings({
openAIApiKey: process.env.OPENAI_API_KEY,
modelName: 'text-embedding-ada-002'
}),
llm: new ChatOpenAI({
openAIApiKey: process.env.OPENAI_API_KEY,
modelName: 'gpt-4',
temperature: 0.7
}),
// ========================================
// 3. Embedding Configuration
// ========================================
embeddingDimensions: 1536, // Dimensions for embeddings
// OpenAI ada-002: 1536
// HuggingFace MiniLM: 384
// Other models: see the dimensions table below
// ========================================
// 4. Vector Store Configuration
// ========================================
vectorStore: {
tableName: 'document_chunks_vector',
vectorColumnName: 'embedding',
contentColumnName: 'content',
metadataColumnName: 'metadata'
},
// ========================================
// 5. Document Processing Configuration
// ========================================
processing: {
chunkSize: 1000, // Characters per chunk
chunkOverlap: 200 // Overlap between chunks
},
// ========================================
// 6. Chat History Configuration (NEW)
// ========================================
chatHistory: {
enabled: true, // Enable chat history feature
maxMessages: 20, // Max messages before management kicks in
maxTokens: 3000, // Max tokens in chat history
summarizeThreshold: 30, // Trigger summarization after N messages
keepRecentCount: 10, // Recent messages to preserve
alwaysKeepFirst: true, // Always keep conversation starter
persistSessions: true, // Store sessions in database
sessionTimeout: 3600000 // Session timeout (1 hour in ms)
}
});
await rag.initialize();
`
### Configuration Sections
#### 1. Database Configuration
Controls PostgreSQL connection and pool behavior:
`javascript
database: {
host: 'localhost', // Where PostgreSQL is running
port: 5432, // PostgreSQL port (default: 5432)
database: 'rag_db', // Your database name
username: 'postgres', // Database user
password: 'your_password', // User password
// Pool Settings (Advanced)
max: 10, // Maximum concurrent connections
min: 0, // Minimum idle connections
idleTimeoutMillis: 10000 // Close idle connections after 10s
}
`
Best Practices:
- Use environment variables for sensitive data
- Set max based on your application's concurrency needs
- Monitor connection pool usage in production
#### 2. AI Provider Configuration
Specify your embedding and language model providers:
OpenAI Example:
`javascript
import { OpenAIEmbeddings, ChatOpenAI } from '@langchain/openai';
embeddings: new OpenAIEmbeddings({
openAIApiKey: process.env.OPENAI_API_KEY,
modelName: 'text-embedding-ada-002'
}),
llm: new ChatOpenAI({
openAIApiKey: process.env.OPENAI_API_KEY,
modelName: 'gpt-4',
temperature: 0.7
})
`
Anthropic Example:
`javascript
import { OpenAIEmbeddings } from '@langchain/openai';
import { ChatAnthropic } from '@langchain/anthropic';
embeddings: new OpenAIEmbeddings({
openAIApiKey: process.env.OPENAI_API_KEY,
modelName: 'text-embedding-ada-002'
}),
llm: new ChatAnthropic({
anthropicApiKey: process.env.ANTHROPIC_API_KEY,
modelName: 'claude-3-sonnet-20240229',
temperature: 0.7
})
`
Local Models Example:
`javascript
import { HuggingFaceTransformersEmbeddings } from '@langchain/community/embeddings/hf_transformers';
import { Ollama } from '@langchain/community/llms/ollama';
embeddings: new HuggingFaceTransformersEmbeddings({
modelName: 'sentence-transformers/all-MiniLM-L6-v2'
}),
llm: new Ollama({
baseUrl: 'http://localhost:11434',
model: 'llama2'
})
`
#### 3. Embedding Dimensions
Match this to your embedding model's output dimensions:
| Model | Dimensions | Provider |
|-------|------------|----------|
| text-embedding-ada-002 | 1536 | OpenAI |
| all-MiniLM-L6-v2 | 384 | HuggingFace |
| text-embedding-3-small | 1536 | OpenAI |
| text-embedding-3-large | 3072 | OpenAI |
`javascript
embeddingDimensions: 1536 // Must match your embedding model
`
Important: If you change embedding models, you must recreate the database schema!
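For example, moving from `text-embedding-ada-002` (1536 dimensions) to `all-MiniLM-L6-v2` (384) means re-initializing with the new dimension and re-ingesting documents so stored vectors match the new model. A sketch, assuming the old vector table has been dropped or recreated first:

`javascript
import { RAGSystem } from 'rag-system-pgvector';
import { HuggingFaceTransformersEmbeddings } from '@langchain/community/embeddings/hf_transformers';

const rag = new RAGSystem({
  database: { /* your config */ },
  embeddings: new HuggingFaceTransformersEmbeddings({
    modelName: 'sentence-transformers/all-MiniLM-L6-v2',
  }),
  llm, // your chat model, unchanged
  embeddingDimensions: 384, // must match the new embedding model
});
await rag.initialize();
await rag.addDocuments(['./docs/file1.pdf']); // re-embed existing content
`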
#### 4. Vector Store Configuration
Customize the vector store table structure:
`javascript
vectorStore: {
tableName: 'document_chunks_vector', // Table name for vectors
vectorColumnName: 'embedding', // Column for embeddings
contentColumnName: 'content', // Column for text content
metadataColumnName: 'metadata' // Column for metadata
}
`
Most users can use the defaults.
#### 5. Document Processing
Control how documents are chunked:
`javascript
processing: {
chunkSize: 1000, // Characters per chunk (500-2000 recommended)
chunkOverlap: 200 // Overlap between chunks (10-20% of chunkSize)
}
`
Guidelines:
- Small chunks (500): Better precision, more chunks, higher cost
- Large chunks (2000): Better context, fewer chunks, lower cost
- Overlap: Prevents context loss at boundaries (typically 10-20%)
Examples:
`javascript
// For technical documentation (needs precision)
processing: { chunkSize: 800, chunkOverlap: 150 }
// For books/long content (needs context)
processing: { chunkSize: 1500, chunkOverlap: 300 }
// For code documentation (needs structure)
processing: { chunkSize: 1000, chunkOverlap: 200 }
`
#### 6. Chat History Configuration (NEW in v2.3.0)
Control conversation history management:
`javascript
chatHistory: {
enabled: true, // Enable/disable chat history
maxMessages: 20, // Start management after N messages
maxTokens: 3000, // Maximum tokens in history
summarizeThreshold: 30, // Summarize after N messages
keepRecentCount: 10, // Recent messages to always keep
alwaysKeepFirst: true, // Keep conversation starter
persistSessions: true, // Store in database
sessionTimeout: 3600000 // 1 hour timeout (in milliseconds)
}
`
Chat History Options Explained:
- enabled: Master switch for chat history feature
- maxMessages: Soft limit before history management activates
- maxTokens: Hard limit on token count (prevents API errors)
- summarizeThreshold: When to trigger LLM-based summarization
- keepRecentCount: Recent messages to preserve during summarization
- alwaysKeepFirst: Preserve conversation context from the beginning
- persistSessions: Save sessions to database for persistence
- sessionTimeout: Milliseconds before session is considered inactive
Preset Configurations:
`javascript
// Minimal (cost-effective)
chatHistory: {
enabled: true,
maxMessages: 10,
maxTokens: 1500,
summarizeThreshold: 15,
keepRecentCount: 5,
persistSessions: false
}
// Balanced (recommended)
chatHistory: {
enabled: true,
maxMessages: 20,
maxTokens: 3000,
summarizeThreshold: 30,
keepRecentCount: 10,
persistSessions: true
}
// Maximum context (for complex conversations)
chatHistory: {
enabled: true,
maxMessages: 40,
maxTokens: 6000,
summarizeThreshold: 50,
keepRecentCount: 20,
persistSessions: true
}
// Disabled (for single-shot queries)
chatHistory: {
enabled: false
}
`
### Environment Variables
Create a .env file for sensitive configuration:
`env
# Database
DB_HOST=localhost
DB_PORT=5432
DB_NAME=rag_db
DB_USER=postgres
DB_PASSWORD=your_secure_password

# OpenAI
OPENAI_API_KEY=sk-...

# Anthropic (optional)
ANTHROPIC_API_KEY=sk-ant-...

# Azure (optional)
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://...

# Processing (optional)
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
EMBEDDING_DIMENSIONS=1536
`
Then use in your code:
`javascript
import 'dotenv/config';
import { RAGSystem } from 'rag-system-pgvector';
import { OpenAIEmbeddings, ChatOpenAI } from '@langchain/openai';
const rag = new RAGSystem({
database: {
host: process.env.DB_HOST,
port: parseInt(process.env.DB_PORT),
database: process.env.DB_NAME,
username: process.env.DB_USER,
password: process.env.DB_PASSWORD
},
embeddings: new OpenAIEmbeddings({
openAIApiKey: process.env.OPENAI_API_KEY
}),
llm: new ChatOpenAI({
openAIApiKey: process.env.OPENAI_API_KEY
}),
embeddingDimensions: parseInt(process.env.EMBEDDING_DIMENSIONS || '1536')
});
`
### Query-Time Options
You can also configure behavior at query time:
`javascript
const result = await rag.query('Your question', {
// Filtering
userId: 'user_123', // Filter by user
knowledgebotId: 'bot_456', // Filter by bot
filter: { category: 'tech' }, // Custom metadata filters
// Retrieval
limit: 10, // Number of chunks to retrieve
threshold: 0.5, // Similarity threshold (0-1)
// Chat History
chatHistory: previousHistory, // Previous conversation
maxHistoryLength: 15, // Override default history length
sessionId: 'session_789', // Session identifier
persistSession: true, // Save session to database
// Context
context: additionalContext, // Extra context to include
metadata: { source: 'api' } // Custom metadata
});
`
### Configuration Best Practices
1. Security: Never hardcode API keys or passwords
2. Environment-Specific: Use different configs for dev/staging/prod
3. Performance: Monitor and adjust based on usage patterns
4. Cost: Balance context size with API costs
5. Testing: Test with different configurations to find optimal settings
## 📊 Performance Optimization
### Database Indexes
The system creates optimized indexes:
`sql
-- For vector similarity search
CREATE INDEX idx_document_chunks_embedding
ON document_chunks USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
-- For document relationships
CREATE INDEX idx_document_chunks_document_id
ON document_chunks(document_id);
`
### Chunking Strategy
- Recursive Character Text Splitter: Preserves semantic boundaries
- Configurable overlap: Ensures context continuity
- Multiple separators: Prioritizes paragraph, then sentence, then word boundaries (see the sketch below)
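To illustrate, the same strategy can be reproduced with LangChain's splitter directly (a sketch; on older LangChain versions the import path is `langchain/text_splitter`):

`javascript
import { RecursiveCharacterTextSplitter } from '@langchain/textsplitters';

const documentText = 'First paragraph...\n\nSecond paragraph...';

// Tries paragraph breaks first, then sentence and word boundaries,
// keeping a 200-character overlap between adjacent chunks.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const chunks = await splitter.splitText(documentText);
`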
## 🧪 Testing
### Process Test Documents
`bash
# Create a test documents directory
mkdir test-docs

# Add some test files (PDF, DOCX, TXT, etc.), then process them
npm run process-docs ./test-docs
`
### Test Search
`bash
# Interactive search
npm run search

# Or a single query
npm run search "What is machine learning?"
`
## 🐛 Troubleshooting
### Common Issues
1. pgvector extension not found
`sql
-- Install pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;
`
2. OpenAI API quota exceeded
- Check your OpenAI API usage
- Consider using alternative embedding models
3. Large document processing fails
- Increase chunk size or reduce document size
- Check memory limits
4. Poor search results
- Lower the similarity threshold (see the sketch after this list)
- Adjust chunk size and overlap
- Verify document content quality
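For issue 4, the query-time options from the Configuration section can be tuned per query (a sketch):

`javascript
// Retrieve more chunks and relax the similarity cutoff
const result = await rag.query('Your question', {
  limit: 10,      // more candidate chunks
  threshold: 0.3, // lower threshold admits weaker matches
});
`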
### Debug Mode
Enable verbose logging by setting:
`env
NODE_ENV=development
`