# @claude-vector/core

Core vector search engine for semantic code search. This package provides the fundamental building blocks for creating embeddings-based search systems.

## Features
- 🚀 High-performance vector similarity search
- 💾 Built-in caching system
- 🔧 Configurable chunk processing
- 📁 Smart project analysis
- 🎯 Multiple embedding model support
- 🔄 Extensible architecture
## Installation

```bash
npm install @claude-vector/core
```
## Quick Start

```javascript
import { VectorSearchEngine, createDefaultConfig } from '@claude-vector/core';
// Create search engine with default config
const config = createDefaultConfig();
const searchEngine = new VectorSearchEngine(config);
// Initialize and search
await searchEngine.initialize('./your-project');
const results = await searchEngine.search('function definition', { limit: 5 });
console.log(results);
```
## Environment Setup

Set your OpenAI API key:

```bash
export OPENAI_API_KEY="sk-your-api-key-here"
```

Or create a `.env` file:

```env
OPENAI_API_KEY=sk-your-api-key-here
```
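If you use a `.env` file, the key has to be loaded into the environment before the engine reads it. A minimal sketch using the `dotenv` package (an assumption; any env loader works):

```javascript
// Assumes dotenv is installed: npm install dotenv
// Loading it first populates process.env.OPENAI_API_KEY from .env
import 'dotenv/config';

import { VectorSearchEngine, createDefaultConfig } from '@claude-vector/core';

const engine = new VectorSearchEngine(createDefaultConfig());
```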
## Project Analysis

The `ProjectAdapter` helps analyze your project structure and generate appropriate configurations:
```javascript
import { ProjectAdapter } from '@claude-vector/core';
const adapter = new ProjectAdapter('/path/to/project');
// Analyze project type and structure
const projectInfo = await adapter.analyzeProject();
// { type: 'nextjs', language: 'typescript', framework: 'next', ... }
// Get optimized configuration for your project
const config = await adapter.getConfig();
// Get all files matching the configuration
const files = await adapter.getFiles();
```
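The adapter's output can be fed straight into the search engine. A short sketch, assuming the config returned by `getConfig()` matches what `VectorSearchEngine` accepts (as the Quick Start suggests):

```javascript
import { ProjectAdapter, VectorSearchEngine } from '@claude-vector/core';

const adapter = new ProjectAdapter('./my-app');

// Build a config tuned to the detected project type
const config = await adapter.getConfig();

// Index and query the same project with that config
const engine = new VectorSearchEngine(config);
await engine.initialize('./my-app');
const results = await engine.search('authentication middleware', { limit: 3 });
```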
## Configuration

The default configuration object:

```javascript
{
  search: {
    threshold: 0.7,        // Minimum similarity score (0-1)
    maxResults: 10,        // Maximum results to return
    includeMetadata: true
  },
  embeddings: {
    model: 'text-embedding-3-small',
    batchSize: 100,
    dimensions: 1536
  },
  chunks: {
    maxSize: 1000,         // Maximum tokens per chunk
    minSize: 100,          // Minimum tokens per chunk
    overlap: 200,          // Token overlap between chunks
    splitByParagraph: true,
    preserveCodeBlocks: true
  },
  cache: {
    enabled: true,
    ttl: 3600,             // Cache TTL in seconds
    compression: true
  }
}
```
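Individual fields can be overridden before constructing the engine. A minimal sketch, assuming `createDefaultConfig()` returns a plain object that is safe to spread:

```javascript
import { VectorSearchEngine, createDefaultConfig } from '@claude-vector/core';

const defaults = createDefaultConfig();

// Override only the search section; everything else keeps its default
const config = {
  ...defaults,
  search: { ...defaults.search, threshold: 0.8, maxResults: 20 },
};

const engine = new VectorSearchEngine(config);
```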
### Custom Configuration

Create a `.claude-search.config.js` in your project root:
```javascript
export default {
  patterns: {
    include: ['src/**/*.{js,ts}', 'docs/**/*.md'],
    exclude: ['**/*.test.js', '**/__tests__/*']
  },
  chunks: {
    maxSize: 1500,
    overlap: 300
  },
  search: {
    threshold: 0.8
  }
};
```
## API Reference

### VectorSearchEngine

#### Constructor Options
- `openaiApiKey` (string): OpenAI API key
- `embeddingModel` (string): Model to use for embeddings
- `searchThreshold` (number): Minimum similarity score (0-1)
- `maxResults` (number): Maximum results to return
- `cacheEnabled` (boolean): Enable/disable caching
- `cacheTTL` (number): Cache time-to-live in seconds
#### Methods
##### loadIndex(embeddingsPath, chunksPath)
Load pre-computed embeddings and chunks from JSON files.
##### search(query, options)
Search for similar chunks using semantic similarity.
##### findRelated(chunkIndex, options)
Find chunks similar to a given chunk.
##### generateQueryEmbedding(query)
Generate embedding vector for a query string.
##### getStats()
Get index statistics including chunk count, token count, and size estimates.
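Putting the methods together, a typical session against a pre-built index might look like the sketch below (the `chunkIndex` field on results is an assumption inferred from `findRelated`'s signature):

```javascript
import { VectorSearchEngine, createDefaultConfig } from '@claude-vector/core';

const engine = new VectorSearchEngine(createDefaultConfig());

// Reuse embeddings computed earlier instead of re-indexing
await engine.loadIndex('./index/embeddings.json', './index/chunks.json');

const results = await engine.search('parse config file', { limit: 5 });

// Explore the neighborhood of the top hit
const related = await engine.findRelated(results[0].chunkIndex, { limit: 3 });

// Chunk count, token count, and size estimates
console.log(await engine.getStats());
```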
### ProjectAdapter

#### Methods
##### analyzeProject()
Analyze project structure and detect type, framework, and features.
##### getDefaultConfig()
Get default configuration based on project type.
##### loadCustomConfig()
Load custom configuration from project config files.
##### getConfig()
Get merged configuration (default + custom).
##### getFiles(config)
Get all files matching the include/exclude patterns.
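How the config methods relate, sketched under the assumption that `getConfig()` layers the custom file over the defaults (which is how the description above reads):

```javascript
import { ProjectAdapter } from '@claude-vector/core';

const adapter = new ProjectAdapter('/path/to/project');

const defaults = await adapter.getDefaultConfig(); // based on detected project type
const custom = await adapter.loadCustomConfig();   // from .claude-search.config.js, if present
const merged = await adapter.getConfig();          // defaults + custom

// Scan only the files the merged patterns allow
const files = await adapter.getFiles(merged);
```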
## Caching

The built-in cache system helps improve performance by storing search results:
```javascript
import { SimpleCache } from '@claude-vector/core';
const cache = new SimpleCache('./cache', 3600); // 1 hour TTL
// Basic operations
await cache.set('key', { data: 'value' });
const value = await cache.get('key');
await cache.delete('key');
// Maintenance
await cache.cleanup(); // Remove expired entries
const stats = await cache.getStats(); // Get cache statistics
```
## Advanced Usage

### Custom Embedding Models

```javascript
const engine = new VectorSearchEngine({
  embeddingModel: 'text-embedding-3-large',
  // Dimensions change based on model
  config: { embeddings: { dimensions: 3072 } }
});
```
### Batch Processing

For large codebases, process embeddings in batches:
```javascript
const config = {
  embeddings: {
    batchSize: 50,   // Process 50 chunks at a time
    maxRetries: 3,
    retryDelay: 2000
  }
};
```
## TypeScript Support

TypeScript users benefit from the package's JSDoc type definitions:
```typescript
import type {
SearchOptions,
SearchResult,
ProjectConfig
} from '@claude-vector/core';
```
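Because the definitions are JSDoc-based, plain JavaScript can use the same types through `@type` annotations; a small sketch (the option fields shown are assumptions drawn from the configuration reference above):

```javascript
/** @type {import('@claude-vector/core').SearchOptions} */
const options = { limit: 5, threshold: 0.8 }; // fields assumed from the config reference

/** @param {import('@claude-vector/core').SearchResult[]} results */
function logResults(results) {
  for (const result of results) console.log(result);
}
```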
## Best Practices

1. **Pre-compute embeddings**: Generate embeddings once and reuse them (see the sketch after this list)
2. **Enable caching**: Cache search results for repeated queries
3. **Optimize chunk size**: Balance between context and performance
4. **Use appropriate models**: Smaller models for speed, larger for accuracy
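A sketch of the pre-compute pattern from item 1, using `initialize` and `loadIndex` from the API reference; the JSON paths are placeholders for whatever your build step writes:

```javascript
import { VectorSearchEngine, createDefaultConfig } from '@claude-vector/core';

// Build step: embed the project once
const indexer = new VectorSearchEngine(createDefaultConfig());
await indexer.initialize('./your-project');

// Every later run: load the stored index instead of re-embedding
const reader = new VectorSearchEngine(createDefaultConfig());
await reader.loadIndex('./index/embeddings.json', './index/chunks.json');
const hits = await reader.search('database connection pooling', { limit: 5 });
```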
## License

MIT