# @ai-orchestration/core

Modular AI orchestration framework for multiple LLM providers.

A modular and extensible framework for orchestrating multiple AI/LLM providers in a consistent, configurable way.

> Note: This is an npm package. API keys must be configured in the project that uses this package (via environment variables or a `.env` file in that project), not in the package itself.
## Features

- **Plugin-based architecture**: Add new providers or strategies without modifying the core
- **Multiple selection strategies**: Round-robin, priority, fallback, weighted, health-aware
- **Native streaming**: Full support for streaming responses using `ReadableStream`
- **Automatic fallback**: Automatically tries other providers if one fails
- **Health checks**: Provider health monitoring with latency metrics
- **Runtime agnostic**: Compatible with Node.js and Bun
- **Declarative API**: Simple configuration via JSON/JS objects
- **Type-safe**: Fully typed with TypeScript
## Installation

```bash
npm install @ai-orchestration/core
```
This package supports both ESM (ECMAScript Modules) and CommonJS, so you can use it in any Node.js project:
ESM Projects (recommended):
```typescript
import { createOrchestrator } from '@ai-orchestration/core';
```
CommonJS Projects:
```javascript
const { createOrchestrator } = require('@ai-orchestration/core');
```
The package automatically exports the correct format based on your project's module system.
## Quick Start

```typescript
import { createOrchestrator } from '@ai-orchestration/core';

// API keys should come from environment variables configured in YOUR project
// Example: export GROQ_API_KEY="your-key", or use dotenv in your project
const orchestrator = createOrchestrator({
  providers: [
    {
      id: 'groq-1',
      type: 'groq',
      apiKey: process.env.GROQ_API_KEY!, // Configure this variable in your project
      model: 'llama-3.3-70b-versatile',
    },
    {
      id: 'openrouter-1',
      type: 'openrouter',
      apiKey: process.env.OPENROUTER_API_KEY!,
      model: 'openai/gpt-3.5-turbo',
    },
  ],
  strategy: {
    type: 'round-robin',
  },
});

// Simple chat
const response = await orchestrator.chat([
  { role: 'user', content: 'Hello, world!' },
]);
console.log(response.content);

// Streaming chat
const stream = await orchestrator.chatStream([
  { role: 'user', content: 'Tell me a story' },
]);

const reader = stream.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(value.content);
}
```
You can also build the orchestrator manually, registering providers and strategies yourself:

```typescript
import {
  Orchestrator,
  RoundRobinStrategy,
  GroqProvider,
  OpenRouterProvider,
} from '@ai-orchestration/core';

// Create strategy
const strategy = new RoundRobinStrategy();

// Create orchestrator
const orchestrator = new Orchestrator(strategy);

// Register providers
// API keys should come from environment variables configured in YOUR project
orchestrator.registerProvider(
  new GroqProvider({
    id: 'groq-1',
    apiKey: process.env.GROQ_API_KEY!,
  })
);

orchestrator.registerProvider(
  new OpenRouterProvider({
    id: 'openrouter-1',
    apiKey: process.env.OPENROUTER_API_KEY!,
  })
);

// Use it
const response = await orchestrator.chat([
  { role: 'user', content: 'Hello!' },
]);
```
## Selection Strategies

### Round-Robin

Cycles through providers in order:
```typescript
{
  strategy: {
    type: 'round-robin',
  },
}
```
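For intuition, round-robin selection amounts to a cycling index over the provider list. The class below is an illustrative sketch only, not the package's internal `RoundRobinStrategy`:

```typescript
// Illustration only: a minimal round-robin selector showing the cycling
// behavior. The package's internal implementation may differ.
class RoundRobin<T> {
  private index = 0;
  constructor(private readonly items: T[]) {}

  next(): T {
    const item = this.items[this.index];
    // Advance and wrap around at the end of the list
    this.index = (this.index + 1) % this.items.length;
    return item;
  }
}

const rr = new RoundRobin(['groq-1', 'openrouter-1']);
// Successive calls cycle: 'groq-1', 'openrouter-1', 'groq-1', ...
console.log(rr.next(), rr.next(), rr.next());
```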
### Priority

Selects providers based on priority (lower number = higher priority):
```typescript
{
  strategy: {
    type: 'priority',
    priorities: {
      'groq-1': 1,
      'openrouter-1': 2,
      'gemini-1': 3,
    },
  },
}
```
### Fallback

Tries providers in order until one works:
```typescript
{
  strategy: {
    type: 'fallback',
    order: ['groq-1', 'openrouter-1', 'gemini-1'],
  },
}
```
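The behavior is equivalent to trying a list of async calls in order and returning the first success. A generic sketch of that idea (not the package's implementation):

```typescript
// Illustration only: try async tasks in order, return the first success,
// and rethrow the last error if every task fails.
async function firstSuccessful<T>(tasks: Array<() => Promise<T>>): Promise<T> {
  let lastError: unknown;
  for (const task of tasks) {
    try {
      return await task();
    } catch (err) {
      lastError = err; // remember the failure and try the next task
    }
  }
  throw lastError;
}

// 'groq-1' fails here, so the result comes from 'openrouter-1'
const result = await firstSuccessful([
  async () => { throw new Error('groq-1 unavailable'); },
  async () => 'response from openrouter-1',
]);
```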
### Weighted

Selection based on weights (useful for load balancing):
```typescript
{
  strategy: {
    type: 'weighted',
    weights: {
      'groq-1': 0.7,
      'openrouter-1': 0.3,
    },
  },
}
```
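With these weights, roughly 70% of selections go to `groq-1`. Weighted random selection can be sketched like this (an illustration, not the package's internal weighted strategy):

```typescript
// Illustration only: pick a key with probability proportional to its weight.
// `rand` is injectable so the behavior can be tested deterministically.
function weightedPick(weights: Record<string, number>, rand: number = Math.random()): string {
  const total = Object.values(weights).reduce((sum, w) => sum + w, 0);
  let r = rand * total;
  for (const [id, weight] of Object.entries(weights)) {
    r -= weight;
    if (r <= 0) return id;
  }
  // Floating-point edge case: fall back to the first key
  return Object.keys(weights)[0];
}

weightedPick({ 'groq-1': 0.7, 'openrouter-1': 0.3 });
```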
A cost-aware variant also considers cost per token:
```typescript
{
  strategy: {
    type: 'weighted',
    costAware: true,
    weights: {
      'groq-1': 1.0,
      'openrouter-1': 1.0,
    },
  },
}
```
### Health-Aware

Selects based on health metrics (latency, success rate):
```typescript
{
  strategy: {
    type: 'health-aware',
    preferLowLatency: true,
    minHealthScore: 0.5,
  },
}
```
## Providers

### Groq

```typescript
{
  id: 'groq-1',
  type: 'groq',
  apiKey: 'your-api-key',
  model: 'llama-3.3-70b-versatile', // optional, default shown
  baseURL: 'https://api.groq.com/openai/v1', // optional
}
```
### OpenRouter

```typescript
{
  id: 'openrouter-1',
  type: 'openrouter',
  apiKey: 'your-api-key',
  model: 'openai/gpt-3.5-turbo', // optional
  baseURL: 'https://openrouter.ai/api/v1', // optional
}
```
### Gemini

```typescript
{
  id: 'gemini-1',
  type: 'gemini',
  apiKey: 'your-api-key',
  model: 'gemini-pro', // optional
  baseURL: 'https://generativelanguage.googleapis.com/v1beta', // optional
}
```
### Cerebras

Cerebras Inference API (OpenAI-compatible). Documentation: inference-docs.cerebras.ai
```typescript
{
  id: 'cerebras-1',
  type: 'cerebras',
  apiKey: 'your-api-key', // Get one at: https://inference-docs.cerebras.ai
  model: 'llama-3.3-70b', // optional, default shown
  baseURL: 'https://api.cerebras.ai/v1', // optional
}
```
Note: Cerebras API requires the User-Agent header to avoid CloudFront blocking. This is included automatically.
### Local (OpenAI-compatible)

For local models that expose an OpenAI-compatible API:
```typescript
{
  id: 'local-1',
  type: 'local',
  baseURL: 'http://localhost:8000',
  model: 'local-model', // optional
  apiKey: 'optional-key', // optional
}
```
## Retries and Timeouts

```typescript
const orchestrator = createOrchestrator({
  providers: [...],
  strategy: {...},
  maxRetries: 3, // Maximum retry attempts (default: number of providers)
  requestTimeout: 30000, // Global timeout in milliseconds (default: 30000)
  retryDelay: 'exponential', // or a number in milliseconds (default: 1000)
});
```
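To make `retryDelay: 'exponential'` concrete, the sketch below shows a typical doubling schedule starting from the documented 1000 ms default. The exact formula and cap the package uses are not documented here, so treat both as assumptions:

```typescript
// Illustration only: exponential backoff doubling from a base delay.
// The 1000 ms base matches the documented default; the 30 s cap is an assumption.
function backoffMs(attempt: number, baseMs = 1000, capMs = 30_000): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// attempt 0 → 1000 ms, attempt 1 → 2000 ms, attempt 2 → 4000 ms, ...
```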
## Circuit Breaker

Automatically disable providers after consecutive failures:
```typescript
const orchestrator = createOrchestrator({
  providers: [...],
  strategy: {...},
  circuitBreaker: {
    enabled: true,
    failureThreshold: 5, // Open the circuit after 5 failures
    resetTimeout: 60000, // Reset after 60 seconds
  },
});
```
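The open/close mechanics behind these two settings can be sketched as follows. This is a simplified illustration using the thresholds shown above; the package's internal implementation (including any half-open state) may differ:

```typescript
// Illustration only: a circuit opens after `failureThreshold` consecutive
// failures and allows traffic again once `resetTimeout` has elapsed.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;

  constructor(
    private readonly failureThreshold = 5,
    private readonly resetTimeout = 60_000,
  ) {}

  recordFailure(now = Date.now()): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) this.openedAt = now;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  isOpen(now = Date.now()): boolean {
    if (this.openedAt === null) return false;
    if (now - this.openedAt >= this.resetTimeout) {
      // Reset window elapsed: allow traffic again (half-open simplified away)
      this.openedAt = null;
      this.failures = 0;
      return false;
    }
    return true;
  }
}
```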
## Health Checks

Enhanced health check configuration:
```typescript
const orchestrator = createOrchestrator({
  providers: [...],
  strategy: {...},
  healthCheck: {
    enabled: true,
    interval: 60000, // Check every 60 seconds
    timeout: 5000, // Health check timeout (default: 5000 ms)
    maxConsecutiveFailures: 3, // Mark unhealthy after 3 failures (default: 3)
    latencyThreshold: 10000, // Max latency in ms (default: 10000 ms)
  },
  // Legacy format still supported:
  // enableHealthChecks: true,
  // healthCheckInterval: 60000,
});
```
Or manually check health:
```typescript
const health = await provider.checkHealth();
console.log(health.healthy, health.latency);
```
## Chat Options

```typescript
const response = await orchestrator.chat(messages, {
  temperature: 0.7,
  maxTokens: 1000,
  topP: 0.9,
  topK: 40,
  stopSequences: ['\n\n'],
  responseLanguage: 'es', // Force response in Spanish
  frequencyPenalty: 0.5, // Reduce repetition
  presencePenalty: 0.3, // Encourage new topics
  seed: 42, // For reproducible outputs
  timeout: 30000, // Request timeout in milliseconds
  user: 'user-123', // User identifier for tracking
});
```
- `temperature`: Controls randomness (0.0 to 2.0)
- `maxTokens`: Maximum tokens in the response
- `topP`: Nucleus sampling threshold
- `topK`: Top-K sampling
- `stopSequences`: Stop generation at these sequences
- `responseLanguage`: Force the response language (see below)
- `frequencyPenalty`: Penalize frequent tokens (-2.0 to 2.0)
- `presencePenalty`: Penalize existing tokens (-2.0 to 2.0)
- `seed`: Seed for reproducible outputs
- `timeout`: Request timeout in milliseconds (overrides the global timeout)
- `user`: User identifier for tracking/rate limiting
## Response Language

You can force the AI to respond in a specific language using the `responseLanguage` option:
```typescript
// Using ISO 639-1 language codes
const response = await orchestrator.chat(messages, {
  responseLanguage: 'es', // Spanish
  // or 'en', 'fr', 'de', 'it', 'pt', 'ja', 'zh', 'ru', etc.
});

// Using full language names
const response2 = await orchestrator.chat(messages, {
  responseLanguage: 'spanish', // Also works
  // or 'english', 'french', 'german', 'italian', etc.
});
```
How it works: When responseLanguage is specified, the framework automatically prepends a system message instructing the model to respond in the specified language. If you already have a system message, the language instruction will be prepended to it.
Supported languages: Spanish, English, French, German, Italian, Portuguese, Japanese, Chinese, Russian, Korean, Arabic, Hindi, Dutch, Polish, Swedish, Turkish (and more via ISO 639-1 codes).
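The prepending behavior described above can be pictured with the sketch below. This is illustrative only, not the package's actual code, and the exact instruction wording the framework injects may differ:

```typescript
// Illustration only: prepend a language instruction as a system message,
// merging with an existing system message when one is already present.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

function withLanguage(messages: ChatMessage[], language: string): ChatMessage[] {
  const instruction = `Respond only in ${language}.`; // wording is an assumption
  const [first, ...rest] = messages;
  if (first?.role === 'system') {
    // Existing system message: prepend the instruction to it
    return [{ role: 'system', content: `${instruction} ${first.content}` }, ...rest];
  }
  // No system message yet: add one at the front
  return [{ role: 'system', content: instruction }, ...messages];
}
```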
## Metrics

Track provider usage, costs, and strategy effectiveness:
```typescript
const orchestrator = createOrchestrator({
  providers: [...],
  strategy: {...},
  enableMetrics: true, // Enabled by default
  onMetricsEvent: (event) => {
    // Optional: real-time event tracking
    console.log('Event:', event.type, event.providerId);
  },
});

// Make some requests...

// Get overall metrics
const metrics = orchestrator.getMetrics().getOrchestratorMetrics();
console.log('Total Requests:', metrics.totalRequests);
console.log('Total Cost:', metrics.totalCost);
console.log('Error Rate:', metrics.errorRate);

// Get provider-specific metrics
const providerMetrics = orchestrator.getMetrics().getProviderMetrics('groq-1');
console.log('Provider Requests:', providerMetrics?.totalRequests);
console.log('Provider Cost:', providerMetrics?.totalCost);
console.log('Success Rate:', (providerMetrics?.successfulRequests ?? 0) / (providerMetrics?.totalRequests ?? 1));

// Get strategy metrics
const strategyMetrics = orchestrator.getMetrics().getStrategyMetrics();
console.log('Selections by Provider:', strategyMetrics.selectionsByProvider);
console.log('Average Selection Time:', strategyMetrics.averageSelectionTime);
```
- **Provider metrics**: Requests, success/failure rates, latency, token usage, costs
- **Strategy metrics**: Selection counts, distribution, selection time
- **Overall metrics**: Total requests, costs, error rates, requests per minute
- **Request history**: Detailed history with filtering options

See `examples/metrics.ts` for a complete example.
## Creating a Custom Provider

```typescript
import { BaseProvider } from '@ai-orchestration/core';
import type {
  ChatMessage,
  ChatOptions,
  ChatResponse,
  ChatChunk,
  ProviderHealth,
  ProviderMetadata,
} from '@ai-orchestration/core';

export class CustomProvider extends BaseProvider {
  readonly id: string;
  readonly metadata: ProviderMetadata;

  constructor(config: CustomConfig) {
    super();
    this.id = config.id;
    this.metadata = {
      id: this.id,
      name: 'Custom Provider',
    };
  }

  async checkHealth(): Promise<ProviderHealth> {
    // Implement health check
  }

  async chat(messages: ChatMessage[], options?: ChatOptions): Promise<ChatResponse> {
    // Implement chat
  }

  async chatStream(messages: ChatMessage[], options?: ChatOptions): Promise<ReadableStream<ChatChunk>> {
    // Implement streaming
  }

  protected formatMessages(messages: ChatMessage[]): unknown {
    // Convert the standard format to the provider's format
  }

  protected parseResponse(response: unknown): ChatResponse {
    // Convert the provider's response to the standard format
  }

  protected parseStream(stream: ReadableStream<unknown>): ReadableStream<ChatChunk> {
    // Convert the provider's stream to the standard format
  }
}
```
## Creating a Custom Strategy

```typescript
import { BaseStrategy } from '@ai-orchestration/core';
import type { AIService, SelectionContext } from '@ai-orchestration/core';

export class CustomStrategy extends BaseStrategy {
  async select(
    providers: AIService[],
    context?: SelectionContext
  ): Promise<AIService> {
    // Implement selection logic
    return providers[0];
  }

  update(provider: AIService, success: boolean, metadata?: unknown): void {
    // Optional hook: update internal state
  }
}
```
## Project Structure

```
src/
├── core/
│   ├── interfaces.ts      # Main interfaces
│   ├── types.ts           # Shared types
│   ├── orchestrator.ts    # Orchestrator core
│   └── errors.ts          # Custom error classes
├── providers/
│   ├── base.ts            # Base class for providers
│   ├── groq.ts
│   ├── openrouter.ts
│   ├── gemini.ts
│   ├── cerebras.ts
│   └── local.ts
├── strategies/
│   ├── base.ts            # Base class for strategies
│   ├── round-robin.ts
│   ├── priority.ts
│   ├── fallback.ts
│   ├── weighted.ts
│   └── health-aware.ts
├── factory/
│   └── index.ts           # Factory for declarative creation
└── index.ts               # Main entry point
```
## Design Principles

- **Single Responsibility**: Each class has a single responsibility
- **Open/Closed Principle**: Extensible without modifying the core
- **Plugin-based architecture**: Providers and strategies are plugins
- **Composition over inheritance**: Preference for composition
- **Configuration over hard-coding**: Declarative configuration
- **Declarative APIs**: Simple and expressive APIs
## Development

```bash
# Install dependencies
npm install
```

### Testing
#### Quick Test (No API Keys Required)
Test the framework with mock providers without needing API keys:
```bash
npm run test:mock
```

#### Test with Real Providers
Note: The `@ai-orchestration/core` package does not include `.env` files. Environment variables must be configured in your project or in the examples.

1. Set environment variables:
```bash
export GROQ_API_KEY="your-key"
export OPENROUTER_API_KEY="your-key"
export GEMINI_API_KEY="your-key"
export CEREBRAS_API_KEY="your-key"
```

2. Run tests:
```bash
npm run test:local
```

### Using in Another Project
#### Method 1: npm link (Recommended)
```bash
# In this directory (ai-orchestration)
npm run link

# In your other project
npm link @ai-orchestration/core
```

Now you can import normally:
```typescript
import { createOrchestrator } from '@ai-orchestration/core';
```

#### Method 2: npm pack
```bash
# In this directory
npm run pack:local

# In your other project
npm install ./ai-orchestration-core-0.1.0.tgz
```

## Requirements
- Node.js: >= 18.0.0 (for native ReadableStream and test runner)
- TypeScript: 5.3+ (already included in devDependencies)
## Examples

See the `examples/` directory for more code examples:

- `basic.ts` - Basic usage example
- `strategies.ts` - Strategy examples
- `test-local.ts` - Testing with real providers
- `test-mock.ts` - Testing with mock providers
- `chat-app/` - Full chat application example

## License

MIT
## Contributing

See `CONTRIBUTING.md` for guidelines on contributing to this project.

## Documentation

- `ARCHITECTURE.md` - Detailed architecture documentation
- `CHANGELOG.md` - Version history and changes