# @aid-on/unillm

Edge-native unified LLM provider: pure fetch API, minimal dependencies (Zod only), Web Streams, and memory optimization for Cloudflare Workers and other edge computing environments.


unillm is a unified LLM interface for edge computing. It provides a consistent, type-safe API across multiple LLM providers with minimal dependencies and optimized memory usage for edge environments.
## Features
- 🚀 Edge-First: ~50KB bundle size, ~10ms cold start, optimized for edge runtimes
- 🔄 Unified Interface: Single API for Anthropic, OpenAI, Groq, Gemini, Cloudflare, and more
- 🌊 Streaming Native: Built on Web Streams API with nagare integration
- 🎯 Type-Safe: Full TypeScript support with Zod schema validation
- 📦 Minimal Dependencies: Only Zod (~11KB) required
- ⚡ Memory Optimized: Automatic chunking and backpressure handling
## Installation

```bash
npm install @aid-on/unillm
```

```bash
yarn add @aid-on/unillm
```

```bash
pnpm add @aid-on/unillm
```
## Quick Start

```typescript
import { unillm } from "@aid-on/unillm";
// Fluent API with type safety
const response = await unillm()
  .model("openai:gpt-4o-mini")
  .credentials({ openaiApiKey: process.env.OPENAI_API_KEY })
  .temperature(0.7)
  .generate("Explain quantum computing in simple terms");
console.log(response.text);
```
## Streaming

unillm returns an `@aid-on/nagare` `Stream` for reactive stream processing:
```typescript
import { unillm } from "@aid-on/unillm";
import type { Stream } from "@aid-on/nagare";

const stream: Stream<string> = await unillm()
  .model("groq:llama-3.3-70b-versatile")
  .credentials({ groqApiKey: "..." })
  .stream("Write a story about AI");
// Use nagare's reactive operators
const enhanced = stream
  .map(chunk => chunk.trim())
  .filter(chunk => chunk.length > 0)
  .throttle(16) // ~60fps for UI updates
  .tap(chunk => console.log(chunk))
  .toSSE(); // Convert to Server-Sent Events
```
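To serve that SSE stream over HTTP, returning it from a fetch-style handler should work; this is a sketch that assumes `toSSE()` yields a standard web `ReadableStream` a `Response` can consume (an assumption, not verified against the nagare API):

```typescript
import { unillm } from "@aid-on/unillm";

// Sketch only: assumes toSSE() returns a web ReadableStream.
export async function handleChat(request: Request): Promise<Response> {
  const stream = await unillm()
    .model("groq:llama-3.3-70b-versatile")
    .credentials({ groqApiKey: "..." })
    .stream("Write a story about AI");

  return new Response(stream.toSSE(), {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache"
    }
  });
}
```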
## Structured Output

Generate type-safe structured data with Zod schemas:
```typescript
import { z } from "zod";
import { unillm } from "@aid-on/unillm";

const PersonSchema = z.object({
  name: z.string(),
  age: z.number(),
  skills: z.array(z.string())
});
const result = await unillm()
  .model("groq:llama-3.1-8b-instant")
  .credentials({ groqApiKey: "..." })
  .schema(PersonSchema)
  .generate("Generate a software engineer profile");
// Type-safe access
console.log(result.object.name); // string
console.log(result.object.skills); // string[]
```
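Because schemas are ordinary Zod objects, nesting and enums work the same way; a small illustrative sketch:

```typescript
import { z } from "zod";
import { unillm } from "@aid-on/unillm";

// Nested schema: validation of the whole tree is delegated to Zod.
const TeamSchema = z.object({
  name: z.string(),
  members: z.array(z.object({
    name: z.string(),
    role: z.enum(["engineer", "designer", "manager"])
  }))
});

const team = await unillm()
  .model("groq:llama-3.1-8b-instant")
  .credentials({ groqApiKey: "..." })
  .schema(TeamSchema)
  .generate("Generate a five-person product team");

console.log(team.object.members[0].role); // "engineer" | "designer" | "manager"
```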
## Provider Shortcuts

Ultra-concise syntax for common models:
```typescript
import { anthropic, openai, groq, gemini, cloudflare } from "@aid-on/unillm";
// One-liners for quick prototyping
await anthropic.sonnet("sk-ant-...").generate("Hello");
await openai.mini("sk-...").generate("Hello");
await groq.instant("gsk_...").generate("Hello");
await gemini.flash("AIza...").generate("Hello");
await cloudflare.llama({ accountId: "...", apiToken: "..." }).generate("Hello");
```
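Each shortcut is presumably sugar for the full builder form; for instance, these two calls should be equivalent (the `groq.instant` to `groq:llama-3.1-8b-instant` mapping is a guess based on the model list below, not confirmed by the library):

```typescript
import { groq, unillm } from "@aid-on/unillm";

// Shortcut form.
const quick = await groq.instant("gsk_...").generate("Hello");

// Assumed equivalent builder form (mapping is an assumption).
const explicit = await unillm()
  .model("groq:llama-3.1-8b-instant")
  .credentials({ groqApiKey: "gsk_..." })
  .generate("Hello");
```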
## Supported Models

### Anthropic

- anthropic:claude-opus-4-5-20251101 - Claude Opus 4.5 (Most Intelligent)
- anthropic:claude-haiku-4-5-20251001 - Claude Haiku 4.5 (Ultra Fast)
- anthropic:claude-sonnet-4-5-20250929 - Claude Sonnet 4.5 (Best for Coding)
- anthropic:claude-opus-4-1-20250805 - Claude Opus 4.1
- anthropic:claude-opus-4-20250514 - Claude Opus 4
- anthropic:claude-sonnet-4-20250514 - Claude Sonnet 4
- anthropic:claude-3-5-haiku-20241022 - Claude 3.5 Haiku
- anthropic:claude-3-haiku-20240307 - Claude 3 Haiku

### OpenAI
- openai:gpt-4o - GPT-4o (Latest, fastest GPT-4)
- openai:gpt-4o-mini - GPT-4o Mini (Cost-effective)
- openai:gpt-4o-2024-11-20 - GPT-4o November snapshot
- openai:gpt-4o-2024-08-06 - GPT-4o August snapshot
- openai:gpt-4-turbo - GPT-4 Turbo (High capability)
- openai:gpt-4-turbo-preview - GPT-4 Turbo Preview
- openai:gpt-4 - GPT-4 (Original)
- openai:gpt-3.5-turbo - GPT-3.5 Turbo (Fast & cheap)
- openai:gpt-3.5-turbo-0125 - GPT-3.5 Turbo (latest snapshot)

### Groq
- groq:llama-3.3-70b-versatile - Llama 3.3 70B Versatile
- groq:llama-3.1-8b-instant - Llama 3.1 8B Instant
- groq:meta-llama/llama-guard-4-12b - Llama Guard 4 12B
- groq:openai/gpt-oss-120b - GPT-OSS 120B
- groq:openai/gpt-oss-20b - GPT-OSS 20B
- groq:groq/compound - Groq Compound
- groq:groq/compound-mini - Groq Compound Mini

### Gemini
- gemini:gemini-3-pro-preview - Gemini 3 Pro Preview
- gemini:gemini-3-flash-preview - Gemini 3 Flash Preview
- gemini:gemini-2.5-pro - Gemini 2.5 Pro
- gemini:gemini-2.5-flash - Gemini 2.5 Flash
- gemini:gemini-2.0-flash - Gemini 2.0 Flash
- gemini:gemini-2.0-flash-lite - Gemini 2.0 Flash Lite
- gemini:gemini-1.5-pro-002 - Gemini 1.5 Pro 002
- gemini:gemini-1.5-flash-002 - Gemini 1.5 Flash 002

### Cloudflare
- cloudflare:@cf/meta/llama-4-scout-17b-16e-instruct - Llama 4 Scout
- cloudflare:@cf/meta/llama-3.3-70b-instruct-fp8-fast - Llama 3.3 70B FP8
- cloudflare:@cf/meta/llama-3.1-70b-instruct - Llama 3.1 70B
- cloudflare:@cf/meta/llama-3.1-8b-instruct-fast - Llama 3.1 8B Fast
- cloudflare:@cf/meta/llama-3.1-8b-instruct - Llama 3.1 8B
- cloudflare:@cf/openai/gpt-oss-120b - GPT-OSS 120B
- cloudflare:@cf/openai/gpt-oss-20b - GPT-OSS 20B
- cloudflare:@cf/ibm/granite-4.0-h-micro - IBM Granite 4.0
- cloudflare:@cf/mistralai/mistral-small-3.1-24b-instruct - Mistral Small 3.1
- cloudflare:@cf/mistralai/mistral-7b-instruct-v0.2 - Mistral 7B
- cloudflare:@cf/google/gemma-3-12b-it - Gemma 3 12B
- cloudflare:@cf/qwen/qwq-32b - QwQ 32B
- cloudflare:@cf/qwen/qwen2.5-coder-32b-instruct - Qwen 2.5 Coder
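Because model IDs are plain `provider:model` strings, swapping providers is a one-line change. The fallback helper below is illustrative application code, not part of the library; it reuses only the builder calls shown above:

```typescript
import { unillm } from "@aid-on/unillm";

// Illustrative only: try a fast Groq model first, fall back to
// Cloudflare Workers AI if the call throws (e.g. rate limits).
async function generateWithFallback(prompt: string) {
  try {
    return await unillm()
      .model("groq:llama-3.1-8b-instant")
      .credentials({ groqApiKey: process.env.GROQ_API_KEY })
      .generate(prompt);
  } catch {
    return await unillm()
      .model("cloudflare:@cf/meta/llama-3.1-8b-instruct")
      .credentials({
        accountId: process.env.CF_ACCOUNT_ID,
        apiToken: process.env.CF_API_TOKEN
      })
      .generate(prompt);
  }
}
```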
## Advanced Usage

### Builder Pattern
```typescript
import { unillm } from "@aid-on/unillm";

const builder = unillm()
  .model("groq:llama-3.3-70b-versatile")
  .credentials({ groqApiKey: "..." })
  .temperature(0.7)
  .maxTokens(1000)
  .topP(0.9)
  .system("You are a helpful assistant")
  .messages([
    { role: "user", content: "Previous question..." },
    { role: "assistant", content: "Previous answer..." }
  ]);

// Reusable configuration
const response1 = await builder.generate("New question");
const response2 = await builder.stream("Another question");
```

### Memory Optimization
Automatic memory management for edge environments:
```typescript
import { createMemoryOptimizedStream } from "@aid-on/unillm";

// largeResponse: an upstream response stream (placeholder here)
const stream = await createMemoryOptimizedStream(
  largeResponse,
  {
    maxMemory: 1024 * 1024, // 1MB limit
    chunkSize: 512 // Optimal chunk size
  }
);
```
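The optimized stream can then back a streaming HTTP response; a minimal sketch, assuming `createMemoryOptimizedStream` accepts and returns a standard web `ReadableStream` (both assumptions):

```typescript
import { createMemoryOptimizedStream } from "@aid-on/unillm";

// Sketch: serve the memory-optimized stream from a fetch handler.
// Assumes the return value is a standard ReadableStream.
async function serveLargeResponse(largeResponse: ReadableStream): Promise<Response> {
  const stream = await createMemoryOptimizedStream(largeResponse, {
    maxMemory: 1024 * 1024,
    chunkSize: 512
  });
  return new Response(stream, {
    headers: { "Content-Type": "text/plain; charset=utf-8" }
  });
}
```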
### Error Handling

```typescript
import { unillm, UnillmError, RateLimitError } from "@aid-on/unillm";

try {
  const response = await unillm()
    .model("groq:llama-3.3-70b-versatile")
    .credentials({ groqApiKey: "..." })
    .generate("Hello");
} catch (error) {
  if (error instanceof RateLimitError) {
    console.log(`Rate limited. Retry after ${error.retryAfter}ms`);
  } else if (error instanceof UnillmError) {
    console.log(`LLM error: ${error.message}`);
  }
}
```
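`retryAfter` makes a simple retry wrapper easy to build; the helper below is illustrative application code, not a library export:

```typescript
import { unillm, RateLimitError } from "@aid-on/unillm";

// Illustrative retry-once wrapper around the error types above.
async function generateWithRetry(prompt: string): Promise<string> {
  const call = () =>
    unillm()
      .model("groq:llama-3.3-70b-versatile")
      .credentials({ groqApiKey: "..." })
      .generate(prompt);

  try {
    return (await call()).text;
  } catch (error) {
    if (error instanceof RateLimitError) {
      // Wait the interval the provider requested, then retry once.
      await new Promise(resolve => setTimeout(resolve, error.retryAfter));
      return (await call()).text;
    }
    throw error;
  }
}
```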
## Integration Examples

### React
```tsx
import { useState } from "react";
import { unillm } from "@aid-on/unillm";

export default function ChatComponent() {
  const [response, setResponse] = useState("");
  const [loading, setLoading] = useState(false);

  const handleGenerate = async () => {
    setLoading(true);
    // Note: calling the provider from the browser exposes the key;
    // prefer a server route in production.
    const stream = await unillm()
      .model("groq:llama-3.1-8b-instant")
      .credentials({ groqApiKey: import.meta.env.VITE_GROQ_API_KEY })
      .stream("Write a haiku");
    for await (const chunk of stream) {
      setResponse(prev => prev + chunk);
    }
    setLoading(false);
  };

  return (
    <div>
      <button onClick={handleGenerate} disabled={loading}>
        Generate
      </button>
      <p>{response}</p>
    </div>
  );
}
```

### Cloudflare Workers
```typescript
import { unillm } from "@aid-on/unillm";

export default {
  async fetch(request: Request, env: Env) {
    const stream = await unillm()
      .model("cloudflare:@cf/meta/llama-3.1-8b-instruct")
      .credentials({
        accountId: env.CF_ACCOUNT_ID,
        apiToken: env.CF_API_TOKEN
      })
      .stream("Hello from the edge!");

    return new Response(stream.toReadableStream(), {
      headers: { "Content-Type": "text/event-stream" }
    });
  }
};
```

## API Reference
### Builder Methods
| Method | Description | Example |
|--------|-------------|---------|
| `model(id)` | Set the model ID | `model("groq:llama-3.3-70b-versatile")` |
| `credentials(creds)` | Set API credentials | `credentials({ groqApiKey: "..." })` |
| `temperature(n)` | Set temperature (0-1) | `temperature(0.7)` |
| `maxTokens(n)` | Set max tokens | `maxTokens(1000)` |
| `topP(n)` | Set top-p sampling | `topP(0.9)` |
| `schema(zod)` | Set output schema | `schema(PersonSchema)` |
| `system(text)` | Set system prompt | `system("You are...")` |
| `messages(msgs)` | Set message history | `messages([...])` |
| `generate(prompt)` | Generate a response | `await generate("Hello")` |
| `stream(prompt)` | Stream a response | `await stream("Hello")` |

## License

MIT