# @aid-on/unillm

Edge-native unified LLM provider: pure fetch API, minimal dependencies (Zod only), Web Streams, and memory optimization for Cloudflare Workers and other edge computing environments.


unillm is a unified LLM interface for edge computing. It provides a consistent, type-safe API across multiple LLM providers with minimal dependencies and optimized memory usage for edge environments.
## Features
- 🚀 Edge-First: ~50KB bundle size, ~10ms cold start, optimized for edge runtimes
- 🔄 Unified Interface: Single API for Anthropic, OpenAI, Groq, Gemini, Cloudflare, and more
- 🌊 Streaming Native: Built on Web Streams API with nagare integration
- 🎯 Type-Safe: Full TypeScript support with Zod schema validation
- 📦 Minimal Dependencies: Only Zod (~11KB) required
- ⚡ Memory Optimized: Automatic chunking and backpressure handling
## Installation

```bash
npm install @aid-on/unillm
```

```bash
yarn add @aid-on/unillm
```

```bash
pnpm add @aid-on/unillm
```
## Quick Start

```typescript
import { unillm } from "@aid-on/unillm";
// Fluent API with type safety
const response = await unillm()
  .model("openai:gpt-4o-mini")
  .credentials({ openaiApiKey: process.env.OPENAI_API_KEY })
  .temperature(0.7)
  .generate("Explain quantum computing in simple terms");
console.log(response.text);
```
## Streaming

unillm returns an `@aid-on/nagare` `Stream` for reactive stream processing:
```typescript
import { unillm } from "@aid-on/unillm";
import type { Stream } from "@aid-on/nagare";

const stream: Stream<string> = await unillm()
  .model("groq:llama-3.3-70b-versatile")
  .credentials({ groqApiKey: "..." })
  .stream("Write a story about AI");
// Use nagare's reactive operators
const enhanced = stream
  .map(chunk => chunk.trim())
  .filter(chunk => chunk.length > 0)
  .throttle(16) // ~60fps for UI updates
  .tap(chunk => console.log(chunk))
  .toSSE(); // Convert to Server-Sent Events
```
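To serve that SSE stream over HTTP, returning it from a fetch-style handler should work; this is a sketch that assumes `toSSE()` yields a standard web `ReadableStream` a `Response` can consume (an assumption, not verified against the nagare API):

```typescript
import { unillm } from "@aid-on/unillm";

// Sketch only: assumes toSSE() returns a web ReadableStream.
export async function handleChat(request: Request): Promise<Response> {
  const stream = await unillm()
    .model("groq:llama-3.3-70b-versatile")
    .credentials({ groqApiKey: "..." })
    .stream("Write a story about AI");

  return new Response(stream.toSSE(), {
    headers: {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache"
    }
  });
}
```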
## Structured Output

Generate type-safe structured data with Zod schemas:
```typescript
import { z } from "zod";
import { unillm } from "@aid-on/unillm";

const PersonSchema = z.object({
  name: z.string(),
  age: z.number(),
  skills: z.array(z.string())
});
const result = await unillm()
  .model("groq:llama-3.1-8b-instant")
  .credentials({ groqApiKey: "..." })
  .schema(PersonSchema)
  .generate("Generate a software engineer profile");
// Type-safe access
console.log(result.object.name); // string
console.log(result.object.skills); // string[]
```
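Because schemas are ordinary Zod objects, nesting and enums work the same way; a small illustrative sketch:

```typescript
import { z } from "zod";
import { unillm } from "@aid-on/unillm";

// Nested schema: validation of the whole tree is delegated to Zod.
const TeamSchema = z.object({
  name: z.string(),
  members: z.array(z.object({
    name: z.string(),
    role: z.enum(["engineer", "designer", "manager"])
  }))
});

const team = await unillm()
  .model("groq:llama-3.1-8b-instant")
  .credentials({ groqApiKey: "..." })
  .schema(TeamSchema)
  .generate("Generate a five-person product team");

console.log(team.object.members[0].role); // "engineer" | "designer" | "manager"
```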
## Provider Shortcuts

Ultra-concise syntax for common models:
```typescript
import { anthropic, openai, groq, gemini, cloudflare } from "@aid-on/unillm";
// One-liners for quick prototyping
await anthropic.sonnet("sk-ant-...").generate("Hello");
await openai.mini("sk-...").generate("Hello");
await groq.instant("gsk_...").generate("Hello");
await gemini.flash("AIza...").generate("Hello");
await cloudflare.llama({ accountId: "...", apiToken: "..." }).generate("Hello");
```
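Each shortcut is presumably sugar for the full builder form; for instance, these two calls should be equivalent (the `groq.instant` to `groq:llama-3.1-8b-instant` mapping is a guess based on the model list below, not confirmed by the library):

```typescript
import { groq, unillm } from "@aid-on/unillm";

// Shortcut form.
const quick = await groq.instant("gsk_...").generate("Hello");

// Assumed equivalent builder form (mapping is an assumption).
const explicit = await unillm()
  .model("groq:llama-3.1-8b-instant")
  .credentials({ groqApiKey: "gsk_..." })
  .generate("Hello");
```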
## Supported Models

### Anthropic

- anthropic:claude-opus-4-5-20251101 - Claude Opus 4.5 (Most Intelligent)
- anthropic:claude-haiku-4-5-20251001 - Claude Haiku 4.5 (Ultra Fast)
- anthropic:claude-sonnet-4-5-20250929 - Claude Sonnet 4.5 (Best for Coding)
- anthropic:claude-opus-4-1-20250805 - Claude Opus 4.1
- anthropic:claude-opus-4-20250514 - Claude Opus 4
- anthropic:claude-sonnet-4-20250514 - Claude Sonnet 4
- anthropic:claude-3-5-haiku-20241022 - Claude 3.5 Haiku
- anthropic:claude-3-haiku-20240307 - Claude 3 Haiku

### OpenAI
- openai:gpt-4o - GPT-4o (Latest, fastest GPT-4)
- openai:gpt-4o-mini - GPT-4o Mini (Cost-effective)
- openai:gpt-4o-2024-11-20 - GPT-4o November snapshot
- openai:gpt-4o-2024-08-06 - GPT-4o August snapshot
- openai:gpt-4-turbo - GPT-4 Turbo (High capability)
- openai:gpt-4-turbo-preview - GPT-4 Turbo Preview
- openai:gpt-4 - GPT-4 (Original)
- openai:gpt-3.5-turbo - GPT-3.5 Turbo (Fast & cheap)
- openai:gpt-3.5-turbo-0125 - GPT-3.5 Turbo (latest snapshot)

### Groq
- groq:llama-3.3-70b-versatile - Llama 3.3 70B Versatile
- groq:llama-3.1-8b-instant - Llama 3.1 8B Instant
- groq:meta-llama/llama-guard-4-12b - Llama Guard 4 12B
- groq:openai/gpt-oss-120b - GPT-OSS 120B
- groq:openai/gpt-oss-20b - GPT-OSS 20B
- groq:groq/compound - Groq Compound
- groq:groq/compound-mini - Groq Compound Mini

### Gemini
- gemini:gemini-3-pro-preview - Gemini 3 Pro Preview
- gemini:gemini-3-flash-preview - Gemini 3 Flash Preview
- gemini:gemini-2.5-pro - Gemini 2.5 Pro
- gemini:gemini-2.5-flash - Gemini 2.5 Flash
- gemini:gemini-2.0-flash - Gemini 2.0 Flash
- gemini:gemini-2.0-flash-lite - Gemini 2.0 Flash Lite
- gemini:gemini-1.5-pro-002 - Gemini 1.5 Pro 002
- gemini:gemini-1.5-flash-002 - Gemini 1.5 Flash 002

### Cloudflare
- cloudflare:@cf/meta/llama-4-scout-17b-16e-instruct - Llama 4 Scout
- cloudflare:@cf/meta/llama-3.3-70b-instruct-fp8-fast - Llama 3.3 70B FP8
- cloudflare:@cf/meta/llama-3.1-70b-instruct - Llama 3.1 70B
- cloudflare:@cf/meta/llama-3.1-8b-instruct-fast - Llama 3.1 8B Fast
- cloudflare:@cf/meta/llama-3.1-8b-instruct - Llama 3.1 8B
- cloudflare:@cf/openai/gpt-oss-120b - GPT-OSS 120B
- cloudflare:@cf/openai/gpt-oss-20b - GPT-OSS 20B
- cloudflare:@cf/ibm/granite-4.0-h-micro - IBM Granite 4.0
- cloudflare:@cf/mistralai/mistral-small-3.1-24b-instruct - Mistral Small 3.1
- cloudflare:@cf/mistralai/mistral-7b-instruct-v0.2 - Mistral 7B
- cloudflare:@cf/google/gemma-3-12b-it - Gemma 3 12B
- cloudflare:@cf/qwen/qwq-32b - QwQ 32B
- cloudflare:@cf/qwen/qwen2.5-coder-32b-instruct - Qwen 2.5 Coder
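Because model IDs are plain `provider:model` strings, swapping providers is a one-line change. The fallback helper below is illustrative application code, not part of the library; it reuses only the builder calls shown above:

```typescript
import { unillm } from "@aid-on/unillm";

// Illustrative only: try a fast Groq model first, fall back to
// Cloudflare Workers AI if the call throws (e.g. rate limits).
async function generateWithFallback(prompt: string) {
  try {
    return await unillm()
      .model("groq:llama-3.1-8b-instant")
      .credentials({ groqApiKey: process.env.GROQ_API_KEY })
      .generate(prompt);
  } catch {
    return await unillm()
      .model("cloudflare:@cf/meta/llama-3.1-8b-instruct")
      .credentials({
        accountId: process.env.CF_ACCOUNT_ID,
        apiToken: process.env.CF_API_TOKEN
      })
      .generate(prompt);
  }
}
```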
## Advanced Usage

### Builder Pattern
```typescript
import { unillm } from "@aid-on/unillm";

const builder = unillm()
  .model("groq:llama-3.3-70b-versatile")
  .credentials({ groqApiKey: "..." })
  .temperature(0.7)
  .maxTokens(1000)
  .topP(0.9)
  .system("You are a helpful assistant")
  .messages([
    { role: "user", content: "Previous question..." },
    { role: "assistant", content: "Previous answer..." }
  ]);

// Reusable configuration
const response1 = await builder.generate("New question");
const response2 = await builder.stream("Another question");
```

### Memory Optimization
Automatic memory management for edge environments:
```typescript
import { createMemoryOptimizedStream } from "@aid-on/unillm";

// largeResponse: an upstream response stream (placeholder here)
const stream = await createMemoryOptimizedStream(
  largeResponse,
  {
    maxMemory: 1024 * 1024, // 1MB limit
    chunkSize: 512 // Optimal chunk size
  }
);
```
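The optimized stream can then back a streaming HTTP response; a minimal sketch, assuming `createMemoryOptimizedStream` accepts and returns a standard web `ReadableStream` (both assumptions):

```typescript
import { createMemoryOptimizedStream } from "@aid-on/unillm";

// Sketch: serve the memory-optimized stream from a fetch handler.
// Assumes the return value is a standard ReadableStream.
async function serveLargeResponse(largeResponse: ReadableStream): Promise<Response> {
  const stream = await createMemoryOptimizedStream(largeResponse, {
    maxMemory: 1024 * 1024,
    chunkSize: 512
  });
  return new Response(stream, {
    headers: { "Content-Type": "text/plain; charset=utf-8" }
  });
}
```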
### Error Handling

```typescript
import { unillm, UnillmError, RateLimitError } from "@aid-on/unillm";

try {
  const response = await unillm()
    .model("groq:llama-3.3-70b-versatile")
    .credentials({ groqApiKey: "..." })
    .generate("Hello");
} catch (error) {
  if (error instanceof RateLimitError) {
    console.log(`Rate limited. Retry after ${error.retryAfter}ms`);
  } else if (error instanceof UnillmError) {
    console.log(`LLM error: ${error.message}`);
  }
}
```
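`retryAfter` makes a simple retry wrapper easy to build; the helper below is illustrative application code, not a library export:

```typescript
import { unillm, RateLimitError } from "@aid-on/unillm";

// Illustrative retry-once wrapper around the error types above.
async function generateWithRetry(prompt: string): Promise<string> {
  const call = () =>
    unillm()
      .model("groq:llama-3.3-70b-versatile")
      .credentials({ groqApiKey: "..." })
      .generate(prompt);

  try {
    return (await call()).text;
  } catch (error) {
    if (error instanceof RateLimitError) {
      // Wait the interval the provider requested, then retry once.
      await new Promise(resolve => setTimeout(resolve, error.retryAfter));
      return (await call()).text;
    }
    throw error;
  }
}
```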
## Integration Examples

### React
```tsx
import { useState } from "react";
import { unillm } from "@aid-on/unillm";

export default function ChatComponent() {
  const [response, setResponse] = useState("");
  const [loading, setLoading] = useState(false);

  const handleGenerate = async () => {
    setLoading(true);
    // Note: calling the provider from the browser exposes the key;
    // prefer a server route in production.
    const stream = await unillm()
      .model("groq:llama-3.1-8b-instant")
      .credentials({ groqApiKey: import.meta.env.VITE_GROQ_API_KEY })
      .stream("Write a haiku");
    for await (const chunk of stream) {
      setResponse(prev => prev + chunk);
    }
    setLoading(false);
  };

  return (
    <div>
      <button onClick={handleGenerate} disabled={loading}>
        Generate
      </button>
      <p>{response}</p>
    </div>
  );
}
```

### Cloudflare Workers
```typescript
import { unillm } from "@aid-on/unillm";

export default {
  async fetch(request: Request, env: Env) {
    const stream = await unillm()
      .model("cloudflare:@cf/meta/llama-3.1-8b-instruct")
      .credentials({
        accountId: env.CF_ACCOUNT_ID,
        apiToken: env.CF_API_TOKEN
      })
      .stream("Hello from the edge!");

    return new Response(stream.toReadableStream(), {
      headers: { "Content-Type": "text/event-stream" }
    });
  }
};
```

## API Reference
### Builder Methods
| Method | Description | Example |
|--------|-------------|---------|
| `model(id)` | Set the model ID | `model("groq:llama-3.3-70b-versatile")` |
| `credentials(creds)` | Set API credentials | `credentials({ groqApiKey: "..." })` |
| `temperature(n)` | Set temperature (0-1) | `temperature(0.7)` |
| `maxTokens(n)` | Set max tokens | `maxTokens(1000)` |
| `topP(n)` | Set top-p sampling | `topP(0.9)` |
| `schema(zod)` | Set output schema | `schema(PersonSchema)` |
| `system(text)` | Set system prompt | `system("You are...")` |
| `messages(msgs)` | Set message history | `messages([...])` |
| `generate(prompt)` | Generate a response | `await generate("Hello")` |
| `stream(prompt)` | Stream a response | `await stream("Hello")` |

## License

MIT