A generic, type-safe wrapper around the OpenAI API. It abstracts away the boilerplate (parsing, retries, caching, logging) while allowing raw access when needed.
Designed for power users who need to switch between simple string prompts and complex, resilient agentic workflows.
```bash
npm install llm-fns
npm install openai zod cache-manager p-queue ajv
```
The createLlm factory bundles all functionality (Basic, Retry, Zod) into a single client.
```typescript
import OpenAI from 'openai';
import { createLlm } from './src';
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const llm = createLlm({
openai,
defaultModel: 'google/gemini-3-pro-preview',
// optional:
// cache: Cache instance (cache-manager)
// queue: PQueue instance for concurrency control
// maxConversationChars: number (auto-truncation)
// defaultRequestOptions: { headers, timeout, signal }
});
```
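To enable caching and request throttling, pass instances via the optional cache and queue fields. A minimal sketch, assuming cache-manager v5's caching() helper and p-queue's default export (the model name and numeric values are placeholders):
```typescript
import OpenAI from 'openai';
import PQueue from 'p-queue';
import { caching } from 'cache-manager';
import { createLlm } from './src';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// In-memory cache: identical prompts are answered without a second API call
const cache = await caching('memory', { max: 500, ttl: 60 * 60 * 1000 });

// Allow at most 2 concurrent requests against the provider
const queue = new PQueue({ concurrency: 2 });

const llm = createLlm({
  openai,
  defaultModel: 'gpt-4o-mini',
  cache,
  queue,
  maxConversationChars: 100_000,              // auto-truncate long histories
  defaultRequestOptions: { timeout: 30_000 }, // applied to every request
});
```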
---
The defaultModel parameter accepts either a simple string or a configuration object that bundles the model name with default parameters like temperature or reasoning_effort.
```typescript
// Create a "creative" client with high temperature
const creativeWriter = createLlm({
openai,
defaultModel: {
model: 'gpt-4o',
temperature: 1.2,
frequency_penalty: 0.5
}
});
// All calls will use these defaults
await creativeWriter.promptText("Write a poem about the ocean");
// Override for a specific call
await creativeWriter.promptText("Summarize this document", {
model: { temperature: 0.2 } // Override just temperature, keeps model
});
```
For models that support extended thinking (like o1, o3, or Claude with thinking), use reasoning_effort or model-specific parameters:
```typescript
// Create a "deep thinker" client for complex reasoning tasks
const reasoner = createLlm({
openai,
defaultModel: {
model: 'o3',
reasoning_effort: 'high' // 'low' | 'medium' | 'high'
}
});
// All calls will use extended thinking
const analysis = await reasoner.promptText("Analyze this complex problem...");
// Create a fast reasoning client for simpler tasks
const quickReasoner = createLlm({
openai,
defaultModel: {
model: 'o3-mini',
reasoning_effort: 'low'
}
});
```
A common pattern is to create multiple clients with different presets:
```typescript
// Deterministic client for structured data extraction
const extractorLlm = createLlm({
openai,
defaultModel: {
model: 'gpt-4o-mini',
temperature: 0
}
});
// Creative client for content generation
const writerLlm = createLlm({
openai,
defaultModel: {
model: 'gpt-4o',
temperature: 1.0,
top_p: 0.95
}
});
// Reasoning client for complex analysis
const analyzerLlm = createLlm({
openai,
defaultModel: {
model: 'o3',
reasoning_effort: 'medium'
}
});
// Use the appropriate client for each task
const data = await extractorLlm.promptZod(DataSchema);
const story = await writerLlm.promptText("Write a short story");
const solution = await analyzerLlm.promptText("Solve this logic puzzle...");
```
Any preset can be overridden on individual calls:
```typescript
const llm = createLlm({
openai,
defaultModel: {
model: 'gpt-4o',
temperature: 0.7
}
});
// Use defaults
await llm.promptText("Hello");
// Override model entirely
await llm.promptText("Complex task", {
model: {
model: 'o3',
reasoning_effort: 'high'
}
});
// Override just temperature (keeps default model)
await llm.promptText("Be more creative", {
temperature: 1.5
});
// Or use short form to switch models
await llm.promptText("Quick task", {
model: 'gpt-4o-mini'
});
```
---
Use Case 1: Text (llm.prompt / llm.promptText)
Level 1: Simple Text (llm.promptText)
Use promptText when you just want the answer as a string.
Return Type: Promise<string>
```typescript
// 1. Simple User Question
const ans1 = await llm.promptText("Why is the sky blue?");

// 2. System Instruction + User Question
const ans2 = await llm.promptText("You are a poet", "Describe the sea");
// 3. Conversation History (Chat Bots)
const ans3 = await llm.promptText([
{ role: "user", content: "Hi" },
{ role: "assistant", content: "Ho" }
]);
```
Level 2: Full Response (llm.prompt)
Use prompt when you need the Full OpenAI Response (usage, id, choices, finish_reason) but want to use the Simple Inputs from Level 1.
Return Type: Promise<ChatCompletion>
```typescript
// Shortcut A: Single String -> User Message
const res1 = await llm.prompt("Why is the sky blue?");
console.log(res1.usage?.total_tokens); // Access generic OpenAI properties

// Shortcut B: Two Strings -> System + User
const res2 = await llm.prompt(
"You are a SQL Expert.", // System
"Write a query for users." // User
);
```
Level 3: Full Control (Config Object)
Use the Config Object overload for absolute control. This allows you to mix Standard OpenAI flags with Library flags.
Input Type: LlmPromptOptions
```typescript
const abortController = new AbortController();

const res = await llm.prompt({
// Standard OpenAI params
messages: [{ role: "user", content: "Hello" }],
temperature: 1.5,
frequency_penalty: 0.2,
max_tokens: 100,
// Library Extensions
model: "gpt-4o", // Override default model for this call
retries: 5, // Retry network errors 5 times
// Request-level options (headers, timeout, abort signal)
requestOptions: {
headers: { 'X-Cache-Salt': 'v2' }, // Affects cache key
timeout: 60000,
signal: abortController.signal
}
});
```
---
Use Case 2: Images (llm.promptImage)
Generates an image and returns it as a Buffer. This handles fetching the URL or Base64 decoding automatically.
Return Type: Promise<Buffer>
```typescript
// 1. Simple Generation
const buffer1 = await llm.promptImage("A cyberpunk cat");

// 2. Advanced Configuration (Model & Aspect Ratio)
const buffer2 = await llm.promptImage({
messages: "A cyberpunk cat",
model: "dall-e-3", // Override default model
size: "1024x1024", // OpenAI specific params pass through
quality: "hd"
});
// fs.writeFileSync('cat.png', buffer2);
```
---
Use Case 3: Structured Data (llm.promptJson & llm.promptZod)
This is a high-level wrapper that employs a Re-asking Loop. If the LLM outputs invalid JSON or data that fails the schema validation, the client automatically feeds the error back to the LLM and asks it to fix it (up to maxRetries).
Return Type: Promise<T>
promptJson (JSON Schema)
Use this if you have a standard JSON Schema object (e.g. from another library or API) and don't want to use Zod. It uses AJV internally to validate the response against the schema.
```typescript
const MySchema = {
type: "object",
properties: {
sentiment: { type: "string", enum: ["positive", "negative"] },
score: { type: "number" }
},
required: ["sentiment", "score"],
additionalProperties: false
};

const result = await llm.promptJson(
[{ role: "user", content: "I love this!" }],
MySchema
);
```
promptZod (Zod)
This is syntactic sugar over promptJson. It converts your Zod schema to JSON Schema and automatically sets up the validator to throw formatted Zod errors for the retry loop.
Return Type: Promise<z.infer<typeof Schema>>
```typescript
import { z } from 'zod';
const UserSchema = z.object({ name: z.string(), age: z.number() });

// 1. Schema Only (Hallucinate data)
const user = await llm.promptZod(UserSchema);
// 2. Extraction (Context + Schema)
const email = "Meeting at 2 PM with Bob.";
const event = await llm.promptZod(email, z.object({ time: z.string(), who: z.string() }));
// 3. Full Control (History + Schema + Options)
const history = [
{ role: "user", content: "I cast Fireball." },
{ role: "assistant", content: "It misses." }
];
const gameState = await llm.promptZod(
history, // Arg 1: Context
GameStateSchema, // Arg 2: Schema
{ // Arg 3: Options Override
model: "google/gemini-flash-1.5",
disableJsonFixer: true, // Turn off the automatic JSON repair agent
maxRetries: 0, // Fail immediately on error
}
);
```
Sanitizing Data Before Validation
Sometimes LLMs output data that is almost correct (e.g., strings instead of numbers). You can sanitize the data before validation runs.
```typescript
const result = await llm.promptZod(MySchema, {
// Transform JSON before validation runs
beforeValidation: (data) => {
if (data.price && typeof data.price === 'string') {
return { ...data, price: parseFloat(data.price) };
}
return data;
},
// Toggle usage of 'response_format: { type: "json_object" }'
useResponseFormat: false
});
```
Custom Validation with SchemaValidationError
You can throw SchemaValidationError inside Zod .transform() or .refine() to trigger the retry loop. This is useful for complex validation logic that can't be expressed in the schema itself.
```typescript
import { z } from 'zod';
import { SchemaValidationError } from './src';

const ProductSchema = z.object({
name: z.string(),
price: z.number(),
currency: z.string()
}).transform((data) => {
// Custom validation that triggers retry
if (data.price < 0) {
throw new SchemaValidationError(
`Price cannot be negative. Got: ${data.price}. Please provide a valid positive price.`
);
}
// Normalize currency
const validCurrencies = ['USD', 'EUR', 'GBP'];
if (!validCurrencies.includes(data.currency.toUpperCase())) {
throw new SchemaValidationError(
`Invalid currency "${data.currency}". Must be one of: ${validCurrencies.join(', ')}`
);
}
return {
...data,
currency: data.currency.toUpperCase()
};
});

// If the LLM returns { price: -10, ... }, the error message is sent back
// and the LLM gets another chance to fix it
const product = await llm.promptZod("Extract product info from: ...", ProductSchema);
```
Important: Only SchemaValidationError triggers the retry loop. Other errors (like TypeError, database errors, etc.) will bubble up immediately without retry. This prevents infinite loops when there's a bug in your transform logic.
```typescript
const SafeSchema = z.object({
userId: z.string()
}).transform(async (data) => {
// This error WILL trigger retry (user can fix the input)
if (!data.userId.match(/^[a-z0-9]+$/)) {
throw new SchemaValidationError(
`Invalid userId format "${data.userId}". Must be lowercase alphanumeric.`
);
}
// This error will NOT trigger retry (it's a system error)
const user = await db.findUser(data.userId);
if (!user) {
throw new Error(`User not found: ${data.userId}`); // Bubbles up immediately
}
return { ...data, user };
});
```
---
Use Case 4: Agentic Retry Loops (llm.promptTextRetry)
The library exposes the "Conversational Retry" engine used internally by promptZod. You can provide a validate function. If it throws a LlmRetryError, the error message is fed back to the LLM, and it tries again.
Return Type: Promise<string> (or a generic Promise<T> when validate returns a transformed value)
```typescript
import { LlmRetryError } from './src';

const poem = await llm.promptTextRetry({
messages: "Write a haiku about coding.",
maxRetries: 3,
validate: async (text, info) => {
// 'info' contains history and attempt number
// info: { attemptNumber: number, conversation: [...], mode: 'main'|'fallback' }
if (!text.toLowerCase().includes("bug")) {
// This message goes back to the LLM:
// User: "Please include the word 'bug'."
throw new LlmRetryError("Please include the word 'bug'.", 'CUSTOM_ERROR');
}
return text;
}
});
```
---
Use Case 5: Iterative Refinement (createIterativeRefiner)
For complex tasks where an LLM needs to "try, check, and fix" its own output (like code generation or complex configuration), use the IterativeRefiner. It implements a loop of: Generate -> Evaluate -> Refine.
```typescript
import { createIterativeRefiner } from './src';

const refiner = createIterativeRefiner({
// 1. Generate a configuration based on input and history
generate: async (input, history) => {
// history contains previous attempts and feedback
// You can construct a prompt that includes this history
const messages = [
{ role: 'system', content: 'Generate a valid SQL query.' },
{ role: 'user', content: input }
];
// Append history to help the LLM learn from mistakes
// Note: The refiner manages history automatically, but you must include it in your prompt
return await llm.promptZod([...messages, ...history], SqlSchema);
},
// 2. Evaluate the result (and optionally execute it)
evaluate: async (input, generated) => {
try {
// Example: Execute the query to see if it works
const results = await db.query(generated.sql);
if (results.length === 0) {
return { success: false, feedback: "Query returned no results. Try relaxing constraints." };
}
return { success: true };
} catch (err) {
return { success: false, feedback: `SQL Error: ${err.message}` };
}
},
maxRetries: 5
});

const result = await refiner.run("Find active users in New York");
if (result.success) {
console.log("Success:", result.generated);
} else {
console.log("Failed after max retries");
console.log("Last Feedback:", result.feedback);
}
// Inspect the journey
console.log(`Iterations: ${result.iterations}`);
console.log(result.history);
console.log(result.evaluations); // Array of all evaluation results
```
---
Error Handling
The library provides a structured error hierarchy that preserves the full context of failures, whether they happen during a retry loop or cause an immediate crash.
Error Types
LlmRetryError
Thrown by your validation logic to signal that the current attempt failed but can be retried. The error message is sent back to the LLM as feedback.
```typescript
import { LlmRetryError } from './src';

throw new LlmRetryError(
"The response must include a title field.", // Message sent to LLM
'CUSTOM_ERROR', // Type: 'JSON_PARSE_ERROR' | 'CUSTOM_ERROR'
{ field: 'title' }, // Optional details
'{"name": "test"}' // Optional raw response
);
```
SchemaValidationError
A specialized error for schema validation failures. Use this in Zod transforms to trigger retries.
```typescript
import { SchemaValidationError } from './src';

throw new SchemaValidationError("Age must be a positive number");
```
LlmFatalError
Thrown for unrecoverable errors. This includes:
1. API Errors (401 Unauthorized, 403 Forbidden, Context Length Exceeded).
2. Runtime errors in your validation logic (e.g., TypeError, database connection failed).
3. Validation errors when maxRetries is 0 or disableJsonFixer is true.
Crucially, LlmFatalError wraps the original error and attaches the full conversation context and raw response (if available), so you can debug what the LLM generated that caused the crash.
```typescript
interface LlmFatalError {
message: string;
cause?: any; // The original error (e.g. ZodError, TypeError)
messages?: ChatCompletionMessageParam[]; // The full conversation history including retries
rawResponse?: string | null; // The raw text generated by the LLM before the crash
}
```
LlmRetryExhaustedError
Thrown when the maximum number of retries is reached. It contains the full chain of attempt errors, allowing you to trace the evolution of the conversation.
```typescript
interface LlmRetryExhaustedError {
message: string;
cause: LlmRetryAttemptError; // The last attempt error (with chain to previous)
}
```
LlmRetryAttemptError
Wraps a single failed attempt within the retry chain.
```typescript
interface LlmRetryAttemptError {
message: string;
mode: 'main' | 'fallback';
conversation: ChatCompletionMessageParam[];
attemptNumber: number;
error: Error;
rawResponse?: string | null;
cause?: LlmRetryAttemptError; // Previous attempt's error
}
```
Error Chain Structure
When retries are exhausted, the error chain looks like this:
```
LlmRetryExhaustedError
└── cause: LlmRetryAttemptError (Attempt 3)
    ├── error: LlmRetryError (the validation error)
    ├── conversation: [...] (full message history)
    ├── rawResponse: '{"age": "wrong3"}'
    └── cause: LlmRetryAttemptError (Attempt 2)
        ├── ...
```
Handling Errors
```typescript
import {
LlmRetryExhaustedError,
LlmFatalError
} from './src';

try {
const result = await llm.promptZod(MySchema);
} catch (error) {
if (error instanceof LlmRetryExhaustedError) {
console.log('All retries failed.');
// Access the last response
console.log('Last LLM response:', error.cause.rawResponse);
}
if (error instanceof LlmFatalError) {
console.log('Crash or Fatal API Error:', error.message);
console.log('Original Cause:', error.cause);
// You always have access to what the LLM said, even if your code crashed!
console.log('LLM Response that caused crash:', error.rawResponse);
console.log('Conversation History:', error.messages);
}
}
```
Extracting the Last Response
A common pattern is to extract the last LLM response from a failed operation:
```typescript
function getLastResponse(error: LlmRetryExhaustedError): string | null {
return error.cause?.rawResponse ?? null;
}

function getAllResponses(error: LlmRetryExhaustedError): string[] {
const responses: string[] = [];
let attempt = error.cause;
while (attempt) {
if (attempt.rawResponse) {
responses.unshift(attempt.rawResponse); // Add to front (chronological order)
}
attempt = attempt.cause as LlmRetryAttemptError | undefined;
}
return responses;
}
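
// A possible way to use these helpers (assumes LlmRetryExhaustedError is
// imported as in the "Handling Errors" example above):
try {
  await llm.promptZod(MySchema);
} catch (error) {
  if (error instanceof LlmRetryExhaustedError) {
    getAllResponses(error).forEach((raw, i) => {
      console.log(`Attempt ${i + 1} returned:`, raw);
    });
  }
}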
```
---
Use Case 6: Architecture & Composition
How to build the client manually to enable Fallback Chains and Smart Routing.
createLlmClient (Base Client)
This creates the underlying engine that generates Text and Images. It handles Caching and Queuing but not Zod or Retry Loops.
```typescript
import { createLlmClient } from './src';

// 1. Define a CHEAP model
const cheapClient = createLlmClient({
openai,
defaultModel: 'google/gemini-flash-1.5'
});
// 2. Define a STRONG model
const strongClient = createLlmClient({
openai,
defaultModel: 'google/gemini-3-pro-preview'
});
```
createZodLlmClient (Retry & Zod Layer)
This wraps a Base Client with the "Fixer" logic. You inject the prompt function you want it to use.
```typescript
import { createZodLlmClient } from './src';

// A standard Zod client using only the strong model
const zodClient = createZodLlmClient({
prompt: strongClient.prompt,
isPromptCached: strongClient.isPromptCached
});
```
Fallback Chains
Link two clients together. If the prompt function of the first client fails (retries exhausted, refusal, or unfixable JSON), it switches to the fallbackPrompt.
```typescript
const smartClient = createZodLlmClient({
// Primary Strategy: Try Cheap/Fast
prompt: cheapClient.prompt,
isPromptCached: cheapClient.isPromptCached,

// Fallback Strategy: Switch to Strong/Expensive
// This is triggered if the Primary Strategy exhausts its retries or validation fails
fallbackPrompt: strongClient.prompt,
});
// Usage acts exactly like the standard client
await smartClient.promptZod(MySchema);
```
---
Utilities: Message Conversion
completionToMessage converts a raw OpenAI ChatCompletion object into a valid ChatCompletionMessageParam (specifically an assistant message) that can be fed back into the conversation history. It handles:
- Standard text content
- Custom image attachments (OpenRouter/Custom providers)
- Custom audio attachments
```typescript
import { completionToMessage } from './src';

const response = await llm.prompt("Hello"); // Returns ChatCompletion object
const message = completionToMessage(response);
// { role: 'assistant', content: '...' }
// Add to history
history.push(message);
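
// Multi-turn sketch: feed the converted message back and continue the chat
// (uses the config-object overload from Use Case 1, Level 3)
const followUp = await llm.prompt({
  messages: [...history, { role: 'user', content: 'Tell me more.' }]
});
history.push(completionToMessage(followUp));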
```
---
Utilities: Cache Inspection
Check if a specific prompt is already cached without making an API call (or partial cache check for Zod calls).
Return Type: Promise<boolean>
```typescript
const options = { messages: "Compare 5000 files..." };

// 1. Check Standard Call
if (await llm.isPromptCached(options)) {
console.log("Zero latency result available!");
}
// 2. Check Zod Call (checks exact schema + prompt combo)
if (await llm.isPromptZodCached(options, MySchema)) {
// ...
}
```
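For example, a batch pipeline can serve cached prompts first, so zero-latency results come back before any network call is made. A minimal sketch using only isPromptCached and promptText from above (answerCachedFirst is a hypothetical helper, not part of the library):
```typescript
// Hypothetical helper: answer cached prompts before uncached ones.
async function answerCachedFirst(prompts: string[]): Promise<string[]> {
  const cached: string[] = [];
  const uncached: string[] = [];

  for (const p of prompts) {
    // isPromptCached accepts the same options object as prompt/promptText
    if (await llm.isPromptCached({ messages: p })) cached.push(p);
    else uncached.push(p);
  }

  // Cached prompts resolve from the cache; uncached ones go through the queue/API
  const fast = await Promise.all(cached.map((p) => llm.promptText(p)));
  const slow = await Promise.all(uncached.map((p) => llm.promptText(p)));
  return [...fast, ...slow];
}
```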