# LLM Emulator

Enterprise-grade LLM mock server for local development and CI: scenarios, faults, latency, contracts, VCR recording. Runs as a standalone server or as Express middleware.

Install: `npm install llm-emulator`

LLM Emulator is an enterprise-grade, deterministic, fully offline emulator for LLM providers such as OpenAI, Gemini, and Ollama.
It enables full-stack automated testing—CI, integration tests, E2E flows, multi-agent orchestration flows, and local development—without hitting real LLM APIs, without API keys, and without nondeterministic model drift.
The star of the system is Scenario Graphs: branching, stateful, multi-step scripted interactions that emulate how your LLM-powered agents and workflows behave in production.
Other features include:
- Linear scenarios
- Case-based prompt → response mocking
- HTTP downstream API mocks (for your REST dependencies)
- Fault injection
- Delays
- JSON-schema contract validation
- VCR request recording
- Express middleware integration
---
1. Overview
2. Installation
3. Quick Start
4. Scenario Graphs
5. Linear Scenarios
6. Case-Based Prompt Mocks
7. HTTP Mocking
8. Provider Compatibility
9. Fault Injection
10. Delays
11. Contract Validation
12. VCR Recording
13. Express Middleware
14. CLI Reference
15. Full DSL & Config Documentation
16. License
---

## Overview
Applications today rely on LLM outputs for:
- multi-step conversations
- agent tool calls
- chain-of-thought workflows
- structured output generation
- code generation
- orchestration logic
- multi-agent routing
This makes local testing, CI, and E2E automation incredibly fragile unless you have:
- deterministic outputs
- reproducible flows
- fast execution
- offline capability
- stateful multi-turn interactions
LLM Emulator provides all of this.
---

## Installation

```
npm install llm-emulator --save-dev
```

Or use npx:

```
npx llm-emulator ./mocks/config.mjs
```
---

## Quick Start

```js
import { define, scenario, caseWhen, httpGet } from "llm-emulator";

export default define({
  server: { port: 11434 },
  useScenario: "checkout-graph",
  scenarios: [
    scenario("checkout-graph", {
      start: "collect-name",
      steps: {
        "collect-name": {
          branches: [
            {
              when: "my name is {{name}}",
              if: ({ name }) => name.toLowerCase().includes("declined"),
              reply: "Your application is declined.",
              next: "end-declined",
            },
            {
              when: "my name is {{name}}",
              if: ({ name }) => name.toLowerCase().includes("approved"),
              reply: "Your application is approved!",
              next: "end-approved",
            },
            {
              when: "my name is {{name}}",
              reply: ({ vars }) =>
                `Thanks ${vars.name}, what's your address?`,
              next: "collect-address",
            },
          ],
        },
        "collect-address": {
          branches: [
            {
              when: "my address is {{address}}",
              reply: ({ vars }) =>
                `We will contact you at ${vars.address}.`,
              next: "end-pending",
            },
          ],
        },
        "end-declined": { final: true },
        "end-approved": { final: true },
        "end-pending": { final: true },
      },
    }),
  ],
  cases: [
    caseWhen("explain {{topic}} simply", ({ topic }) =>
      `Simple explanation of ${topic}.`
    ),
  ],
  httpMocks: [
    httpGet("/api/user/:id", ({ params }) => ({
      id: params.id,
      name: "Mock User",
    })),
  ],
  defaults: {
    fallback: "No mock available.",
  },
});
```
Run it:
```
npx llm-emulator ./config.mjs --scenario checkout-graph
```
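With the emulator running, you can point a standard OpenAI client at it. A minimal sketch, assuming the official `openai` npm package and that any placeholder API key is accepted (no real key is used); the base URL follows from `server.port` in the config above:

```js
import OpenAI from "openai";

// Point the SDK at the emulator instead of api.openai.com.
const client = new OpenAI({
  baseURL: "http://localhost:11434/v1",
  apiKey: "not-a-real-key", // placeholder; assumed to be ignored by the emulator
});

const completion = await client.chat.completions.create({
  model: "gpt-4o-mini", // model name is illustrative
  messages: [{ role: "user", content: "my name is Ada" }],
});

// Expected, per the checkout-graph scenario above:
// "Thanks Ada, what's your address?"
console.log(completion.choices[0].message.content);
```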
---

## Scenario Graphs
Scenario Graphs are the primary way to emulate multi-step LLM-driven workflows.
A scenario consists of:
- `start`: the initial state ID
- `steps`: a mapping of state IDs to state definitions
- each state contains one or more branches
- each branch defines:
  - a pattern (`when`)
  - optional guard (`if`)
  - reply (`reply`)
  - next state (`next`)
  - optional delay (`delayMs`)
  - optional tool result (`result`)
  - optional type (`kind`: `"chat"` or `"tools"`)
```js
scenario("checkout-graph", {
  start: "collect-name",
  steps: {
    "collect-name": {
      branches: [
        {
          when: "my name is {{name}}",
          if: ({ name }) => name.toLowerCase().includes("declined"),
          reply: "Declined.",
          next: "end-declined",
        },
        {
          when: "my name is {{name}}",
          if: ({ name }) => name.toLowerCase().includes("approved"),
          reply: "Approved!",
          next: "end-approved",
        },
        {
          when: "my name is {{name}}",
          reply: ({ vars }) => `Hello ${vars.name}. Your address?`,
          next: "collect-address",
        },
      ],
    },
    "collect-address": {
      branches: [
        {
          when: "my address is {{address}}",
          reply: ({ vars }) =>
            `Thanks. We'll mail you at ${vars.address}.`,
          next: "end-pending",
        },
      ],
    },
    "end-declined": { final: true },
    "end-approved": { final: true },
    "end-pending": { final: true },
  },
});
```
Scenario Graphs give you:

- Multi-turn conversation emulation
- Conditional routing
- Stateful flows
- Dynamic replies
- Tool-style responses (see the sketch below)
- Terminal states
- Deterministic behavior
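Branches can also return tool-style responses via `kind: "tools"` and `result`. A minimal sketch; the shape of `result` below (a tool name plus arguments) is an assumption, not a documented schema:

```js
import { scenario } from "llm-emulator";

// Sketch: a branch that returns a tool-style result instead of chat text.
// Adapt the `result` shape to whatever your client parses as a tool call.
scenario("order-lookup", {
  start: "lookup",
  steps: {
    "lookup": {
      branches: [
        {
          when: "look up order {{orderId}}",
          kind: "tools",
          result: {
            name: "getOrderStatus",           // hypothetical tool name
            arguments: { orderId: "mock-123" },
          },
          next: "done",
        },
      ],
    },
    "done": { final: true },
  },
});
```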
---

## Linear Scenarios

For simple ordered scripts:

```js
scenario("simple-linear", {
  steps: [
    { kind: "chat", reply: "Welcome" },
    { kind: "chat", reply: "Next" }
  ]
});
```

These run top-to-bottom.
---

## Case-Based Prompt Mocks

Direct LLM prompt → response mocking:

```js
caseWhen("summarize {{topic}}", ({ topic }) =>
  `Summary of ${topic}`
);
```
Pattern matching supports:

- Template variables (`{{var}}`)
- Looser lexical matching
- Optional fuzzy matching fallback
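Because cases answer the provider endpoints directly, a test can exercise them with plain `fetch`. A minimal sketch, assuming the case above is loaded, the emulator runs on port 11434, and an OpenAI-style response shape is returned:

```js
const res = await fetch("http://localhost:11434/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "mock-model", // illustrative
    messages: [{ role: "user", content: "summarize black holes" }],
  }),
});

const data = await res.json();
// Expected reply from the case: "Summary of black holes"
console.log(data.choices[0].message.content);
```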
---

## HTTP Mocking

Mock downstream REST calls:

```js
httpGet("/api/user/:id", ({ params }) => ({
  id: params.id,
  name: "Mock User",
}));

httpPost("/api/checkout", ({ body }) => ({
  status: "ok",
  orderId: "mock123",
}));
```
Works with:

- GET
- POST
- PUT (see the sketch below)
- DELETE
- Path params (`:id`)
- Query params
- JSON body parsing
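PUT and DELETE handlers take the same `(path, handler)` form and receive the same context. A quick sketch; the routes and response payloads are illustrative:

```js
import { httpPut, httpDelete } from "llm-emulator";

// Hypothetical update route: merges the parsed JSON body into the mock user.
httpPut("/api/user/:id", ({ params, body }) => ({
  id: params.id,
  ...body,
  updated: true,
}));

// Hypothetical delete route.
httpDelete("/api/user/:id", ({ params }) => ({
  id: params.id,
  deleted: true,
}));
```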
---

## Provider Compatibility

LLM Emulator exposes mock endpoints identical to the real providers.

OpenAI:

```
POST /v1/chat/completions
POST /chat/completions
POST /v1/responses
POST /responses
POST /v1/embeddings
```

Embeddings return deterministic fake vectors.

Gemini:

```
POST /v1/models/:model:generateContent
POST /v1alpha/models/:model:generateContent
POST /v1beta/models/:model:generateContent
```

Ollama:

```
POST /api/generate
```
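For example, an Ollama-style call against the emulator. A minimal sketch, assuming the emulator mirrors Ollama's request/response shape for `/api/generate` (`prompt` in, `response` out):

```js
const res = await fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3",           // illustrative model name
    prompt: "explain DNS simply",
    stream: false,             // assumed: non-streaming, single JSON reply
  }),
});

const data = await res.json();
// With the Quick Start config loaded, the "explain {{topic}} simply" case matches.
console.log(data.response);
```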
---

## Fault Injection

Faults can be attached to any:

- branch
- case
- HTTP mock

Examples:

```js
fault: { type: "timeout" }
fault: { type: "http", status: 503 }
fault: { type: "malformed-json" }
fault: { type: "stream-glitch" }
```
---

## Delays

Simulate real-world latency.

Globally:

```
server: { delayMs: 200 }
```

Per branch, case, or step:

```
delayMs: 500
```

Per HTTP route:

```
httpGet("/x", { delayMs: 300 })
```
---

## Contract Validation

Optional JSON-schema validation using Ajv.

Modes:

```
contracts: {
  mode: "strict" | "warn" | "off"
}
```

Validates:
- OpenAI request/response
- Gemini request/response
- Ollama request/response
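A sketch of enabling strict validation, assuming `contracts` sits at the top level of the config passed to `define()` (its exact location is not shown in the config table below):

```js
import { define, caseWhen } from "llm-emulator";

export default define({
  server: { port: 11434 },
  contracts: { mode: "strict" }, // assumed: reject payloads that fail schema validation
  cases: [caseWhen("ping", () => "pong")],
});
```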
---

## VCR Recording

Capture all incoming requests:

```
npx llm-emulator ./config.mjs --record ./recordings
```

Produces `.jsonl` files containing:
- timestamp
- provider
- request JSON
- response JSON
Perfect for test reproducibility and debugging.
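Recordings are plain JSON Lines, so tests can load them directly. A sketch; the file name is hypothetical and the exact key names are assumptions based on the fields listed above:

```js
import { readFileSync } from "node:fs";

// Hypothetical recording file produced by --record ./recordings
const records = readFileSync("./recordings/session.jsonl", "utf8")
  .trim()
  .split("\n")
  .map((line) => JSON.parse(line));

for (const record of records) {
  console.log(record.provider, record.request, record.response); // key names assumed
}
```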
---

## Express Middleware

Mount the emulator in an existing server:

```js
import { createLlmEmulatorRouter } from "llm-emulator";

const emulator = await createLlmEmulatorRouter("./config.mjs");
app.use("/llm-emulator", emulator.express());
```

Now you can point your OpenAI or Gemini application at this route.
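A fuller wiring sketch: the Express app, host port, and mount prefix below are illustrative, and the assumption is that provider paths are served under the mount point, so the OpenAI-style base URL becomes `<mount>/v1`:

```js
import express from "express";
import OpenAI from "openai";
import { createLlmEmulatorRouter } from "llm-emulator";

const app = express();
const emulator = await createLlmEmulatorRouter("./config.mjs");
app.use("/llm-emulator", emulator.express());
app.listen(3000);

// Point the client at the mounted router instead of the real API.
const client = new OpenAI({
  baseURL: "http://localhost:3000/llm-emulator/v1", // assumed path composition
  apiKey: "not-a-real-key",
});
```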
---

## CLI Reference

```
npx llm-emulator ./config.mjs [options]

  --scenario <id>   Select the active scenario (overrides useScenario)
  --record <dir>    Record requests and responses as .jsonl files
  --port <port>     Override the configured server port
  --verbose         Verbose logging
```
---

## Full DSL & Config Documentation

### Top-Level Config

| Field | Description |
|-------|-------------|
| `server.port` | Port to run the mock provider |
| `server.delayMs` | Global delay |
| `useScenario` | Active scenario ID |
| `scenarios[]` | Scenario definitions |
| `cases[]` | Case mocks |
| `httpMocks[]` | HTTP mocks |
| `defaults.fallback` | Default response text |
---

### Scenario Graph DSL

```
scenario(id, {
  start: "state",
  steps: {
    "state": {
      branches: [ ... ]
    },
    "end": { final: true }
  }
})
```

Branch fields:
| Field | Description |
|-------|-------------|
| `when` | Pattern with template vars |
| `if(vars, ctx)` | Optional guard |
| `reply` | String or function |
| `kind` | `"chat"` or `"tools"` |
| `result` | For tool-style replies |
| `next` | Next state ID |
| `delayMs` | Per-branch delay |
| `fault` | Fault injection config |
---

### Linear Scenario DSL

```
scenario(id, {
  steps: [
    { kind, reply, result, delayMs, fault }
  ]
})
```
---

### Case DSL

```
caseWhen(pattern, handler)
```
---

### HTTP Mock DSL

```
httpGet(path, handler)
httpPost(path, handler)
httpPut(path, handler)
httpDelete(path, handler)
```

Handler receives:

```
{ params, query, body, headers }
```

Supports per-route:

- delays
- faults
- dynamic replies (see the sketch below)
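For instance, a dynamic reply built from query parameters (the route and payload are illustrative):

```js
import { httpGet } from "llm-emulator";

// Hypothetical search route: echoes the `q` query parameter into a mock payload.
httpGet("/api/search", ({ query }) => ({
  query: query.q,
  results: [`mock result for ${query.q}`],
}));
```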
---

## License

MIT