Streaming AI backend server with cost controls, rate limiting, and cancellation support.
> Streaming AI backend server for Chat Nest with built-in cost protection and cancellation propagation, streaming over Server-Sent Events (SSE).
This package exposes an Express-compatible request handler that:
- Streams AI responses token by token over Server-Sent Events (SSE)
- Enforces rate limits and profile-based token limits
- Supports end-to-end abort propagation
- Protects against runaway usage
---
- Server-Sent Events (SSE) streaming over HTTP with real-time token delivery
- SSE event types: `start`, `token`, `done`, `error`, `ping`
- Heartbeat pings to keep the connection alive
- End-to-end cancellation support
- Daily token limit enforcement (profile-based)
- Rate limiting
- Message trimming
- Safe retry semantics
- OpenAI adapter included
---
```bash
npm install chat-nest-server
```
The handler automatically uses Server-Sent Events (SSE) for streaming responses:
```ts
import express from "express";
import cors from "cors";
import { createChatHandler } from "chat-nest-server";

const app = express();
app.use(cors());
app.use(express.json());

app.post(
  "/api/chat",
  createChatHandler({
    apiKey: process.env.OPENAI_API_KEY!,
  })
);

app.listen(3001, () => {
  console.log("API running on http://localhost:3001");
});
```
The handler sends SSE-formatted events:
- `event: start` – stream started
- `event: token` – a token chunk (the text is in the `data:` field)
- `event: done` – stream completed
- `event: error` – an error occurred
- `event: ping` – heartbeat (every 15 seconds)
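Because the endpoint is a POST route, the browser `EventSource` API cannot be used directly. Here is a minimal client sketch using `fetch`, assuming the event names above and a `messages` body as shown later; the parsing is deliberately simplified rather than a spec-complete SSE parser, and aborting the controller cancels the request end to end:

```ts
const controller = new AbortController();

async function streamChat(messages: { role: string; content: string }[]) {
  const res = await fetch("http://localhost:3001/api/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages }),
    signal: controller.signal, // controller.abort() cancels end to end
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    // SSE events are separated by a blank line.
    const events = buffer.split("\n\n");
    buffer = events.pop() ?? "";

    for (const event of events) {
      const type = event.match(/^event: (.*)$/m)?.[1];
      const data = event.match(/^data: (.*)$/m)?.[1] ?? "";
      if (type === "token") process.stdout.write(data);
      else if (type === "error") throw new Error(data || "stream error");
      else if (type === "done") return;
      // "start" and "ping" need no handling here.
    }
  }
}
```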
---
| Variable | Description |
| -------------- | -------------- |
| OPENAI_API_KEY | OpenAI API key |
Profiles (`constrained`, `balanced`, `expanded`) control the limits, and `HARD_CAPS` clamps every profile. The server enforces:
- Maximum tokens per request – profile-based; the client can override via `maxTokensPerRequest` (the server applies min(profile limit, client value))
- Daily token limit – profile-based; the client can override via `dailyTokenLimit` (same min rule)
- Request rate limiting (profile-based)
- Prompt size trimming (profile-based)
- Retry classification
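The override rule is plain min-semantics; a tiny sketch (the function name is hypothetical, not part of the package's public API):

```ts
// Illustration of the min rule: a client-supplied value can only lower a
// limit, never raise it above what the profile allows.
function effectiveLimit(profileLimit: number, clientValue?: number): number {
  return clientValue === undefined
    ? profileLimit
    : Math.min(profileLimit, clientValue);
}

// profile allows 1024 tokens/request, client asks for 4096 -> 1024
// profile allows 1024 tokens/request, client asks for 256  -> 256
```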
Request body fields:

| Field | Type | Description |
| ------------------------------ | ------ | ------------------------------------------------------------------------------------ |
| `messages` | array | Chat messages (required) |
| `profile` / `aiUsageProfile` | string | Profile: `constrained`, `balanced`, `expanded` (legacy: `budget`, `moderate`, `free`) |
| `dailyTokenLimit` | number | Optional. Caps daily tokens; the server uses min(profile limit, this) |
| `maxTokensPerRequest` | number | Optional. Caps tokens per request; the server uses min(profile limit, this) |
This prevents accidental overspending and abuse.
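For example, a request that opts into a tighter per-request cap (the message shape is assumed to follow the OpenAI chat format):

```ts
// Illustrative request body; field values are examples only.
const body = {
  messages: [{ role: "user", content: "Summarize this repo in one line." }],
  profile: "balanced",
  maxTokensPerRequest: 512, // server applies min(profile limit, 512)
};

await fetch("http://localhost:3001/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(body),
});
```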
This package uses the SSE protocol for efficient streaming:
- `Content-Type: text/event-stream`
- `Connection: keep-alive`
- `Cache-Control: no-cache`
- Heartbeat: a ping every 15 seconds keeps the connection alive
- Event format: `event: <name>\ndata: <payload>\n\n`
Compared with traditional polling or plain chunked responses, SSE delivers tokens in real time over a single long-lived HTTP connection with minimal per-update overhead.
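For reference, a sketch of what that handshake and heartbeat look like on the wire (an assumed illustration; the real handler's internals may differ):

```ts
import type { Response } from "express";

// Illustrative only: open an SSE stream with the headers listed above and
// start the 15-second heartbeat. chat-nest-server does this internally.
function openSseStream(res: Response): void {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });

  const ping = setInterval(() => {
    res.write("event: ping\ndata: \n\n");
  }, 15_000);

  // Stop the heartbeat when the client disconnects.
  res.on("close", () => clearInterval(ping));
}
```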
---
- `config/profiles.ts` – AI usage profiles (`constrained`, `balanced`, `expanded`), `HARD_CAPS`, and `resolveProfile()`. The frontend may send an `aiUsageProfile` string; the backend resolves it safely.
- `config/aiLimits.ts` – `AI_MODEL` only. All limits come from profiles.
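A sketch of what safe resolution might look like (hypothetical: the legacy-name mapping is inferred from the parallel ordering in the request-body table above, and the fallback default is an assumption, not documented behavior):

```ts
type ProfileName = "constrained" | "balanced" | "expanded";

// Assumed legacy mapping; verify against config/profiles.ts.
const LEGACY_NAMES: Record<string, ProfileName> = {
  budget: "constrained",
  moderate: "balanced",
  free: "expanded",
};

function resolveProfile(input?: string): ProfileName {
  if (input === "constrained" || input === "balanced" || input === "expanded") {
    return input;
  }
  // Unknown or missing input falls back to a safe default (assumed).
  return (input && LEGACY_NAMES[input]) || "balanced";
}
```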
---
ISC