An easy-to-use CLI for generating JSONL datasets from a TXT file using LLMs.
Install globally:
```bash
npm i -g @teichai/datagen
```
Or install locally and run via npx:
```bash
npm i -D @teichai/datagen
npx datagen --help
```
Run tests:
```bash
npm test
```
Set your OpenRouter API key:
```bash
export API_KEY="your_openrouter_key"
```
Create a prompts file where each line is a prompt:
```text
Explain the CAP theorem in simple terms.
Write a Python function to reverse a linked list.
```
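One way to create such a file from the shell is a heredoc. This is a minimal sketch that reuses the two sample prompts and assumes the file is named `prompts.txt`; any file name should work as long as the same path is passed to `--prompts`.

```bash
# Write one prompt per line. Quoting 'EOF' stops the shell from
# expanding $variables or backticks inside the prompts.
cat > prompts.txt <<'EOF'
Explain the CAP theorem in simple terms.
Write a Python function to reverse a linked list.
EOF

# Sanity check: print how many lines (prompts) the file contains.
wc -l < prompts.txt
```

Keeping each prompt on a single line matters here, since the file is read one prompt per line.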
Run:
```bash
datagen --model openai/gpt-4o-mini --prompts prompts.txt
```
Note: On startup, `datagen` does a quick best-effort check for a newer npm version and prints an upgrade command if one is available. Disable with `DATAGEN_DISABLE_UPDATE_CHECK=1`.
Development (build + run once):
```bash
API_KEY="your_openrouter_key" npm run dev -- --model openai/gpt-4o-mini --prompts prompts.txt
```
Options:
- `--help`: show the help message and exit.
- `--version`: print the CLI version and exit.
- `--config`: set a config file.
- `--model`: required model name.
- `--prompts`: required prompts file.
- `--out`: output JSONL (default `dataset.jsonl`).
- `--api`: API base (default OpenRouter).
- `--system`: optional system prompt.
- `--store-system true|false`: store system message in output (default `true`).
- `--concurrent`: number of in-flight requests (default 1).
- `--openrouter.provider`: comma-separated provider slugs to try in order (OpenRouter only).
- `--openrouter.providerSort`: provider routing sort (OpenRouter only).
- `--reasoningEffort`: pass through as `reasoning.effort`.
- `--no-progress`: disable the progress bar.
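
Several of these options can be combined in one run. The sketch below uses only flags from the list above. The output file name, the system prompt text, and the concurrency of 4 are illustrative assumptions, not recommended values. The guard runs `datagen` only when it is installed and `API_KEY` is set, and otherwise just prints the command it would have run.

```bash
# Illustrative values (assumptions, not defaults of the tool).
OUT_FILE="cap-dataset.jsonl"
SYSTEM_PROMPT="You are a concise technical tutor."

# Collect the documented flags as positional parameters.
set -- \
  --model openai/gpt-4o-mini \
  --prompts prompts.txt \
  --out "$OUT_FILE" \
  --system "$SYSTEM_PROMPT" \
  --store-system false \
  --concurrent 4 \
  --no-progress

# Run only if the CLI is on PATH and an API key is present;
# otherwise print the command for inspection.
if command -v datagen >/dev/null 2>&1 && [ -n "${API_KEY:-}" ]; then
  datagen "$@"
else
  echo "datagen $*"
fi
```

Passing `--store-system false` here is meant to keep the system message out of the written records, and `--no-progress` suits non-interactive runs such as CI logs.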