LLM eval & testing toolkit
`npm install promptfoo`
promptfoo is a developer-friendly local tool for testing LLM applications. Stop the trial-and-error approach - start shipping secure, reliable AI apps.
Website · Getting Started · Red Teaming · Documentation · Discord
```sh
# Install and initialize a project
npx promptfoo@latest init
```
See Getting Started (evals) or Red Teaming (vulnerability scanning) for more.
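A project is driven by a `promptfooconfig.yaml` at the repo root. The sketch below is a minimal, hypothetical example: the description, prompt text, model ids, and assertion values are illustrative placeholders, not recommendations.

```yaml
# promptfooconfig.yaml - minimal eval sketch (example values; adjust to your setup)
description: "Quick sanity check"

prompts:
  - "Write a one-sentence summary of: {{topic}}"

# Listing multiple providers compares models side-by-side in the results matrix
providers:
  - openai:gpt-4o-mini
  - openai:gpt-4o

tests:
  - vars:
      topic: "the history of the printing press"
    assert:
      - type: contains
        value: "printing"
```

With a config in place, `npx promptfoo@latest eval` runs the tests and `npx promptfoo@latest view` opens the results in the web viewer.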
- Test your prompts and models with automated evaluations
- Secure your LLM apps with red teaming and vulnerability scanning
- Compare models side-by-side (OpenAI, Anthropic, Azure, Bedrock, Ollama, and more)
- Automate checks in CI/CD (see the workflow sketch after this list)
- Review pull requests for LLM-related security and compliance issues with code scanning
- Share results with your team
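For the CI/CD point above, one possible shape of a GitHub Actions job is sketched below. The workflow name, trigger, and secret name are assumptions, and it presumes a `promptfooconfig.yaml` at the repository root.

```yaml
# .github/workflows/promptfoo.yml - hypothetical CI sketch
name: LLM evals
on: [pull_request]

jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      # Runs the eval defined in promptfooconfig.yaml; a non-zero exit on
      # assertion failures fails the job
      - run: npx promptfoo@latest eval
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
```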
Here's what it looks like in action:
*(screenshot: prompt evaluation matrix in the web viewer)*
It works on the command line too:
*(screenshot: prompt evaluation matrix on the command line)*
It can also generate security vulnerability reports.
- Developer-first: Fast, with features like live reload and caching
- Private: LLM evals run 100% locally - your prompts never leave your machine
- Flexible: Works with any LLM API or programming language
- Battle-tested: Powers LLM apps serving 10M+ users in production
- Data-driven: Make decisions based on metrics, not gut feel
- Open source: MIT licensed, with an active community
- Full Documentation
- Red Teaming Guide
- Getting Started
- CLI Usage
- Node.js Package (see the usage sketch after this list)
- Supported Models
- Code Scanning Guide
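To complement the Node.js Package link above, here is a rough sketch of programmatic usage via the package's `evaluate()` entry point. Treat the option names, provider id, and assertion type as assumptions to verify against the current API docs.

```ts
// eval.ts - hypothetical programmatic eval (check option names against the Node.js docs)
import promptfoo from 'promptfoo';

async function main() {
  const results = await promptfoo.evaluate({
    prompts: ['Answer yes or no: is {{city}} the capital of {{country}}?'],
    providers: ['openai:gpt-4o-mini'], // example provider id
    tests: [
      {
        vars: { city: 'Paris', country: 'France' },
        assert: [{ type: 'icontains', value: 'yes' }],
      },
    ],
  });
  // Inspect raw results; the web viewer or CLI output is usually more convenient
  console.log(JSON.stringify(results, null, 2));
}

main().catch(console.error);
```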
We welcome contributions! Check out our contributing guide to get started.
Join our Discord community for help and discussion.