Interactive CLI for learning LLM evaluations - like Gemini CLI but for evals
> Master LLM Evaluations in Your Terminal
An interactive CLI for learning LLM evaluations hands-on. Built like Gemini CLI and Claude CLI — beautiful terminal UI, step-by-step lessons, live exercises, and progress tracking.
```
╔═════════════════════════════════════════════════════════════════╗
║ ║
║ EVALSCHOOL ║
║ Master LLM Evaluations ║
║ ║
╚═════════════════════════════════════════════════════════════════╝
```

```bash
# Install globally
npm install -g @ankurpm/evalschool
```

Features
- 12 Interactive Lessons — Comprehensive curriculum across 4 modules
- Live Exercises — Run real Gemini API calls and see evaluation results
- Quizzes — Test your understanding after each topic
- Progress Tracking — Pick up where you left off
- Offline Mode — Works without API key using pre-computed examples
- Beautiful TUI — Polished terminal interface with colors, tables, and diagrams
Installation
npm
```bash
npm install -g @ankurpm/evalschool
evalschool
```

Homebrew
```bash
# Add the tap and install
brew tap ankshvayt/evalschool
brew install evalschool

# Start learning
evalschool
```

Updating:
```bash
brew update && brew upgrade evalschool
```

Uninstalling:
```bash
brew uninstall evalschool
brew untap ankshvayt/evalschool
```

From source
```bash
git clone https://github.com/ankshvayt/evalschool.git
cd evalschool
npm install
npm run build
npm link
evalschool
```

Usage
Interactive mode
```bash
evalschool
```

Launches the full interactive experience with module selection, lessons, quizzes, and exercises.
Commands

| Command | Description |
|---------|-------------|
| `evalschool` | Interactive menu |
| `evalschool lesson` | Jump to a specific lesson (e.g., 1.1, 2.3) |
| `evalschool setup` | Configure Gemini API key |
| `evalschool progress` | View completion status |
| `evalschool list` | List all available lessons |
| `evalschool --help` | Show help |
| `evalschool --version` | Show version |

Examples
```bash
# Start from the beginning
evalschool lesson 1.1

# Jump to RAG evaluation
evalschool lesson 3.1

# Check your progress
evalschool progress

# Configure API for live exercises
evalschool setup
```

Curriculum
Module 1

| Lesson | Topic |
|--------|-------|
| 1.1 | Introduction to LLM Evaluations |
| 1.2 | Environment Setup & Configuration |
| 1.3 | Basic Evaluation Metrics |

Module 2

| Lesson | Topic |
|--------|-------|
| 2.1 | Semantic Similarity & G-Eval |
| 2.2 | Hallucination Detection |
| 2.3 | Bias & Toxicity Detection |

Module 3

| Lesson | Topic |
|--------|-------|
| 3.1 | RAG-Specific Metrics |
| 3.2 | End-to-End RAG Evaluation |

Module 4

| Lesson | Topic |
|--------|-------|
| 4.1 | Multi-Turn Conversation Evaluation |
| 4.2 | Custom Metrics Development |
| 4.3 | CI/CD Integration |
| 4.4 | A/B Testing for LLMs |

API Key Setup
Many exercises include live API calls to demonstrate real evaluations. Get your free Gemini API key:
1. Visit Google AI Studio
2. Click "Get API Key"
3. Copy your key
4. Run `evalschool setup` and paste it

Or set via environment variable:
```bash
export GEMINI_API_KEY=your_key_here
```

> Note: The CLI works without an API key using pre-computed examples, but live exercises provide the best learning experience.

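
The fallback described in the note can be pictured roughly as follows. This is a minimal sketch, not EvalSchool's actual code: `resolveMode`, the `Mode` type, and the `storedKey` parameter are hypothetical names, and the precedence of the environment variable over a stored key is an assumption. Only the `GEMINI_API_KEY` variable name comes from the instructions above.

```typescript
// Hypothetical sketch: choose between live and offline exercises.
// Only the GEMINI_API_KEY variable name is taken from this README.
type Mode =
  | { kind: "live"; apiKey: string }
  | { kind: "offline" };

function resolveMode(
  env: Record<string, string | undefined>,
  storedKey?: string // assumed: a key saved earlier via `evalschool setup`
): Mode {
  // Assumption: the environment variable wins over the stored key.
  for (const candidate of [env["GEMINI_API_KEY"], storedKey]) {
    const key = candidate !== undefined ? candidate.trim() : "";
    if (key.length > 0) {
      return { kind: "live", apiKey: key };
    }
  }
  // No usable key anywhere: fall back to pre-computed examples.
  return { kind: "offline" };
}

console.log(resolveMode({ GEMINI_API_KEY: "your_key_here" }).kind); // live
console.log(resolveMode({}).kind); // offline
```

Passing the environment in as a plain object, rather than reading it inside the function, keeps the sketch easy to test without touching real credentials.
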
What You'll Learn
By completing all lessons, you'll master:
- ✅ LLM evaluation fundamentals and why they matter
- ✅ Answer relevancy, coherence, and fluency metrics
- ✅ Semantic similarity using embeddings
- ✅ G-Eval methodology with LLM judges
- ✅ Hallucination detection and mitigation
- ✅ Bias and toxicity scanning
- ✅ RAG-specific metrics (faithfulness, precision, recall)
- ✅ Multi-turn conversation evaluation
- ✅ Building custom evaluation metrics
- ✅ CI/CD integration with GitHub Actions
- ✅ A/B testing prompts with statistical rigor
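
As a taste of the "semantic similarity using embeddings" topic, the sketch below computes cosine similarity between two vectors. It is an illustration under stated assumptions rather than lesson code: `cosineSimilarity` is a hypothetical helper, the three-dimensional vectors are toy stand-ins for real model embeddings, and returning 0 for a zero vector is a convention chosen here.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Convention chosen for this sketch: a zero vector is similar to nothing.
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy "embeddings"; real ones would come from an embedding model.
const referenceVec = [1, 0, 1];
const answerVec = [1, 1, 1];
console.log(cosineSimilarity(referenceVec, answerVec).toFixed(3)); // 0.816
```

Scores near 1 mean the two texts point in nearly the same direction in embedding space, and scores near 0 mean they are unrelated. Any pass/fail threshold is a judgment call that depends on the embedding model.
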
Tech Stack
- TypeScript — Type-safe codebase
- Commander — CLI argument parsing
- Chalk — Terminal colors
- Conf — Local configuration storage
- @google/generative-ai — Gemini API SDK
Project Structure
```
evalschool-cli/
├── src/
│   ├── index.ts              # Entry point
│   ├── cli.ts                # Commander setup
│   ├── core/
│   │   ├── terminal.ts       # TUI utilities
│   │   ├── config.ts         # Progress & settings
│   │   └── gemini.ts         # API integration
│   ├── ui/
│   │   └── menu.ts           # Interactive menus
│   └── lessons/
│       ├── index.ts          # Lesson registry
│       ├── m1_intro.ts       # Module 1 lessons
│       ├── m2_semantic.ts    # Module 2 lessons
│       ├── m3_rag_metrics.ts # Module 3 lessons
│       └── m4_*.ts           # Module 4 lessons
├── package.json
├── tsconfig.json
└── README.md
```

Development
```bash
# Install dependencies
npm install

# Build
npm run build

# Run in development
npm run dev

# Watch mode
npm run watch

# Link globally for testing
npm link
```

Contributing
Contributions welcome! Please read our contributing guidelines before submitting PRs.
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run `npm run build` to verify

- DeepEval Documentation
- Google AI Studio
- LLM Evaluation Best Practices
MIT © ankshvayt
---
Happy Evaluating! 🚀