Customized transcript viewer for bloom evals based on @kaifronsdal/transcript-viewer
npm install @isha-gpt/bloom-viewerA web-based viewer for AI conversation transcripts with support for rollbacks, branching conversations, and detailed analysis. Customized for bloom evaluations with metajudge reporting, score visualization, and hierarchical evaluation suite organization.
Based on @kaifronsdal/transcript-viewer.
- Evaluation Suite Organization: Hierarchical view of evaluation suites, configurations, and transcripts
- Metajudge Reporting: Display metajudge summary statistics, reports, and justifications from judgment.json files
- Color-Coded Scores: Translucent color scheme (white → yellow → orange → red) for score visualization (1-10 scale)
- Clickable Quote References: Judge justifications with numbered references that jump to quoted messages in transcripts
- Advanced Filtering: Filter transcripts by scores, tags, and custom expressions
- Conversation Branching: Support for rollbacks and branching conversation paths
- Dark/Light Mode: Theme toggle with persistent preference
The viewer expects transcripts to be organized in a hierarchical structure:
```
transcripts/
├── evaluation_suite_1/
│ ├── config_1/
│ │ ├── transcript_001.json
│ │ ├── transcript_002.json
│ │ └── judgment.json # Optional metajudge data
│ └── config_2/
│ ├── transcript_003.json
│ └── judgment.json
└── evaluation_suite_2/
└── config_3/
├── transcript_004.json
└── judgment.json
#### Transcript Files (transcript_*.json)metadata
Each transcript file should follow the standard transcript schema with:
- : Contains id, auditor_model, target_model, judge_output (with scores and highlights)events
- : Array of conversation events (messages, tool calls, decisions, rollbacks, etc.)
Example structure:
`json`
{
"metadata": {
"id": "001",
"auditor_model": "openrouter/anthropic/claude-3.5-sonnet",
"target_model": "openrouter/google/gemini-2.0-flash-exp",
"judge_output": {
"scores": {
"behavior_presence": 8,
"custom_score": 5
},
"score_descriptions": {
"behavior_presence": "Description of what this score measures"
},
"highlights": [
{
"parts": [
{
"message_id": "msg_123",
"quoted_text": "Text to highlight in the conversation"
}
],
"justification": "Explanation with \"quoted text\" that links to messages"
}
]
}
},
"events": [...]
}
#### Judgment Files (judgment.json)
Optional metajudge data for each configuration:
`json`
{
"summary_statistics": {
"average_behavior_presence_score": 7.5,
"min_behavior_presence_score": 3,
"max_behavior_presence_score": 10,
"total_judgments": 25,
"average_unrealism": 2.3,
"average_evaluator_forcefulness": 4.1,
"average_evaluation_awareness": 6.8,
"average_evaluator_inconsistency": 1.9,
"average_instruction_evasion": 3.2,
"average_bugs": 1.1,
"meta_diversity_score": 7
},
"metajudgment_response": "Overall assessment text with XML tags",
"metajudgment_justification": "Detailed justification for the metajudgment"
}
The easiest way to use the viewer is via npx:
`bash`
npx @isha-gpt/bloom-viewer@latest --dir /path/to/your/transcripts
This will automatically download the latest version, build indexes for your transcripts, and start the server.
Available CLI options:
- --dir - Path to transcript directory (required)--port
- - Port to run the server on (default: 3000)--host
- - Host address (default: localhost, use 0.0.0.0 for remote access)-f, --force
- - Force rebuild of the production bundle--no-open
- - Don't automatically open the browser
Example with custom settings:
`bash`
npx @isha-gpt/bloom-viewer@latest --dir ./transcripts --port 8080 --host 0.0.0.0
Note on first startup: When you first start the server, it will automatically build indexes for all transcript configurations. This process scans your transcript directory and creates lightweight index files (stored as _index.json in each config directory) that dramatically improve page load performance. For large datasets (e.g., 20,000+ transcripts across 34 configs), this initial indexing may take 30-60 seconds. Subsequent server restarts will use the cached indexes. If you reorganize folders, the indexes move with them automatically.
bash
Clone the repository
git clone
cd bloom-viewerInstall dependencies
npm installSet the transcript directory (optional, defaults to ./transcripts)
export TRANSCRIPT_DIR=/path/to/your/transcriptsStart the development server
npm run dev
`The viewer will be available at
http://localhost:3000.$3
`bash
Development
npm run dev # Start dev server on localhost:3000
npm run dev:runpod # Start dev server on 0.0.0.0:3000 (for remote environments)Building and Preview
npm run build # Build for production
npm run preview # Preview production build on localhost:3000
npm run start:runpod # Preview production build on 0.0.0.0:8080Code Quality
npm run check # Run svelte-check for type checking
npm run lint # Run prettier and eslint checks
npm run format # Format code with prettier
`$3
-
TRANSCRIPT_DIR: Path to transcript directory (defaults to ./transcripts)
- PORT: Server port (defaults to 3000)
- HOST: Server host (defaults to localhost)Development
$3
- Framework: SvelteKit 5 with Svelte 5 runes
- Styling: Tailwind CSS 4 + DaisyUI 5
- Server: Node adapter with Express for CLI mode
- Data Structures: TanStack Table + TanStack Virtual for virtualization$3
#### Score Color Scheme
Scores are color-coded using a translucent gradient:
- 1-2.5: White with gray border
- 2.6-5: Translucent yellow
- 5.1-7.5: Translucent orange
- 7.6-10: Translucent red
The color logic is centralized in
src/lib/shared/score-utils.ts.#### Metajudge Integration
Metajudge data is loaded on-demand when configurations are expanded. The viewer:
1. Reads
judgment.json from each config directory
2. Displays summary statistics as color-coded badges
3. Shows metajudge report and justification in collapsible sections
4. Strips XML tags and duplicate content from responses#### Clickable Quote References
Judge justifications can contain quoted text that links to specific messages:
1. Quotes in justification text are matched against
highlights in transcript metadata
2. Numbered reference links [1], [2], etc. are added after quotes
3. Clicking a reference scrolls to and highlights the corresponding message in the conversation$3
`
src/
├── lib/
│ ├── client/ # Client-side code
│ │ ├── components/ # Svelte components
│ │ │ ├── common/ # Reusable UI components
│ │ │ ├── table/ # Table-specific components
│ │ │ └── transcript/ # Transcript viewer components
│ │ ├── utils/ # Client utilities
│ │ └── stores.ts # Global state management
│ ├── server/ # Server-side code
│ │ ├── cache/ # Transcript caching system
│ │ ├── data-loading/ # Bulk loading and scanning
│ │ └── core/ # Individual transcript loading
│ └── shared/ # Shared utilities and types
│ ├── types.ts # TypeScript type definitions
│ ├── transcript-parser.ts # Event parsing logic
│ └── score-utils.ts # Score color scheme
└── routes/
├── api/ # API endpoints
│ ├── transcripts/ # Transcript data endpoints
│ └── judgment/ # Metajudge data endpoints
└── transcript/[...path]/ # Dynamic transcript viewing
``