bloom Evals - Transcript Viewer

A web-based viewer for AI conversation transcripts with support for rollbacks, branching conversations, and detailed analysis. Customized for bloom evaluations with metajudge reporting, score visualization, and hierarchical evaluation suite organization.

Based on @kaifronsdal/transcript-viewer.

Features

- Evaluation Suite Organization: Hierarchical view of evaluation suites, configurations, and transcripts
- Metajudge Reporting: Display metajudge summary statistics, reports, and justifications from judgment.json files
- Color-Coded Scores: Translucent color scheme (white → yellow → orange → red) for score visualization (1-10 scale)
- Clickable Quote References: Judge justifications with numbered references that jump to quoted messages in transcripts
- Advanced Filtering: Filter transcripts by scores, tags, and custom expressions
- Conversation Branching: Support for rollbacks and branching conversation paths
- Dark/Light Mode: Theme toggle with persistent preference

Expected Folder Structure

The viewer expects transcripts to be organized in a hierarchical structure:

``transcripts/ ├── evaluation_suite_1/ │ ├── config_1/ │ │ ├── transcript_001.json │ │ ├── transcript_002.json │ │ └── judgment.json # Optional metajudge data │ └── config_2/ │ ├── transcript_003.json │ └── judgment.json └── evaluation_suite_2/ └── config_3/ ├── transcript_004.json └── judgment.json`

`$3`

#### Transcript Files (transcript_*.json) Each transcript file should follow the standard transcript schema with: -metadata: Contains id, auditor_model, target_model, judge_output(with scores and highlights) -events: Array of conversation events (messages, tool calls, decisions, rollbacks, etc.)

Example structure:`json { "metadata": { "id": "001", "auditor_model": "openrouter/anthropic/claude-3.5-sonnet", "target_model": "openrouter/google/gemini-2.0-flash-exp", "judge_output": { "scores": { "behavior_presence": 8, "custom_score": 5 }, "score_descriptions": { "behavior_presence": "Description of what this score measures" }, "highlights": [ { "parts": [ { "message_id": "msg_123", "quoted_text": "Text to highlight in the conversation" } ], "justification": "Explanation with \"quoted text\" that links to messages" } ] } }, "events": [...] }`

#### Judgment Files (judgment.json) Optional metajudge data for each configuration:

`json { "summary_statistics": { "average_behavior_presence_score": 7.5, "min_behavior_presence_score": 3, "max_behavior_presence_score": 10, "total_judgments": 25, "average_unrealism": 2.3, "average_evaluator_forcefulness": 4.1, "average_evaluation_awareness": 6.8, "average_evaluator_inconsistency": 1.9, "average_instruction_evasion": 3.2, "average_bugs": 1.1, "meta_diversity_score": 7 }, "metajudgment_response": "Overall assessment text with XML tags", "metajudgment_justification": "Detailed justification for the metajudgment" }`

`Quick Start`

`$3`

The easiest way to use the viewer is via npx:

`bash npx @isha-gpt/bloom-viewer@latest --dir /path/to/your/transcripts`

This will automatically download the latest version, build indexes for your transcripts, and start the server.

Available CLI options: ---dir - Path to transcript directory (required) ---port - Port to run the server on (default: 3000) ---host

 - Host address (default: localhost, use 0.0.0.0 for remote access)
-

-f, --force

 - Force rebuild of the production bundle
-

--no-open - Don't automatically open the browser

Example with custom settings:`bash npx @isha-gpt/bloom-viewer@latest --dir ./transcripts --port 8080 --host 0.0.0.0`

Note on first startup: When you first start the server, it will automatically build indexes for all transcript configurations. This process scans your transcript directory and creates lightweight index files (stored as _index.json in each config directory) that dramatically improve page load performance. For large datasets (e.g., 20,000+ transcripts across 34 configs), this initial indexing may take 30-60 seconds. Subsequent server restarts will use the cached indexes. If you reorganize folders, the indexes move with them automatically.

`$3`


- Node.js 18+
- npm or pnpm
$3

bash
Clone the repository

git clone 
cd bloom-viewer
Install dependencies

npm install
Set the transcript directory (optional, defaults to ./transcripts)

export TRANSCRIPT_DIR=/path/to/your/transcripts
Start the development server

npm run dev

The viewer will be available at http://localhost:3000.

`$3`

`bash

`Development`


npm run dev                # Start dev server on localhost:3000
npm run dev:runpod        # Start dev server on 0.0.0.0:3000 (for remote environments)
Building and Preview

npm run build             # Build for production
npm run preview           # Preview production build on localhost:3000
npm run start:runpod      # Preview production build on 0.0.0.0:8080
Code Quality

npm run check             # Run svelte-check for type checking
npm run lint              # Run prettier and eslint checks
npm run format            # Format code with prettier

$3

- TRANSCRIPT_DIR: Path to transcript directory (defaults to ./transcripts) -PORT: Server port (defaults to 3000) -HOST: Server host (defaults to localhost)

`Development`

`$3`


- Framework: SvelteKit 5 with Svelte 5 runes
- Styling: Tailwind CSS 4 + DaisyUI 5
- Server: Node adapter with Express for CLI mode
- Data Structures: TanStack Table + TanStack Virtual for virtualization
$3
#### Score Color Scheme
Scores are color-coded using a translucent gradient:
- 1-2.5: White with gray border
- 2.6-5: Translucent yellow
- 5.1-7.5: Translucent orange
- 7.6-10: Translucent red

The color logic is centralized in src/lib/shared/score-utils.ts.

#### Metajudge Integration Metajudge data is loaded on-demand when configurations are expanded. The viewer: 1. Readsjudgment.jsonfrom each config directory 2. Displays summary statistics as color-coded badges 3. Shows metajudge report and justification in collapsible sections 4. Strips XML tags and duplicate content from responses

#### Clickable Quote References Judge justifications can contain quoted text that links to specific messages: 1. Quotes in justification text are matched againsthighlightsin transcript metadata 2. Numbered reference links[1], [2], etc. are added after quotes 3. Clicking a reference scrolls to and highlights the corresponding message in the conversation

`$3`

`src/ ├── lib/ │ ├── client/ # Client-side code │ │ ├── components/ # Svelte components │ │ │ ├── common/ # Reusable UI components │ │ │ ├── table/ # Table-specific components │ │ │ └── transcript/ # Transcript viewer components │ │ ├── utils/ # Client utilities │ │ └── stores.ts # Global state management │ ├── server/ # Server-side code │ │ ├── cache/ # Transcript caching system │ │ ├── data-loading/ # Bulk loading and scanning │ │ └── core/ # Individual transcript loading │ └── shared/ # Shared utilities and types │ ├── types.ts # TypeScript type definitions │ ├── transcript-parser.ts # Event parsing logic │ └── score-utils.ts # Score color scheme └── routes/ ├── api/ # API endpoints │ ├── transcripts/ # Transcript data endpoints │ └── judgment/ # Metajudge data endpoints └── transcript/[...path]/ # Dynamic transcript viewing``

bloom Evals - Transcript Viewer

Based on @kaifronsdal/transcript-viewer.

Features

Expected Folder Structure

The viewer expects transcripts to be organized in a hierarchical structure:

`$3`

#### Judgment Files (judgment.json) Optional metajudge data for each configuration:

`Quick Start`

`$3`

The easiest way to use the viewer is via npx:

`bash npx @isha-gpt/bloom-viewer@latest --dir /path/to/your/transcripts`

This will automatically download the latest version, build indexes for your transcripts, and start the server.

Available CLI options: ---dir - Path to transcript directory (required) ---port - Port to run the server on (default: 3000) ---host

 - Host address (default: localhost, use 0.0.0.0 for remote access)
-

-f, --force

 - Force rebuild of the production bundle
-

--no-open - Don't automatically open the browser

Example with custom settings:`bash npx @isha-gpt/bloom-viewer@latest --dir ./transcripts --port 8080 --host 0.0.0.0`

`$3`


- Node.js 18+
- npm or pnpm
$3

bash
Clone the repository

git clone 
cd bloom-viewer
Install dependencies

npm install
Set the transcript directory (optional, defaults to ./transcripts)

export TRANSCRIPT_DIR=/path/to/your/transcripts
Start the development server

npm run dev

The viewer will be available at http://localhost:3000.

`$3`

`bash

`Development`


npm run dev                # Start dev server on localhost:3000
npm run dev:runpod        # Start dev server on 0.0.0.0:3000 (for remote environments)
Building and Preview

npm run build             # Build for production
npm run preview           # Preview production build on localhost:3000
npm run start:runpod      # Preview production build on 0.0.0.0:8080
Code Quality

npm run check             # Run svelte-check for type checking
npm run lint              # Run prettier and eslint checks
npm run format            # Format code with prettier

$3

- TRANSCRIPT_DIR: Path to transcript directory (defaults to ./transcripts) -PORT: Server port (defaults to 3000) -HOST: Server host (defaults to localhost)

`Development`

`$3`


- Framework: SvelteKit 5 with Svelte 5 runes
- Styling: Tailwind CSS 4 + DaisyUI 5
- Server: Node adapter with Express for CLI mode
- Data Structures: TanStack Table + TanStack Virtual for virtualization
$3
#### Score Color Scheme
Scores are color-coded using a translucent gradient:
- 1-2.5: White with gray border
- 2.6-5: Translucent yellow
- 5.1-7.5: Translucent orange
- 7.6-10: Translucent red

The color logic is centralized in src/lib/shared/score-utils.ts.

`$3`