# whisper-web-transcriber

Real-time audio transcription in the browser using OpenAI's Whisper model via WebAssembly. This package provides an easy-to-use API for integrating speech-to-text capabilities into web applications without any server-side processing.
Live Demo 🎙️ | Live Usage on Real Site 🚀
## Features
- 🎙️ Real-time audio transcription from microphone
- 🌐 Runs entirely in the browser (no server required)
- 📦 Multiple Whisper model options (tiny, base, quantized versions)
- 💾 Automatic model caching in IndexedDB
- 🔧 Simple, promise-based API
- 📱 Works on all modern browsers with WebAssembly support
- 🌍 Platform-independent (same WASM works on all OS)
## Installation

```bash
npm install whisper-web-transcriber
```

Or using yarn:

```bash
yarn add whisper-web-transcriber
```

### CDN

Or load the bundled build directly from a CDN such as unpkg (the exact path depends on the published dist layout):

```html
<script src="https://unpkg.com/whisper-web-transcriber/dist/index.bundled.min.js"></script>
```
## Quick Start

### Using NPM
```javascript
import { WhisperTranscriber } from 'whisper-web-transcriber';

const transcriber = new WhisperTranscriber({
  modelSize: 'base-en-q5_1',
  onTranscription: (text) => {
    console.log('Transcribed:', text);
  }
});
await transcriber.loadModel();
await transcriber.startRecording();
```

### Using CDN

Load the bundled script and access the constructor from the `WhisperTranscriber` global (a minimal sketch; the CDN path is illustrative):

```html
<script src="https://unpkg.com/whisper-web-transcriber/dist/index.bundled.min.js"></script>
<script>
  const transcriber = new WhisperTranscriber.WhisperTranscriber({
    modelSize: 'base-en-q5_1',
    onTranscription: (text) => console.log('Transcribed:', text)
  });
</script>
```

## API Reference

### Configuration Options
```typescript
interface WhisperConfig {
  modelUrl?: string;          // Custom model URL (optional)
  modelSize?: 'tiny.en' | 'base.en' | 'tiny-en-q5_1' | 'base-en-q5_1';
  sampleRate?: number;        // Audio sample rate (default: 16000)
  audioIntervalMs?: number;   // Audio processing interval (default: 5000 ms)
  onTranscription?: (text: string) => void;
  onProgress?: (progress: number) => void;
  onStatus?: (status: string) => void;
  debug?: boolean;            // Enable debug logging (default: false)
}
```
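As an illustration, a configuration object exercising every option might look like this; `sampleRate`, `audioIntervalMs`, and `debug` show the documented defaults, the other values are example choices (the progress value's scale is not specified above, so it is logged as-is):

```javascript
// Illustrative WhisperConfig: defaults shown where documented.
const config = {
  modelSize: 'tiny-en-q5_1',   // smallest quantized model (31 MB)
  sampleRate: 16000,           // default
  audioIntervalMs: 5000,       // default: process audio every 5 s
  debug: false,                // default
  onTranscription: (text) => console.log('Transcribed:', text),
  onProgress: (progress) => console.log('Model download progress:', progress),
  onStatus: (status) => console.log('Status:', status),
};

// const transcriber = new WhisperTranscriber(config);
```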
### Methods

- `loadModel(): Promise<void>` - Downloads and initializes the Whisper model
- `startRecording(): Promise<void>` - Starts microphone recording and transcription
- `stopRecording(): void` - Stops recording
- `destroy(): void` - Cleans up resources
- `getServiceWorkerCode(): string | null` - Returns the COI service worker code (bundled version only)
- `getCrossOriginIsolationInstructions(): string` - Returns setup instructions for Cross-Origin Isolation

## Model Options
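For push-to-talk style usage, the methods above can be combined into a small wrapper; `recordFor` below is a hypothetical helper, not part of the package API:

```javascript
// Hypothetical helper: record for a fixed duration, then stop.
// `transcriber` is any object exposing the startRecording/stopRecording
// methods documented above; transcription arrives via onTranscription.
async function recordFor(transcriber, ms) {
  await transcriber.startRecording();
  await new Promise((resolve) => setTimeout(resolve, ms));
  transcriber.stopRecording();
}

// Usage (after loadModel() has resolved):
// await recordFor(transcriber, 10_000); // record for 10 seconds
```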
| Model | Size | Description |
|-------|------|-------------|
| `tiny.en` | 75 MB | Fastest, lower accuracy |
| `base.en` | 142 MB | Better accuracy, slower |
| `tiny-en-q5_1` | 31 MB | Quantized tiny model, smaller size |
| `base-en-q5_1` | 57 MB | Quantized base model, good balance |

## Browser Requirements
- WebAssembly support
- SharedArrayBuffer support (requires Cross-Origin Isolation)
- Microphone access permission
- Modern browser (Chrome 90+, Firefox 89+, Safari 15+, Edge 90+)
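A quick capability check before constructing the transcriber can produce friendlier error messages; this helper is illustrative, not part of the package:

```javascript
// Returns a list of missing requirements (empty when the environment looks
// suitable). The scope is a parameter so it can be tested; pass `window`
// in a browser.
function checkBrowserSupport(scope) {
  const missing = [];
  if (typeof scope.WebAssembly === 'undefined') missing.push('WebAssembly');
  if (typeof scope.SharedArrayBuffer === 'undefined') missing.push('SharedArrayBuffer (requires Cross-Origin Isolation)');
  if (scope.crossOriginIsolated === false) missing.push('Cross-Origin Isolation (COOP/COEP headers)');
  if (!scope.navigator || !scope.navigator.mediaDevices) missing.push('Microphone access (getUserMedia)');
  return missing;
}

// In the browser:
// const missing = checkBrowserSupport(window);
// if (missing.length) console.warn('Unsupported:', missing.join(', '));
```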
## Cross-Origin Isolation Setup
WhisperTranscriber requires SharedArrayBuffer, which needs Cross-Origin Isolation. You have two options:
### Option 1: Server Headers
Configure your server to send these headers:
```
Cross-Origin-Embedder-Policy: require-corp
Cross-Origin-Opener-Policy: same-origin
```
### Option 2: Service Worker

If you can't modify server headers, use the included service worker.

**For NPM users:**
```html
<!-- Serve the service worker from your own origin (e.g. copied out of
     node_modules/whisper-web-transcriber) and include it before other scripts -->
<script src="coi-serviceworker.js"></script>
```

**For CDN users:**
```javascript
// Get the service worker code
const transcriber = new WhisperTranscriber.WhisperTranscriber();
const swCode = transcriber.getServiceWorkerCode();

// Save swCode as 'coi-serviceworker.js' on YOUR domain
// Then include it in your HTML:
// <script src="coi-serviceworker.js"></script>
```

**Important:** Service workers must be served from the same origin as your page. CDN users cannot directly use the service worker from unpkg.
### Development and Deployment

For local development:
```bash
npm run demo
```

For production (examples):
**Vercel** (`vercel.json`):

```json
{
  "headers": [
    {
      "source": "/(.*)",
      "headers": [
        {
          "key": "Cross-Origin-Embedder-Policy",
          "value": "require-corp"
        },
        {
          "key": "Cross-Origin-Opener-Policy",
          "value": "same-origin"
        }
      ]
    }
  ]
}
```

**Nginx:**

```nginx
add_header Cross-Origin-Embedder-Policy "require-corp" always;
add_header Cross-Origin-Opener-Policy "same-origin" always;
```

## Complete Examples

### NPM Version

A minimal page sketch (the build setup and element IDs are illustrative):

```html
<!DOCTYPE html>
<html>
<head>
  <title>Whisper Transcriber - NPM Version</title>
</head>
<body>
  <pre id="output"></pre>
  <script type="module">
    import { WhisperTranscriber } from 'whisper-web-transcriber';

    const transcriber = new WhisperTranscriber({
      modelSize: 'base-en-q5_1',
      onTranscription: (text) => {
        document.getElementById('output').textContent += text + '\n';
      }
    });

    await transcriber.loadModel();
    await transcriber.startRecording();
  </script>
</body>
</html>
```

### CDN Version

The same page using the bundled build from a CDN (script path illustrative):

```html
<!DOCTYPE html>
<html>
<head>
  <title>Whisper Transcriber - CDN Version</title>
</head>
<body>
  <pre id="output"></pre>
  <script src="https://unpkg.com/whisper-web-transcriber/dist/index.bundled.min.js"></script>
  <script>
    const transcriber = new WhisperTranscriber.WhisperTranscriber({
      modelSize: 'base-en-q5_1',
      onTranscription: (text) => {
        document.getElementById('output').textContent += text + '\n';
      }
    });

    transcriber.loadModel().then(() => transcriber.startRecording());
  </script>
</body>
</html>
```
## Bundled vs Standard Version

### Bundled Version (`index.bundled.min.js`)
- ✅ Single file - All workers and dependencies included
- ✅ CDN-friendly - No CORS issues with web workers
- ✅ Zero configuration - Works out of the box (except for Cross-Origin Isolation)
- ❌ Larger initial download - ~220KB uncompressed, ~95KB minified
- 📦 Best for: Quick prototypes, CDN usage, simple deployments

### Standard Version
- ✅ Smaller initial size - Core library only
- ✅ Modular loading - Workers loaded on demand
- ❌ Requires all files - Must serve worker files from same origin
- ❌ More complex setup - Need to copy files from node_modules
- 📦 Best for: Production apps with bundlers, optimized loading

## Performance Considerations
- Transcription is CPU-intensive
- Larger models provide better accuracy but require more processing power
- Quantized models (Q5_1) offer good balance between size and quality
- First-time model loading may take time (models are cached afterward)
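Since the first download of a 31-142 MB model can emit many `onProgress` events, a small throttle keeps status updates cheap; this helper is illustrative, not part of the package, and assumes a 0-100 progress scale:

```javascript
// Wraps a reporting function so it fires only every `minStepPct` percent
// of progress (and always at 100%).
function throttleProgress(report, minStepPct = 5) {
  let last = -Infinity;
  return (pct) => {
    if (pct - last >= minStepPct || pct >= 100) {
      last = pct;
      report(pct);
    }
  };
}

// Usage:
// onProgress: throttleProgress((pct) => { statusEl.textContent = `Loading model: ${pct}%`; })
```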
## Troubleshooting

### "SharedArrayBuffer is not defined"

You need to enable Cross-Origin Isolation. See the Cross-Origin Isolation Setup section.

### Worker loading or CORS errors

Use the bundled version (`index.bundled.min.js`) instead of the standard version.

Built using:
- whisper.cpp compiled to WebAssembly
- Web Audio API for microphone access
- IndexedDB for model caching
- Service Worker for Cross-Origin Isolation
## License

MIT
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
## Acknowledgments

- whisper.cpp by Georgi Gerganov
- OpenAI Whisper for the original model