# whisper-web-transcriber

Real-time audio transcription in the browser using OpenAI's Whisper model via WebAssembly. This package provides an easy-to-use API for integrating speech-to-text capabilities into web applications without any server-side processing.
Live Demo 🎙️ | Live Usage on Real Site 🚀
## Features
- 🎙️ Real-time audio transcription from microphone
- 🌐 Runs entirely in the browser (no server required)
- 📦 Multiple Whisper model options (tiny, base, quantized versions)
- 💾 Automatic model caching in IndexedDB
- 🔧 Simple, promise-based API
- 📱 Works on all modern browsers with WebAssembly support
- 🌍 Platform-independent (same WASM works on all OS)
## Installation

```bash
npm install whisper-web-transcriber
```

Or using yarn:

```bash
yarn add whisper-web-transcriber
```

### CDN

Or load the bundled build directly from a CDN such as unpkg (the exact path depends on the published dist layout):

```html
<script src="https://unpkg.com/whisper-web-transcriber/dist/index.bundled.min.js"></script>
```
## Quick Start

### Using NPM
```javascript
import { WhisperTranscriber } from 'whisper-web-transcriber';

const transcriber = new WhisperTranscriber({
  modelSize: 'base-en-q5_1',
  onTranscription: (text) => {
    console.log('Transcribed:', text);
  }
});
await transcriber.loadModel();
await transcriber.startRecording();
```

### Using CDN

Load the bundled script and access the constructor from the `WhisperTranscriber` global (a minimal sketch; the CDN path is illustrative):

```html
<script src="https://unpkg.com/whisper-web-transcriber/dist/index.bundled.min.js"></script>
<script>
  const transcriber = new WhisperTranscriber.WhisperTranscriber({
    modelSize: 'base-en-q5_1',
    onTranscription: (text) => console.log('Transcribed:', text)
  });
</script>
```

## API Reference

### Configuration Options
```typescript
interface WhisperConfig {
  modelUrl?: string;          // Custom model URL (optional)
  modelSize?: 'tiny.en' | 'base.en' | 'tiny-en-q5_1' | 'base-en-q5_1';
  sampleRate?: number;        // Audio sample rate (default: 16000)
  audioIntervalMs?: number;   // Audio processing interval (default: 5000 ms)
  onTranscription?: (text: string) => void;
  onProgress?: (progress: number) => void;
  onStatus?: (status: string) => void;
  debug?: boolean;            // Enable debug logging (default: false)
}
```
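As an illustration, a configuration object exercising every option might look like this; `sampleRate`, `audioIntervalMs`, and `debug` show the documented defaults, the other values are example choices (the progress value's scale is not specified above, so it is logged as-is):

```javascript
// Illustrative WhisperConfig: defaults shown where documented.
const config = {
  modelSize: 'tiny-en-q5_1',   // smallest quantized model (31 MB)
  sampleRate: 16000,           // default
  audioIntervalMs: 5000,       // default: process audio every 5 s
  debug: false,                // default
  onTranscription: (text) => console.log('Transcribed:', text),
  onProgress: (progress) => console.log('Model download progress:', progress),
  onStatus: (status) => console.log('Status:', status),
};

// const transcriber = new WhisperTranscriber(config);
```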
### Methods

- `loadModel(): Promise<void>` - Downloads and initializes the Whisper model
- `startRecording(): Promise<void>` - Starts microphone recording and transcription
- `stopRecording(): void` - Stops recording
- `destroy(): void` - Cleans up resources
- `getServiceWorkerCode(): string | null` - Returns the COI service worker code (bundled version only)
- `getCrossOriginIsolationInstructions(): string` - Returns setup instructions for Cross-Origin Isolation

## Model Options
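For push-to-talk style usage, the methods above can be combined into a small wrapper; `recordFor` below is a hypothetical helper, not part of the package API:

```javascript
// Hypothetical helper: record for a fixed duration, then stop.
// `transcriber` is any object exposing the startRecording/stopRecording
// methods documented above; transcription arrives via onTranscription.
async function recordFor(transcriber, ms) {
  await transcriber.startRecording();
  await new Promise((resolve) => setTimeout(resolve, ms));
  transcriber.stopRecording();
}

// Usage (after loadModel() has resolved):
// await recordFor(transcriber, 10_000); // record for 10 seconds
```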
| Model | Size | Description |
|-------|------|-------------|
| `tiny.en` | 75 MB | Fastest, lower accuracy |
| `base.en` | 142 MB | Better accuracy, slower |
| `tiny-en-q5_1` | 31 MB | Quantized tiny model, smaller size |
| `base-en-q5_1` | 57 MB | Quantized base model, good balance |

## Browser Requirements
- WebAssembly support
- SharedArrayBuffer support (requires Cross-Origin Isolation)
- Microphone access permission
- Modern browser (Chrome 90+, Firefox 89+, Safari 15+, Edge 90+)
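A quick capability check before constructing the transcriber can produce friendlier error messages; this helper is illustrative, not part of the package:

```javascript
// Returns a list of missing requirements (empty when the environment looks
// suitable). The scope is a parameter so it can be tested; pass `window`
// in a browser.
function checkBrowserSupport(scope) {
  const missing = [];
  if (typeof scope.WebAssembly === 'undefined') missing.push('WebAssembly');
  if (typeof scope.SharedArrayBuffer === 'undefined') missing.push('SharedArrayBuffer (requires Cross-Origin Isolation)');
  if (scope.crossOriginIsolated === false) missing.push('Cross-Origin Isolation (COOP/COEP headers)');
  if (!scope.navigator || !scope.navigator.mediaDevices) missing.push('Microphone access (getUserMedia)');
  return missing;
}

// In the browser:
// const missing = checkBrowserSupport(window);
// if (missing.length) console.warn('Unsupported:', missing.join(', '));
```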
## Cross-Origin Isolation Setup
WhisperTranscriber requires SharedArrayBuffer, which needs Cross-Origin Isolation. You have two options:
### Option 1: Server Headers
Configure your server to send these headers:
```
Cross-Origin-Embedder-Policy: require-corp
Cross-Origin-Opener-Policy: same-origin
```
### Option 2: Service Worker

If you can't modify server headers, use the included service worker.

**For NPM users:**
```html
<!-- Serve the service worker from your own origin (e.g. copied out of
     node_modules/whisper-web-transcriber) and include it before other scripts -->
<script src="coi-serviceworker.js"></script>
```

**For CDN users:**
```javascript
// Get the service worker code
const transcriber = new WhisperTranscriber.WhisperTranscriber();
const swCode = transcriber.getServiceWorkerCode();

// Save swCode as 'coi-serviceworker.js' on YOUR domain
// Then include it in your HTML:
// <script src="coi-serviceworker.js"></script>
```

**Important:** Service workers must be served from the same origin as your page. CDN users cannot directly use the service worker from unpkg.
### Development and Deployment

For local development:
```bash
npm run demo
```

For production (examples):
**Vercel** (`vercel.json`):

```json
{
  "headers": [
    {
      "source": "/(.*)",
      "headers": [
        {
          "key": "Cross-Origin-Embedder-Policy",
          "value": "require-corp"
        },
        {
          "key": "Cross-Origin-Opener-Policy",
          "value": "same-origin"
        }
      ]
    }
  ]
}
```

**Nginx:**

```nginx
add_header Cross-Origin-Embedder-Policy "require-corp" always;
add_header Cross-Origin-Opener-Policy "same-origin" always;
```

## Complete Examples

### NPM Version

A minimal page sketch (the build setup and element IDs are illustrative):

```html
<!DOCTYPE html>
<html>
<head>
  <title>Whisper Transcriber - NPM Version</title>
</head>
<body>
  <pre id="output"></pre>
  <script type="module">
    import { WhisperTranscriber } from 'whisper-web-transcriber';

    const transcriber = new WhisperTranscriber({
      modelSize: 'base-en-q5_1',
      onTranscription: (text) => {
        document.getElementById('output').textContent += text + '\n';
      }
    });

    await transcriber.loadModel();
    await transcriber.startRecording();
  </script>
</body>
</html>
```

### CDN Version

The same page using the bundled build from a CDN (script path illustrative):

```html
<!DOCTYPE html>
<html>
<head>
  <title>Whisper Transcriber - CDN Version</title>
</head>
<body>
  <pre id="output"></pre>
  <script src="https://unpkg.com/whisper-web-transcriber/dist/index.bundled.min.js"></script>
  <script>
    const transcriber = new WhisperTranscriber.WhisperTranscriber({
      modelSize: 'base-en-q5_1',
      onTranscription: (text) => {
        document.getElementById('output').textContent += text + '\n';
      }
    });

    transcriber.loadModel().then(() => transcriber.startRecording());
  </script>
</body>
</html>
```
## Bundled vs Standard Version

### Bundled Version (`index.bundled.min.js`)
- ✅ Single file - All workers and dependencies included
- ✅ CDN-friendly - No CORS issues with web workers
- ✅ Zero configuration - Works out of the box (except for Cross-Origin Isolation)
- ❌ Larger initial download - ~220KB uncompressed, ~95KB minified
- 📦 Best for: Quick prototypes, CDN usage, simple deployments

### Standard Version
- ✅ Smaller initial size - Core library only
- ✅ Modular loading - Workers loaded on demand
- ❌ Requires all files - Must serve worker files from same origin
- ❌ More complex setup - Need to copy files from node_modules
- 📦 Best for: Production apps with bundlers, optimized loading

## Performance Considerations
- Transcription is CPU-intensive
- Larger models provide better accuracy but require more processing power
- Quantized models (Q5_1) offer good balance between size and quality
- First-time model loading may take time (models are cached afterward)
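Since the first download of a 31-142 MB model can emit many `onProgress` events, a small throttle keeps status updates cheap; this helper is illustrative, not part of the package, and assumes a 0-100 progress scale:

```javascript
// Wraps a reporting function so it fires only every `minStepPct` percent
// of progress (and always at 100%).
function throttleProgress(report, minStepPct = 5) {
  let last = -Infinity;
  return (pct) => {
    if (pct - last >= minStepPct || pct >= 100) {
      last = pct;
      report(pct);
    }
  };
}

// Usage:
// onProgress: throttleProgress((pct) => { statusEl.textContent = `Loading model: ${pct}%`; })
```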
## Troubleshooting

### "SharedArrayBuffer is not defined"

You need to enable Cross-Origin Isolation. See the Cross-Origin Isolation Setup section.

### Worker loading or CORS errors

Use the bundled version (`index.bundled.min.js`) instead of the standard version.

Built using:
- whisper.cpp compiled to WebAssembly
- Web Audio API for microphone access
- IndexedDB for model caching
- Service Worker for Cross-Origin Isolation
## License

MIT
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
## Acknowledgments

- whisper.cpp by Georgi Gerganov
- OpenAI Whisper for the original model