![NPM Version](https://www.npmjs.com/package/@transcribe/transcriber)

Transcribe.js

Transcribe speech to text in the browser. Based on a wasm build of whisper.cpp.

Note: This package is browser only. Node.js is not supported. (see this discussion for details)

- Docs
- Example File Transcriber
- Example Stream Transcriber (experimental)
- Code Examples

Packages

All packages are published under the @transcribe namespace.

| Package | Description |
| --------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------- |
| @transcribe/shout | Wasm build based off whisper.cpp. shout.wasm.js contains the wasm binary and the worker file. |
| @transcribe/transcriber | FileTranscriber and StreamTranscriber for transcribing media files or streams. |

Prerequisite


Your webserver must serve the files with the following cross-origin headers:

"Cross-Origin-Embedder-Policy": "require-corp"
"Cross-Origin-Opener-Policy": "same-origin"

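As a minimal sketch of how a server could set these headers, the hypothetical helper below applies them to any response object that exposes a `setHeader(name, value)` method, such as Node's `http.ServerResponse`. The helper name is an assumption made for illustration and is not part of Transcribe.js; adapt it to your own webserver or hosting configuration.

```js
// Header names and values are taken from the list above.
const CROSS_ORIGIN_HEADERS = {
  "Cross-Origin-Embedder-Policy": "require-corp",
  "Cross-Origin-Opener-Policy": "same-origin",
};

// Hypothetical helper (not part of Transcribe.js): apply the headers to a
// response object that provides setHeader(name, value).
function applyCrossOriginHeaders(res) {
  for (const [name, value] of Object.entries(CROSS_ORIGIN_HEADERS)) {
    res.setHeader(name, value);
  }
  return res;
}
```

Calling `applyCrossOriginHeaders(res)` at the start of a request handler, before the response body is written, is one way to make sure every served file carries both headers.
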

Your browser must support SharedArrayBuffer. (browser support)

The default wasm files are built with SIMD enabled. If your browser or device doesn't support SIMD, use the no-simd files instead. Also check out the example code to see how to use them. (browser support)
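
Both requirements can be checked at runtime before loading a build. The following is a sketch under stated assumptions: the helper names are hypothetical and not part of Transcribe.js, and the probe bytes are assumed to match the small SIMD test module popularized by the wasm-feature-detect project. The two file names are the ones used in the installation steps of this README.

```js
// Tiny wasm module containing SIMD instructions. WebAssembly.validate()
// accepts it only when the engine supports SIMD.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0, 1, 5, 1, 96, 0, 1, 123, 3, 2, 1, 0, 10, 10, 1,
  8, 0, 65, 0, 253, 15, 253, 98, 11,
]);

// Hypothetical helper: true when SharedArrayBuffer is exposed to the page.
function supportsSharedArrayBuffer() {
  return typeof SharedArrayBuffer !== "undefined";
}

// Hypothetical helper: true when the engine validates the SIMD probe module.
function supportsSimd() {
  return typeof WebAssembly === "object" && WebAssembly.validate(SIMD_PROBE);
}

// Hypothetical helper: pick the build file name based on SIMD support.
function pickShoutBuild(simd = supportsSimd()) {
  return simd ? "shout.wasm.js" : "shout.wasm_no-simd.js";
}
```

A page could call `supportsSharedArrayBuffer()` first and show a message if it returns `false`, then use `pickShoutBuild()` to decide which file to import.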


You need a ggml model file to run Transcribe.js. You can download one from Hugging Face: https://huggingface.co/ggerganov/whisper.cpp/tree/main . Start with the (quantized) tiny or base models. Larger models probably won't work, but you can still try them.

Installation

Svelte
SvelteKit
Vue
Angular
Other


Install the shout wasm and transcriber packages:

```bash
npm install --save @transcribe/transcriber
```

The shout.wasm files must be accessible and served by your webserver. Depending on your project setup, you may need to copy them from node_modules to your public directory.

```bash
# copy shout wasm
cp node_modules/@transcribe/shout/src/shout/shout.wasm.js /your/project

# optional: copy no-simd build
cp node_modules/@transcribe/shout/src/shout/shout.wasm_no-simd.js /your/project

# optional: copy audio-worklets, only needed if you want to use StreamTranscriber
cp -r node_modules/@transcribe/transcriber/src/audio-worklets /your/project
```


You can use Transcribe.js without a bundler or package manager. Download the files from this repository, copy the src/* directories to your webserver, and include them in your HTML through an import map. Make sure to set the correct paths in the import map.

Usage

For full code examples and advanced usage please see https://www.transcribejs.dev or check out the File Transcriber Example.

```js
import createModule from "@transcribe/shout"; // if you use an import map or a bundler like vite
// import createModule from "/your/project/shout.wasm.js"; // you can also exclude @transcribe/shout from your bundler and import manually
import { FileTranscriber } from "@transcribe/transcriber";

// create new instance
const transcriber = new FileTranscriber({
  createModule, // create module function from emscripten build
  model: "/your/project/ggml-tiny-q5_1.bin", // path to ggml model file
});

// init wasm transcriber worker
await transcriber.init();

// transcribe audio/video file
const result = await transcriber.transcribe("/your/project/my.mp3");

console.log(result);
```

The result is a JSON object containing the text segments and timestamps.

```js
{
  "result": {
    "language": "en"
  },
  "transcription": [
    {
      "timestamps": {
        "from": "00:00:00,000",
        "to": "00:00:11,000"
      },
      "offsets": {
        "from": 0,
        "to": 11000
      },
      "text": " And so my fellow Americans ask not what your country can do for you, ask what you can do for your country.",
      "tokens": [
        {
          "text": " And",
          "timestamps": {
            "from": "00:00:00,320",
            "to": "00:00:00,350"
          },
          "offsets": {
            "from": 320,
            "to": 350
          },
          "id": 400,
          "p": 0.726615 // probability, i.e. how likely the estimate is correct, 0..1, 1 is best
        },
        // ... one token per word
      ]
    }
  ]
}
```
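
To illustrate how such a result might be consumed, here is a small sketch that joins the segments into plain text and lists tokens below a probability threshold. The helper names and the threshold values are assumptions for illustration, and the sample object is a shortened copy of the result shown above with a single token.

```js
// Hypothetical helper: join all segment texts into one plain-text transcript.
function toPlainText(result) {
  return result.transcription.map((segment) => segment.text.trim()).join(" ");
}

// Hypothetical helper: collect tokens whose probability `p` is below a threshold.
function lowConfidenceTokens(result, threshold = 0.5) {
  return result.transcription
    .flatMap((segment) => segment.tokens ?? [])
    .filter((token) => token.p < threshold)
    .map((token) => ({
      text: token.text.trim(),
      from: token.offsets.from,
      p: token.p,
    }));
}

// Shortened sample in the shape of the result above (one segment, one token).
const sample = {
  result: { language: "en" },
  transcription: [
    {
      timestamps: { from: "00:00:00,000", to: "00:00:11,000" },
      offsets: { from: 0, to: 11000 },
      text: " And so my fellow Americans ask not what your country can do for you, ask what you can do for your country.",
      tokens: [
        {
          text: " And",
          timestamps: { from: "00:00:00,320", to: "00:00:00,350" },
          offsets: { from: 320, to: 350 },
          id: 400,
          p: 0.726615,
        },
      ],
    },
  ],
};

console.log(toPlainText(sample));
console.log(lowConfidenceTokens(sample, 0.8));
```

Trimming each `text` value removes the leading space that segments and tokens carry in the output above.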

Development

Install Emscripten and its required tools.

Clone the repository, install dependencies, start the dev server and open http://localhost:9876/examples/index.html in your browser.

```bash
git clone https://github.com/transcribejs/transcribe.js
cd transcribe.js
npm install
npm run dev
```


The library is not written in TypeScript. This way no extra build step is needed during development or in production.

To still provide proper type support, type definitions are generated from JSDoc comments.

```bash
npm run generate-types
```
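
For readers unfamiliar with this approach, the sketch below shows what a JSDoc-annotated function can look like. The function is hypothetical and not part of the library; it converts a millisecond offset into the timestamp format used in the transcription result. Tools such as the TypeScript compiler can read annotations like these to produce type definitions. How `generate-types` is configured in this repository is not shown here.

```js
/**
 * Convert a millisecond offset into an "HH:MM:SS,mmm" timestamp.
 * Hypothetical example function, not part of Transcribe.js.
 *
 * @param {number} offsetMs - Offset in milliseconds (non-negative integer).
 * @returns {string} Timestamp such as "00:00:11,000".
 */
function formatTimestamp(offsetMs) {
  const pad = (value, length) => String(value).padStart(length, "0");
  const hours = Math.floor(offsetMs / 3_600_000);
  const minutes = Math.floor((offsetMs % 3_600_000) / 60_000);
  const seconds = Math.floor((offsetMs % 60_000) / 1000);
  const millis = offsetMs % 1000;
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(millis, 3)}`;
}
```

The `@param` and `@returns` tags carry the same information a TypeScript signature would, so editors can offer type hints without a compile step.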


The whisper.cpp repository is a git submodule. To get the latest version of whisper.cpp, go into its directory and pull the latest changes from GitHub.

```bash
cd shout.wasm/whisper.cpp
git pull origin master
```

The wasm files are built from shout.wasm/src/shout.wasm.cpp. If you want to expose additional functions from whisper.cpp in the wasm build, this is the file to add them to.

> I'm pretty sure that this will not compile on every machine/architecture, but I'm no expert in C++. If you know how to optimize the build process, please let me know or create a pull request. Maybe this should be dockerized?

```bash
# run cmake to build wasm
npm run wasm:build

# copy emscripten build files to project
npm run wasm:copy
```


Unit/functional tests for the Transcriber functions.

```bash
npm run test:unit
```

E2E tests using Playwright. Firefox is considerably slower during the E2E tests than it is when used as a regular browser.

```bash
npm run test:e2e
```

Or use the Playwright UI for details:

```bash
npm run test:e2e-ui
```

Credits

Many thanks to the people who supported this project, be it through code, ideas or general testing. I appreciate your time and effort.

- @MarketingPip - testing on older devices


Also thank you to the creators and contributors of the following open source libraries that were used in this project:

- whisper.cpp: A C++ implementation of whisper. GitHub Repository
- emscripten: A toolchain for compiling C and C++ code to WebAssembly. Official Site
- water.css: A minimal CSS framework for styling HTML. Official Site
- fft.js: A library for Fast Fourier Transform calculations. GitHub Repository
- Moattar, Mohammad & Homayoonpoor, Mahdi. (2010). A simple but efficient real-time voice activity detection algorithm. Research Paper
- vitest: A testing framework for JavaScript. Official Site
- Playwright: A tool for automating browser testing. Official Site

Example audio files:

- `examples/albert.ogg`: Radio Universidad Nacional de La Plata, CC BY-SA 3.0, via Wikimedia Commons
- `examples/jfk.wav`: CC BY-SA 3.0, via Wikimedia Commons


This project is tested with BrowserStack
