
This library is now deprecated due to Microsoft's release of an official websocket supported NodeJS/JavaScript SDK for Microsoft Speech Service. Please use that instead of this. Thanks! 🙇🏼♀️
---
Run `npm install ms-bing-speech-service` from your project directory, then start a recognizer and send it an audio file:
```js
const speechService = require('ms-bing-speech-service');

const options = {
  language: 'en-US',
  subscriptionKey: ''
};

const recognizer = new speechService(options);

recognizer
  .start()
  .then(_ => {
    // log each successfully recognized phrase
    recognizer.on('recognition', (e) => {
      if (e.RecognitionStatus === 'Success') console.log(e);
    });

    recognizer.sendFile('future-of-flying.wav')
      .then(_ => console.log('file sent.'))
      .catch(console.error);
  })
  .catch(console.error);
```
You can also use this library with the async/await pattern!
```js
const speechService = require('ms-bing-speech-service');

(async function() {
  const options = {
    language: 'en-US',
    subscriptionKey: ''
  };

  const recognizer = new speechService(options);
  await recognizer.start();

  recognizer.on('recognition', (e) => {
    if (e.RecognitionStatus === 'Success') console.log(e);
  });

  // stop the recognizer once the whole turn has been processed
  recognizer.on('turn.end', async (e) => {
    console.log('recognizer is finished.');
    await recognizer.stop();
    console.log('recognizer is stopped.');
  });

  await recognizer.sendFile('future-of-flying.wav');
  console.log('file sent.');
})();
```
The library also works in the browser, where a global (window) distribution is available in the `dist` directory. Pass an ArrayBuffer instance in place of a file path:
```js
import speechService from 'MsBingSpeechService';

const file = myArrayBuffer;

const options = {
  language: 'en-US',
  subscriptionKey: ''
};

const recognizer = new speechService(options);

recognizer.start()
  .then(_ => {
    console.log('service started');

    recognizer.on('recognition', (e) => {
      if (e.RecognitionStatus === 'Success') console.log(e);
    });

    recognizer.sendFile(file);
  }).catch((error) => console.error('could not start service:', error));
```
The above examples will use your subscription key to create an access token with Microsoft's service.
In some instances you may not want to share your subscription key directly with your application. If you're creating an app with multiple users, you may want to issue access tokens from an external API so each user can connect to the speech service without exposing your subscription key.
To do this, replace "subscriptionKey" in the above code example with "accessToken" and pass in the provided token.
```js
const options = {
  language: 'en-US',
  accessToken: ''
};
```
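As a rough sketch of the server side of that arrangement, the helper below only builds the HTTP request an external API might send to exchange a subscription key for a token; it does not send anything. The function name `buildIssueTokenRequest` is hypothetical, the default URL is the `issueTokenUrl` shown in the Custom Speech example in this README, and the `Ocp-Apim-Subscription-Key` header is an assumption based on how Microsoft's token endpoint is generally documented, so check the official docs before relying on it.

```js
// Hypothetical helper: describes the token request without sending it.
function buildIssueTokenRequest(subscriptionKey, issueTokenUrl) {
  if (!subscriptionKey) {
    throw new Error('a subscription key is required to request a token');
  }
  return {
    // Assumed default; the region in this URL may differ for your subscription.
    url: issueTokenUrl || 'https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken',
    method: 'POST',
    headers: {
      // Assumed header name. The key stays on your server and never reaches users.
      'Ocp-Apim-Subscription-Key': subscriptionKey
    }
  };
}
```

Your API would send this request, read the token from the response, and hand it to the client to pass as `accessToken`.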
### Custom Speech Service
This library also works with Custom Speech Service, though your options object needs a few more details.
Your subscriptionKey is the key displayed on your custom endpoint deployment page in the Custom Speech Management Portal. The same page lists the websocket endpoints you can choose from.
The following code will get you up and running with the Custom Speech Service:
```js
const speechService = require('ms-bing-speech-service');

const options = {
  subscriptionKey: '',
  serviceUrl: 'wss://.api.cris.ai/speech/recognition/conversation/cognitiveservices/v1',
  issueTokenUrl: 'https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken'
};

const recognizer = new speechService(options);

recognizer
  .start()
  .then(_ => {
    recognizer.on('recognition', (e) => {
      if (e.RecognitionStatus === 'Success') console.log(e);
    });

    recognizer.sendFile('future-of-flying.wav');
  }).catch(console.error);
```
See the API section of these docs for details on configuration and methods.
## API Reference
### Methods
### SpeechService(options)
+ options _Object_
+ Returns SpeechService
Creates a new instance of SpeechService.
```js
const recognizer = new SpeechService(options);
```
Available options are below:
| name | type | description | default | required |
|-----------------|--------|-------------|---------|----------|
| subscriptionKey | String | your Speech API key | n/a | yes, unless accessToken is supplied |
| accessToken | String | your Speech access token. Only required if the subscriptionKey option is not supplied. | n/a | no |
| language | String | the language of the audio you want to recognize. See supported languages in the official Microsoft Speech API docs. | 'en-US' | no |
| mode | String | the recognition mode to use: interactive, conversation, or dictation | 'conversation' | no |
| format | String | the format in which the speech-to-text results are returned: simple or detailed | 'simple' | no |
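As a small illustration of how these options fit together, the sketch below merges user options with the defaults from the table and rejects combinations the table rules out. The function `buildOptions` is hypothetical and is not part of this library; the library presumably applies its own defaults, so treat this only as an optional pre-flight check.

```js
// Defaults and allowed values copied from the options table above.
const DEFAULTS = { language: 'en-US', mode: 'conversation', format: 'simple' };
const MODES = ['interactive', 'conversation', 'dictation'];
const FORMATS = ['simple', 'detailed'];

// Hypothetical helper: returns a complete options object or throws.
function buildOptions(userOptions = {}) {
  const options = { ...DEFAULTS, ...userOptions };
  if (!options.subscriptionKey && !options.accessToken) {
    throw new Error('supply either subscriptionKey or accessToken');
  }
  if (!MODES.includes(options.mode)) {
    throw new Error(`unknown mode: ${options.mode}`);
  }
  if (!FORMATS.includes(options.format)) {
    throw new Error(`unknown format: ${options.format}`);
  }
  return options;
}
```

The result can be passed straight to the constructor, for example `new SpeechService(buildOptions({ accessToken: token }))`.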
### recognizer.start()
Connects to the Speech API websocket on your behalf. Returns a promise.
```js
recognizer.start().then(() => {
  console.log('recognizer service started.');
}).catch(console.error);
```
### recognizer.stop()
Disconnects from the established websocket connection to the Speech API. Returns a promise.
```js
recognizer.stop().then(() => {
  console.log('recognizer service stopped.');
}).catch(console.error);
```
### recognizer.sendStream(stream)
+ stream _Readable Stream_
Sends an audio payload stream to the Speech API websocket connection. The audio payload is a native NodeJS readable stream of Buffers, or an ArrayBuffer in the browser. Returns a promise.
See the 'Sending Audio' section of the official Speech API docs for details on the data format needed.
NodeJS example:
```js
const fs = require('fs');

// fs.createReadStream returns a readable stream of the file's contents
const audioStream = fs.createReadStream('speech.wav');

// register the listener before sending so no results are missed
recognizer.on('recognition', (message) => {
  console.log('new recognition:', message);
});

recognizer.sendStream(audioStream).then(() => {
  console.log('stream sent.');
}).catch(console.error);
```
### recognizer.sendFile(filepath)
+ filepath _String_
Streams an audio file from disk to the Speech API websocket connection. Also accepts a NodeJS Buffer or browser ArrayBuffer. Returns a promise.
See the 'Sending Audio' section of the official Speech API docs for details on the data format needed for the audio file.
```js
recognizer.sendFile('/path/to/audiofile.wav').then(() => {
  console.log('file sent.');
}).catch(console.error);
```
or, with an ArrayBuffer fetched in the browser:
```js
fetch('speech.wav')
  .then((response) => response.arrayBuffer())
  .then((audioBuffer) => recognizer.sendFile(audioBuffer))
  .then(() => console.log('file sent'))
  .catch((error) => console.error('something went wrong:', error));
```
### Events
You can listen to the following events on the recognizer instance:
### recognizer.on('recognition', callback)
+ callback _Function_
Event listener for incoming recognition message payloads from the Speech API. Message payload is a JSON object.
```js
recognizer.on('recognition', (message) => {
  console.log('new recognition:', message);
});
```
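To pull just the transcript out of a recognition payload, something like the sketch below may help. Only `RecognitionStatus` appears in this README; the `DisplayText` field (simple format) and the `NBest` array with `Confidence` and `Display` fields (detailed format) are assumptions about the Speech API response shape, and `extractText` is a hypothetical helper name.

```js
// Hypothetical helper: returns the recognized text, or null if there is none.
function extractText(message) {
  if (!message || message.RecognitionStatus !== 'Success') return null;

  // Assumed shape of the 'simple' format: a single DisplayText string.
  if (typeof message.DisplayText === 'string') return message.DisplayText;

  // Assumed shape of the 'detailed' format: candidates ranked by Confidence.
  if (Array.isArray(message.NBest) && message.NBest.length > 0) {
    const best = message.NBest.reduce((a, b) => (b.Confidence > a.Confidence ? b : a));
    return best.Display || null;
  }
  return null;
}
```

Inside a listener this could be used as `recognizer.on('recognition', (message) => console.log(extractText(message)));`.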
### recognizer.on('close', callback)
+ callback _Function_
Event listener for Speech API websocket connection closures.
```js
recognizer.on('close', (error) => {
  console.log('Speech API connection closed');
  // you can optionally look for an error object (most closures currently
  // report a 1006 even when the closure is intentional, but we're looking into it!)
  console.log(error);
});
```
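Because the reported code may not distinguish intentional closures from unexpected ones, one possible workaround is to remember when your own code asked for the stop. The sketch below assumes only the `on('close', ...)` and `stop()` calls documented in this README; `trackClosures` and its callback are hypothetical names.

```js
// Hypothetical helper: calls onUnexpectedClose only for closures
// that were not requested through the returned stop() wrapper.
function trackClosures(recognizer, onUnexpectedClose) {
  let stopping = false;

  recognizer.on('close', (error) => {
    if (!stopping) onUnexpectedClose(error);
    stopping = false; // reset so later closures are judged on their own
  });

  return {
    stop() {
      stopping = true; // mark the upcoming closure as intentional
      return recognizer.stop();
    }
  };
}
```

Call the returned `stop()` instead of `recognizer.stop()` so the flag is set before the connection closes.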
### recognizer.on('error', callback)
+ callback _Function_
Event listener for incoming Speech API websocket connection errors.
```js
recognizer.on('error', (error) => {
  console.log(error);
});
```
### recognizer.on('turn.start', callback)
+ callback _Function_
Event listener for Speech API websocket 'turn.start' event. Fires when the service detects an audio stream.
```js
recognizer.on('turn.start', () => {
  console.log('start turn has fired.');
});
```
### recognizer.on('turn.end', callback)
+ callback _Function_
Event listener for Speech API websocket 'turn.end' event. Fires after the 'speech.endDetected' event, once the turn has ended. This is the ideal event to listen for when you want to know that an entire stream of audio has been processed and all results have been received.
```js
recognizer.on('turn.end', () => {
  console.log('end turn has fired.');
});
```
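One way to act on a completed turn is to gather the successful results as they arrive and hand them over when 'turn.end' fires. The sketch below is an illustration only: `onTurnComplete` is a hypothetical helper, and a plain NodeJS `EventEmitter` stands in for the recognizer so the example runs without a network connection.

```js
const { EventEmitter } = require('events');

// Hypothetical helper: collects successful recognitions and passes
// them to the callback each time a turn ends.
function onTurnComplete(recognizer, callback) {
  let results = [];

  recognizer.on('recognition', (message) => {
    if (message.RecognitionStatus === 'Success') results.push(message);
  });

  recognizer.on('turn.end', () => {
    const finished = results;
    results = []; // start fresh for the next turn
    callback(finished);
  });
}

// Stand-in for a started recognizer (assumption: it emits the same events).
const fakeRecognizer = new EventEmitter();
const turns = [];
onTurnComplete(fakeRecognizer, (results) => turns.push(results));

fakeRecognizer.emit('recognition', { RecognitionStatus: 'Success' });
fakeRecognizer.emit('recognition', { RecognitionStatus: 'NoMatch' });
fakeRecognizer.emit('turn.end');

console.log(turns[0].length); // prints 1
```

A callback is used rather than a promise because a recognizer may process several turns, while a promise settles only once.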
### recognizer.on('speech.startDetected', callback)
+ callback _Function_
Event listener for Speech API websocket 'speech.startDetected' event. Fires when the service has first detected speech in the audio stream.
```js
recognizer.on('speech.startDetected', () => {
  console.log('speech startDetected has fired.');
});
```
### recognizer.on('speech.endDetected', callback)
+ callback _Function_
Event listener for Speech API websocket 'speech.endDetected' event. Fires when the service can no longer detect speech in the audio stream.
```js
recognizer.on('speech.endDetected', () => {
  console.log('speech endDetected has fired.');
});
```
### recognizer.on('speech.phrase', callback)
+ callback _Function_
Identical to the recognition event. Event listener for incoming recognition message payloads from the Speech API. Message payload is a JSON object.
```js
recognizer.on('speech.phrase', (message) => {
  console.log('new phrase:', message);
});
```
### recognizer.on('speech.hypothesis', callback)
+ callback _Function_
Event listener for Speech API websocket 'speech.hypothesis' event. Only fires when using interactive mode. Contains incomplete recognition results. This event will fire often - beware!
```js
recognizer.on('speech.hypothesis', (message) => {
  console.log('new hypothesis:', message);
});
```
### recognizer.on('speech.fragment', callback)
+ callback _Function_
Event listener for Speech API websocket 'speech.fragment' event. Only fires when using dictation mode. Contains incomplete recognition results. This event will fire often - beware!
```js
recognizer.on('speech.fragment', (message) => {
  console.log('new fragment:', message);
});
```
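Since the partial-result event depends on the recognition mode, a small lookup can keep that choice in one place. The sketch below follows the mode-to-event pairing described in this README ('speech.hypothesis' for interactive, 'speech.fragment' for dictation); `onPartialResult` is a hypothetical helper, not part of the library.

```js
// Pairing taken from the event descriptions above.
const PARTIAL_EVENT_BY_MODE = new Map([
  ['interactive', 'speech.hypothesis'],
  ['dictation', 'speech.fragment']
]);

// Hypothetical helper: subscribes to the partial-result event for the
// given mode. Returns false when the mode has no documented partial event.
function onPartialResult(recognizer, mode, callback) {
  const eventName = PARTIAL_EVENT_BY_MODE.get(mode);
  if (!eventName) return false;
  recognizer.on(eventName, callback);
  return true;
}
```

For example, `onPartialResult(recognizer, options.mode, (message) => console.log(message))` registers the listener that matches whichever mode the recognizer was created with.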