A browser-compatible integration with the Deepgram Voice Agent API.
A pair of experimental web components for integrating with Deepgram's Voice Agent API in a browser environment.
- The agent component is an all-in-one web component that manages the microphone, websocket, and
animation. Add it to any page and get chatting!
- The hoop component is the animation, standalone. More useful when you've got your own rules for
socket integration, and just want the look and feel!
Install via GitHub by adding to your `package.json` dependencies:

```json
"@deepgram/browser-agent": "deepgram/browser-agent#main",
```
Import the library anywhere for the component to be registered to `deepgram-agent`:

```js
import "@deepgram/browser-agent";
```
Then, render it where you like!
```html
<deepgram-agent
  url="wss://agent.deepgram.com/v1/agent/converse"
  height="300"
  width="300"
  idle-timeout-ms="10000"
  output-sample-rate="24000"
></deepgram-agent>
```
Then add a config attribute after some user interaction to start a connection. See more in the
Attributes section.
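
As a minimal sketch of that flow, the helper below sets the `config` attribute from a click handler, so the connection attempt follows a user gesture. The helper name `connectOnClick`, the button, and the contents of `settings` are assumptions for illustration only; the real settings object should follow Deepgram's SettingsConfiguration docs.

```javascript
// Hypothetical helper (not part of the library): start a connection on click.
// `agentEl` is the <deepgram-agent> element, `button` is any clickable element,
// and `settings` is your SettingsConfiguration object (its shape is not checked here).
function connectOnClick(agentEl, button, settings) {
  button.addEventListener("click", () => {
    // Adding the attribute is what starts the connection attempt.
    agentEl.setAttribute("config", JSON.stringify(settings));
  });
}

// Usage in a page, assuming these elements and `mySettings` exist:
// connectOnClick(
//   document.querySelector("deepgram-agent"),
//   document.querySelector("#start"),
//   mySettings,
// );
```

Passing the elements in as arguments keeps the helper easy to exercise without a browser.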
### Attributes
- `config` (optional): stringified JSON of a `SettingsConfiguration` to send the API on initialization
  - Adding or removing the `config` attribute will start or stop (respectively) the WebSocket
    connection to the Deepgram API.
  - Because this web component directly manages the user's microphone, it requires a user action to
    attempt a connection. For that reason, you most likely want to first render the element
    _without_ a config.
  - For better early API flexibility, there is no validation. Use our docs to ensure your
    configuration matches.
    - SettingsConfiguration
  - Whenever `deepgram-agent` disconnects, unset the config and wait for another user interaction to
    set it and retrigger connection.
- `width` (optional, default = "0"): the width of the canvas for agent animation
  - The animation will always take up a (roughly) square area, so this should typically be the same
    value as `height`.
- `height` (optional, default = "0"): the height of the canvas for agent animation
  - The animation will always take up a (roughly) square area, so this should typically be the same
    value as `width`.
- `auth-scheme` (optional, default = "bearer"): the auth scheme to use with your token
  - Use `bearer` for the Deepgram API when working with token-based authentication. For local
    development you may find it more convenient to use an API key (`token` scheme). **Never use API
    keys in a production browser application!**
- `url` (required): the API URL
  - Chances are you'll set this to `"https://api.deepgram.com/v1/agent"`!
- `idle-timeout-ms` (optional): how long to wait for user idleness before closing the socket
  - Timer starts whenever the user is expected to speak (meaning right when opening the connection,
    and right after each `AgentAudioDone` event).
- `output-sample-rate`: the output sample rate you'd like for playback
  - Should be the same as the output rate you've got in your Settings object. Unless you're trying
    to have a little fun.
- `token` (optional): the token to use for accessing the Deepgram `/agent` API. See the token-based
  auth docs for how to create safe-for-browser tokens.
  - If not provided, the `auth-scheme` will also be ignored. Only makes sense if your API URL is
    unauthenticated.
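
These are plain DOM attributes, so they can be set from script as well as from markup. A minimal sketch, assuming a hypothetical `applyAuth` helper and a token obtained from your own backend, keeps `token` and `auth-scheme` in step with each other:

```javascript
// Hypothetical helper (not part of the library): keep `token` and `auth-scheme`
// consistent on a <deepgram-agent> element.
function applyAuth(agentEl, token, scheme = "bearer") {
  if (token) {
    agentEl.setAttribute("token", token);
    agentEl.setAttribute("auth-scheme", scheme);
  } else {
    // Without a token the auth scheme is ignored, so clear both.
    agentEl.removeAttribute("token");
    agentEl.removeAttribute("auth-scheme");
  }
}
```

Call it before adding `config`, since adding `config` is what starts the connection.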
### Events
As experimental tech, the `deepgram-agent` element emits a variety of events. You're more likely
to run into some than others.
#### Common events
- `"no url"`: emitted when trying to connect and API url is missing
- `"no config"`: emitted when trying to connect and config is missing
- `"invalid auth"`: emitted when trying to connect and the WebSocket rejects the auth scheme or
  token
- `"socket open"`: socket successfully opened
- `"socket close"`: socket successfully closed
- `"connection timeout"`: socket failed to connect due to a timeout (10s)
- `"failed to connect user media"`: couldn't gain access to user's microphone, usually due to
  permission rejection
- `"structured message"`: got JSON from the API. This is the main event to pay attention to!
- `"client message"`: sent a JSON message to the API. Useful for debugging.
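
A minimal sketch of listening for two of these events. It assumes they are dispatched on the element as DOM events whose type is the quoted name, and that the parsed JSON is available on `event.detail`; both are assumptions, so check the component source. The `watchAgent` helper and its callback are hypothetical.

```javascript
// Hypothetical helper (not part of the library).
// Assumes events are dispatched on the element with the quoted names as their
// type, and that the parsed JSON is exposed on `event.detail`.
function watchAgent(agentEl, onMessage) {
  agentEl.addEventListener("structured message", (event) => {
    onMessage(event.detail);
  });
  agentEl.addEventListener("socket close", () => {
    // Unset config so a later user interaction can set it again and reconnect.
    agentEl.removeAttribute("config");
  });
}
```

Removing `config` on close follows the advice in the attribute notes: unset it, then wait for another user interaction before setting it again.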
#### Uncommon events
- `"failed setup"`: some issue internal to the custom element occurred
- `"empty audio"`: got an empty message when expecting audio data
- `"unknown message"`: got a text message from the API that isn't valid JSON
### Methods
```ts
sendClientMessage(message: ArrayBuffer | string): void {}
```

Use this to send some (stringified) JSON or binary data to the server. Ignored when the WebSocket is
closed.
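
A minimal sketch of sending JSON through this method. The `sendJson` helper is hypothetical, and the payload in the usage comment is illustrative only; check the Voice Agent API docs for the message types the server accepts.

```javascript
// Hypothetical helper (not part of the library): stringify a payload and send it
// through the element's sendClientMessage method.
function sendJson(agentEl, payload) {
  agentEl.sendClientMessage(JSON.stringify(payload));
}

// Usage in a page (illustrative payload):
// sendJson(document.querySelector("deepgram-agent"), { type: "KeepAlive" });
```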
```ts
connect(): Promise
```

Use this to explicitly connect. Prefer to handle this by setting the `config` attribute.
```ts
disconnect(reason?: string): Promise
```

Use this to explicitly disconnect. Prefer to handle this by _removing_ the `config` attribute.
The animation alone is available as a granular import, automatically registered as `deepgram-hoop`:

```js
import "@deepgram/browser-agent/hoop";
```
Then, render it where you like!
```html
<deepgram-hoop
  height="300"
  width="300"
  status="active"
></deepgram-hoop>
```
The hoop component applies some size oscillation based on audio information:
- The output, i.e. agent audio (`agent-volume` attribute) expands
- The input, i.e. user audio (`user-volume` attribute) collapses

To ease jitter, each drawn arc trails behind a leader. You must provide amplitude data for both the
user and agent on a per-frame basis. See the `sendVolumeUpdates` function for a working example.
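
As a rough sketch of per-frame updates (not the repo's own `sendVolumeUpdates`), the code below computes an RMS amplitude from 8-bit time-domain samples, such as those filled in by the Web Audio `AnalyserNode.getByteTimeDomainData`, and writes it to both attributes. The helper names are hypothetical, and the 0 to 1 value range for the attributes is an assumption.

```javascript
// Root-mean-square amplitude of 8-bit time-domain samples (128 is silence).
// Returns a value between 0 and 1.
function rmsAmplitude(bytes) {
  if (bytes.length === 0) return 0;
  let sumSquares = 0;
  for (const b of bytes) {
    const centered = (b - 128) / 128; // map 0..255 to roughly -1..1
    sumSquares += centered * centered;
  }
  return Math.sqrt(sumSquares / bytes.length);
}

// Hypothetical helper: push one frame of volumes to a <deepgram-hoop> element.
// Assumes the attributes accept a numeric string in the 0..1 range.
function updateHoopVolumes(hoopEl, userBytes, agentBytes) {
  hoopEl.setAttribute("user-volume", String(rmsAmplitude(userBytes)));
  hoopEl.setAttribute("agent-volume", String(rmsAmplitude(agentBytes)));
}
```

In a page, this would typically run once per `requestAnimationFrame` tick, with one analyser on the microphone stream and another on the playback path.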
- Node v18 or 20 (though I recommend installing it through
nvm)
Use `npm run vite` to start a dev server. You'll need to set a `DG_API_KEY` environment variable in
order to open a connection.