Comprehensive Voice Agent SDK with Customizable Widget - Real-time audio, WebSocket communication, React components, and extensive customization options
npm install ttp-agent-sdkThis repository contains SDKs for integrating with the TTP Agent API:
- Frontend SDK (JavaScript) - Browser-based SDK for web applications
- Backend SDK (Java) - Server-side SDK for phone systems and backend processing
A comprehensive JavaScript SDK for voice interaction with AI agents. Provides real-time audio recording, WebSocket communication, and audio playback with queue management.
- 🎤 Real-time Audio Recording - Uses AudioWorklet for high-quality audio capture
- 🔄 WebSocket Communication - Real-time bidirectional communication with authentication
- 🔊 Audio Playback Queue - Smooth audio playback with queue management
- ⚛️ React Components - Ready-to-use React components
- 🌐 Vanilla JavaScript - Works with any JavaScript framework
- 🎯 Event-driven - Comprehensive event system for all interactions
- 🔒 Multiple Authentication Methods - Support for signed links and direct agent access
- 📱 Responsive Widget - Pre-built UI widget for quick integration
``bash`
npm install ttp-agent-sdk
`javascript
import { VoiceSDK } from 'ttp-agent-sdk';
const voiceSDK = new VoiceSDK({
websocketUrl: 'wss://speech.talktopc.com/ws/conv',
agentId: 'your_agent_id',
appId: 'your_app_id',
voice: 'default',
language: 'en'
});
// Connect and start recording
await voiceSDK.connect();
await voiceSDK.startRecording();
`
`javascript
import { VoiceSDK } from 'ttp-agent-sdk';
const voiceSDK = new VoiceSDK({
websocketUrl: 'wss://speech.talktopc.com/ws/conv',
// No agentId needed - server validates signed token from URL
});
// Connect using signed URL
await voiceSDK.connect();
`
`html`
Or use a function to fetch the signed URL:
`html`
`jsx
import React from 'react';
import { VoiceButton } from 'ttp-agent-sdk';
function App() {
return (
agentId="your_agent_id"
appId="your_app_id"
onConnected={() => console.log('Connected!')}
onRecordingStarted={() => console.log('Recording...')}
onPlaybackStarted={() => console.log('Playing audio...')}
/>
);
}
`
The main SDK class for voice interaction.
#### Constructor Options
`javascript`
const voiceSDK = new VoiceSDK({
websocketUrl: 'wss://speech.talktopc.com/ws/conv', // Required
agentId: 'agent_12345', // Optional - for direct agent access
appId: 'app_67890', // Optional - user's app ID for authentication
ttpId: 'ttp_abc123', // Optional - custom TTP ID (fallback)
voice: 'default', // Optional - voice selection
language: 'en', // Optional - language code
sampleRate: 16000, // Optional - audio sample rate
autoReconnect: true // Optional - auto-reconnect on disconnect
});
#### Methods
- connect() - Connect to the voice serverdisconnect()
- - Disconnect from the voice serverstartRecording()
- - Start voice recordingstopRecording()
- - Stop voice recordingtoggleRecording()
- - Toggle recording statedestroy()
- - Clean up resources
#### Events
- connected - WebSocket connecteddisconnected
- - WebSocket disconnectedrecordingStarted
- - Recording startedrecordingStopped
- - Recording stoppedplaybackStarted
- - Audio playback startedplaybackStopped
- - Audio playback stoppederror
- - Error occurredmessage
- - Received message from server
A React component that provides a voice interaction button.
#### Props
`jsx`
agentId="agent_12345"
appId="app_67890"
voice="default"
language="en"
autoReconnect={true}
onConnected={() => {}}
onDisconnected={() => {}}
onRecordingStarted={() => {}}
onRecordingStopped={() => {}}
onPlaybackStarted={() => {}}
onPlaybackStopped={() => {}}
onError={(error) => {}}
onMessage={(message) => {}}
/>
A pre-built widget for quick integration.
#### Configuration
`javascript`
TTPAgentSDK.AgentWidget.init({
agentId: 'your_agent_id', // Required
getSessionUrl: 'https://your-api.com/get-session', // Required - URL or function
variables: { // Optional - dynamic variables
userName: 'John Doe',
page: 'homepage'
},
position: 'bottom-right', // Optional - widget position
primaryColor: '#4F46E5' // Optional - theme color
});
Use Case: Development, testing, or internal applications.
`javascript`
const voiceSDK = new VoiceSDK({
websocketUrl: 'wss://speech.talktopc.com/ws/conv',
agentId: 'agent_12345', // Visible in network traffic
appId: 'app_67890'
});
Security Risk: Agent ID is visible in network traffic.
Use Case: Production applications where security is critical.
`javascript`
const voiceSDK = new VoiceSDK({
websocketUrl: 'wss://speech.bidme.co.il/ws/conv?signed_token=eyJ...'
// No agentId needed - server validates signed token
});
Benefits: Secure, cost-controlled, and production-ready.
`javascript
// Hello message (sent on connection)
{
t: "hello",
appId: "app_67890" // or ttpId for fallback
}
// Start continuous mode
{
t: "start_continuous_mode",
ttpId: "sdk_abc123_1234567890"
}
// Stop continuous mode
{
t: "stop_continuous_mode",
ttpId: "sdk_abc123_1234567890"
}
`
`javascript
// Text response
{
type: "agent_response",
agent_response: "Hello! How can I help you?"
}
// User transcript
{
type: "user_transcript",
user_transcription: "Hello there"
}
// Barge-in detection
{
type: "barge_in",
message: "User interrupted"
}
// Stop playing request
{
type: "stop_playing",
message: "Stop all audio"
}
`
)The capture_screen tool allows AI agents to capture screenshots of the browser page during conversations. It uses html2canvas to render the DOM as an image.
When capturing the entire screen, the tool has two modes:
1. Viewport only (default) - Captures only the visible portion of the page (document.body)fullPage: true
2. Full scrollable page - When , captures the entire page including all content below the fold
Example: Capture entire visible viewport
`javascript`
// The agent can call this tool with:
{
tool: "capture_screen",
params: {
format: "jpeg", // 'png' or 'jpeg' (jpeg is smaller)
quality: 0.85, // JPEG quality 0-1 (only for jpeg)
scale: 1, // Resolution scale (1 = normal, 2 = retina)
maxWidth: 1280, // Max width in pixels (auto-resizes if larger)
maxHeight: 1280 // Max height in pixels (auto-resizes if larger)
}
}
Example: Capture full scrollable page
`javascript`
// To capture the entire page including content below the fold:
{
tool: "capture_screen",
params: {
fullPage: true, // Capture entire scrollable page
format: "jpeg",
quality: 0.85,
maxWidth: 1920, // Higher limits for full page
maxHeight: 5000 // Accommodate long pages
}
}
How Full Page Capture Works:
When fullPage: true is set (and no selector is provided), the tool:windowHeight
- Sets and height to document.documentElement.scrollHeight (total page height)y: 0
- Sets to start from the top
- Captures the entire scrollable content, not just the visible viewport
- The resulting image height equals the full scrollable height of the page
Example: Capture specific element
`javascript`
// Capture a specific element using CSS selector:
{
tool: "capture_screen",
params: {
selector: "#my-element", // CSS selector (e.g., "#header", ".content", "main")
format: "png", // PNG preserves transparency
scale: 2 // 2x resolution for crisp screenshots
}
}
#### Tool Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| selector | string | null | CSS selector for specific element. If not provided, captures entire page. |format
| | string | 'jpeg' | Output format: 'png' or 'jpeg'. JPEG is smaller, PNG supports transparency. |quality
| | number | 0.85 | JPEG quality (0-1). Only used when format is 'jpeg'. |scale
| | number | 1 | Resolution scale factor. 1 = normal, 2 = retina/2x resolution. |maxWidth
| | number | 1280 | Maximum width in pixels. Image will be resized if larger (maintains aspect ratio). |maxHeight
| | number | 1280 | Maximum height in pixels. Image will be resized if larger (maintains aspect ratio). |fullPage
| | boolean | false | When true and no selector provided, captures entire scrollable page (not just viewport). |
#### Return Value
The tool returns a result object with:
`javascript`
{
image: "base64_encoded_image_data", // Base64 string (without data URI prefix)
mimeType: "image/jpeg", // MIME type: "image/jpeg" or "image/png"
width: 1280, // Final image width in pixels
height: 720, // Final image height in pixels
captureTimeMs: 234, // Time taken to capture in milliseconds
selector: "body" // Element that was captured (or selector used)
}
#### Events
The SDK emits events when screenshots are captured:
`javascriptScreenshot captured: ${data.width}x${data.height}, ${data.sizeKB}KB
voiceSDK.on('screenshotCaptured', (data) => {
console.log();Element: ${data.selector}
console.log();
});
voiceSDK.on('screenshotError', (error) => {
console.error('Screenshot failed:', error.error);
});
`
#### How It Works Internally
The tool uses html2canvas to render the DOM:
1. Target Selection: If selector is provided, captures that element. Otherwise captures document.body.fullPage: true
2. Full Page Mode: When and no selector:document.documentElement.scrollHeight
- Sets canvas height to y: 0
- Captures from to capture entire scrollable areahtml2canvas
3. Rendering: renders the DOM to a canvas elementmaxWidth
4. Resizing: If dimensions exceed /maxHeight, image is resized maintaining aspect ratio
5. Encoding: Canvas is converted to base64 (JPEG or PNG format)
#### Important Notes
- Full Page Capture: When fullPage: true is set, the tool captures the entire scrollable height (document.documentElement.scrollHeight), not just the visible viewport. This includes all content below the fold.maxHeight
- Performance: Full page captures may take longer, especially for very long pages. Consider using to limit the capture size.maxWidth
- Image Size: Screenshots are automatically resized if they exceed or maxHeight while maintaining aspect ratio.useCORS: true
- Cross-Origin: The tool handles cross-origin images when possible ().
- Browser Compatibility: Requires modern browsers with Canvas API support (Chrome 66+, Firefox 60+, Safari 11.1+, Edge 79+).
)The scroll_to_element tool allows AI agents to scroll the page in three different ways: scrolling to a specific element, relative scrolling (up/down), or scrolling to an absolute position.
The tool supports three scroll modes with the following priority order:
1. Element Scrolling (highest priority) - Scroll to a specific element by CSS selector
2. Relative Scrolling - Scroll up or down by a specified number of pixels
3. Absolute Position Scrolling - Scroll to specific x/y coordinates
Scroll to a specific element on the page using a CSS selector.
Example: Scroll to element
`javascript`
// The agent can call this tool with:
{
tool: "scroll_to_element",
params: {
selector: "#contact-section", // CSS selector
position: "center", // 'center', 'top', or 'bottom'
behavior: "smooth" // 'smooth' or 'instant'
}
}
Example: Scroll element to top
`javascript`
{
tool: "scroll_to_element",
params: {
selector: ".header",
position: "top", // Scrolls element to top of viewport
behavior: "smooth"
}
}
Scroll the page up or down by a specified number of pixels. This is useful for incremental scrolling or scrolling by viewport height.
Example: Scroll down (default 500px)
`javascript`
{
tool: "scroll_to_element",
params: {
direction: "down",
amount: 500, // Pixels to scroll (default: 500)
behavior: "smooth"
}
}
Example: Scroll up (custom amount)
`javascript`
{
tool: "scroll_to_element",
params: {
direction: "up",
amount: 500, // Scroll up 500 pixels
behavior: "smooth"
}
}
Example: Scroll down by viewport height
`javascript`
{
tool: "scroll_to_element",
params: {
direction: "down",
amount: window.innerHeight, // Scroll one viewport height
behavior: "smooth"
}
}
Features:
- Automatically prevents scrolling beyond page boundaries (won't scroll below 0 or above max scroll)
- Returns atTop and atBottom flags to indicate if scroll limits were reached
- Respects smooth/instant scrolling behavior
Scroll to specific x/y coordinates on the page.
Example: Scroll to absolute position
`javascript`
{
tool: "scroll_to_element",
params: {
x: 0, // Horizontal position (optional)
y: 1000, // Vertical position
behavior: "smooth"
}
}
Example: Scroll to top of page
`javascript`
{
tool: "scroll_to_element",
params: {
x: 0,
y: 0,
behavior: "instant" // Instant scroll to top
}
}
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| selector | string | No* | - | CSS selector for element to scroll to. Takes highest priority if provided. |position
| | string | No | 'center' | Position for element scrolling: 'center', 'top', or 'bottom'. Only used with selector. |direction
| | string | No* | - | Direction for relative scrolling: 'up' or 'down'. Only used if no selector provided. |amount
| | number | No | 500 | Pixels to scroll for relative scrolling. Only used with direction. |x
| | number | No* | - | Horizontal scroll position. Only used if no selector or direction provided. |y
| | number | No* | - | Vertical scroll position. Only used if no selector or direction provided. |behavior
| | string | No | 'smooth' | Scroll behavior: 'smooth' or 'instant'. |
\* At least one of selector, direction, or x/y must be provided.
The tool returns different result objects depending on the scroll mode:
Element Scrolling Result:
`javascript`
{
success: true,
scrollType: "element",
selector: "#contact-section",
elementPosition: {
top: 1200, // Element's position after scroll
left: 0
}
}
Relative Scrolling Result:
`javascript`
{
success: true,
scrollType: "relative",
direction: "down",
amount: 300,
scrolledFrom: { y: 500 }, // Starting scroll position
scrolledTo: { y: 800 }, // Final scroll position
atTop: false, // True if scrolled to top
atBottom: false // True if scrolled to bottom
}
Absolute Position Scrolling Result:
`javascript`
{
success: true,
scrollType: "position",
scrolledTo: {
x: 0,
y: 1000
}
}
Error Result:
`javascript`
{
success: false,
error: "Element not found: #missing-element"
}
The tool checks parameters in this order:
1. selector - If provided, scrolls to element (ignores direction and x/y)direction
2. - If no selector, checks for direction: 'up' or 'down' (ignores x/y)x
3. /y - If no selector or direction, uses absolute position scrolling
Scroll to specific section:
`javascript`
{ selector: "#pricing", position: "top" }
Scroll down to see more content:
`javascript`
{ direction: "down", amount: 500 }
Scroll up to previous content:
`javascript`
{ direction: "up", amount: 300 }
Scroll to top of page:
`javascript`
{ y: 0, behavior: "instant" }
Scroll to bottom of page:
`javascript`
{ y: document.documentElement.scrollHeight, behavior: "smooth" }
- Boundary Protection: Relative scrolling automatically prevents scrolling beyond page boundaries
- Backward Compatible: Existing code using selector or x/y continues to work unchangedbehavior: "instant"
- Smooth Scrolling: Default behavior is smooth scrolling (500ms wait time). Use for immediate scrolling (100ms wait time)
- Element Not Found: If selector doesn't match any element, returns error without scrolling
See the examples/ directory for complete usage examples:
- test-text-chat.html - TTP Chat Widget with customizable settingstest-signed-link.html
- - Widget with signed link authenticationreact-example.jsx
- - React component usagevanilla-example.html
- - Vanilla JavaScript usage
`bashInstall dependencies
npm install
Browser Support
- Chrome 66+
- Firefox 60+
- Safari 11.1+
- Edge 79+
License
MIT
Backend SDK (Java)
For server-side applications, phone system integration, or backend processing, see the Java SDK documentation.
Key Features:
- ✅ Format negotiation (Protocol v2)
- ✅ Raw audio pass-through (PCMU/PCMA for phone systems)
- ✅ No audio decoding (perfect for forwarding to phone systems)
- ✅ Event-driven API
Quick Start:
`java
VoiceSDKConfig config = new VoiceSDKConfig();
config.setWebsocketUrl("wss://speech.talktopc.com/ws/conv?agentId=xxx&appId=yyy");
config.setOutputEncoding("pcmu"); // For phone systems
config.setOutputSampleRate(8000);VoiceSDK sdk = new VoiceSDK(config);
sdk.onAudioData(audioData -> {
// Forward raw PCMU to phone system
phoneSystem.sendAudio(audioData);
});
sdk.connect();
``See java-sdk/README.md for full documentation.
For support and questions, please open an issue on GitHub or contact our support team.