TypeScript SDK for Sarvam Conversational AI
TypeScript SDK for building real-time voice-to-voice and text-based conversational AI applications across multiple platforms.
- Real-time voice-to-voice conversations in the browser
- Text-based chat with streaming responses
- Automatic microphone capture and speaker playback
- Multi-language support (11 Indian languages + English)
- WebSocket-based real-time communication
- Cross-platform: Browser, React Native, and Node.js support
Browser:
```bash
npm install sarvam-conv-ai-sdk
```
React Native:
```bash
npm install sarvam-conv-ai-sdk
npm install react-native-audio-api
```
Node.js:
```bash
npm install sarvam-conv-ai-sdk ws
```
⚠️ Important: Always use platform-specific imports to avoid bundling errors and reduce bundle size.
The SDK provides platform-optimized entry points:
```typescript
// ✅ Always use the /browser entry point for web applications
import { ConversationAgent, BrowserAudioInterface } from 'sarvam-conv-ai-sdk/browser';
```
Why? The browser entry point excludes React Native dependencies, preventing bundler errors like `Cannot resolve 'react-native'`.
```typescript
// ✅ Always use the /react-native entry point for React Native apps
import { ConversationAgent, RNAudioInterface } from 'sarvam-conv-ai-sdk/react-native';
```
Why? The React Native entry point includes native module support for iOS and Android.
```typescript
// Use the default entry point for Node.js
import { ConversationAgent } from 'sarvam-conv-ai-sdk';
```
```typescript
import React, { useRef, useState } from 'react';
import {
  ConversationAgent,
  BrowserAudioInterface,
  InteractionType,
  type ServerTextMsgType,
} from 'sarvam-conv-ai-sdk/browser';

function VoiceChat() {
  const [isConnected, setIsConnected] = useState(false);
  const [transcript, setTranscript] = useState('');
  // Holds the active agent so it can be stopped later
  const agentRef = useRef<ConversationAgent | null>(null);

  const startConversation = async () => {
    try {
      const audioInterface = new BrowserAudioInterface();
      const agent = new ConversationAgent({
        apiKey: 'your_api_key',
        platform: 'browser',
        config: {
          user_identifier_type: 'custom',
          user_identifier: 'user123',
          org_id: 'your_org_id',
          workspace_id: 'your_workspace_id',
          app_id: 'your_app_id',
          interaction_type: InteractionType.CALL,
          sample_rate: 16000,
        },
        audioInterface,
        // Append each streamed text chunk to the transcript
        textCallback: async (msg: ServerTextMsgType) => {
          setTranscript(prev => prev + msg.text);
        },
        startCallback: async () => {
          setIsConnected(true);
        },
        endCallback: async () => {
          setIsConnected(false);
        },
      });
      agentRef.current = agent;
      await agent.start();
      await agent.waitForConnect(10);
    } catch (error) {
      console.error('Error:', error);
    }
  };

  const stopConversation = async () => {
    if (agentRef.current) {
      await agentRef.current.stop();
      agentRef.current = null;
    }
  };

  // Minimal UI: one toggle button and the running transcript
  return (
    <div>
      <button onClick={isConnected ? stopConversation : startConversation}>
        {isConnected ? 'Stop' : 'Start'}
      </button>
      <p>{transcript}</p>
    </div>
  );
}

export default VoiceChat;
```
```javascript
const { ConversationAgent, InteractionType } = require('sarvam-conv-ai-sdk');

async function main() {
  const agent = new ConversationAgent({
    apiKey: 'your_api_key',
    config: {
      org_id: 'your_org_id',
      workspace_id: 'your_workspace_id',
      app_id: 'your_app_id',
      user_identifier: 'user@example.com',
      user_identifier_type: 'email',
      interaction_type: InteractionType.TEXT,
      sample_rate: 16000,
    },
    textCallback: async (msg) => {
      console.log('Agent:', msg.text);
    },
    startCallback: async () => {
      console.log('Conversation started!');
    },
  });

  await agent.start();
  const connected = await agent.waitForConnect(10);
  if (connected) {
    await agent.sendText('Hello, how are you?');
    await agent.waitForDisconnect();
  }
}

main().catch(console.error);
```
`ConversationAgent` is the main class for managing conversational AI sessions.
#### Constructor Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| apiKey | string | Yes | API key for authentication |
| config | InteractionConfig | Yes | Interaction configuration |
| platform | 'browser' \| 'node' | No | Platform type (auto-detected) |
| audioInterface | AsyncAudioInterface | No | Audio interface for voice interactions |
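| textCallback | (msg: ServerTextMsgType) => Promise\<void> | No | Receives streaming text chunks from the agent |
| audioCallback | (msg: ServerAudioChunkMsg) => Promise\<void> | No | Receives audio chunks from the agent |
| eventCallback | (event: ServerEventBase) => Promise\<void> | No | Receives events during the conversation |
| startCallback | () => Promise\<void> | No | Called when the conversation starts |
| endCallback | () => Promise\<void> | No | Called when the conversation ends |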
| baseUrl | string | No | Override base URL |
#### Methods
- `async start()` - Start the conversation session
- `async stop()` - Stop the conversation and clean up
- `async waitForConnect(timeout?)` - Wait for connection (returns boolean)
- `async waitForDisconnect()` - Wait until disconnected
- `isConnected()` - Check connection status
- `getInteractionId()` - Get current interaction ID
- `async sendAudio(audioData)` - Send raw audio (voice mode only)
- `async sendText(text)` - Send text message (text mode only)
- `getAgentType()` - Get agent type ('voice' or 'text')
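Because `sendText` is limited to text mode, a caller may want to check the agent's mode and connection state first. The sketch below assumes only the method names listed above; `TextCapableAgent` and `sendTextSafely` are hypothetical names, not SDK exports, and the return types of `isConnected()` and `sendText()` are assumptions.

```typescript
// Structural subset of the methods listed above (return types are assumed).
type TextCapableAgent = {
  getAgentType(): string; // documented as 'voice' or 'text'
  isConnected(): boolean;
  sendText(text: string): Promise<unknown>;
};

// Hypothetical helper: sends the message only when the agent is connected
// and running in text mode. Resolves to true if the message was sent.
async function sendTextSafely(
  agent: TextCapableAgent,
  text: string,
): Promise<boolean> {
  if (!agent.isConnected() || agent.getAgentType() !== 'text') {
    return false;
  }
  await agent.sendText(text);
  return true;
}
```

A `ConversationAgent` instance should satisfy this shape structurally if its methods match the list above, so it could be passed in directly.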
`InteractionConfig` is the object passed as `config` to the `ConversationAgent` constructor.
#### Required Fields
| Field | Type | Description |
| --- | --- | --- |
| user_identifier_type | string | One of: 'custom', 'email', 'phone_number', 'unknown' |
| user_identifier | string | User identifier value |
| org_id | string | Your organization ID |
| workspace_id | string | Your workspace ID |
| app_id | string | The target application ID |
| interaction_type | InteractionType | InteractionType.CALL or InteractionType.TEXT |
| sample_rate | number | Audio sample rate: 8000, 16000, or 22000 |
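The constraints in the table above can be checked before an agent is constructed. This is a minimal sketch: `RequiredFields` and `validateRequiredFields` are hypothetical and not part of the SDK, and `interaction_type` is treated as a plain string because the values behind `InteractionType` are not shown in this document.

```typescript
// Hypothetical local mirror of the required fields table; not an SDK type.
type RequiredFields = {
  user_identifier_type: string;
  user_identifier: string;
  org_id: string;
  workspace_id: string;
  app_id: string;
  interaction_type: string;
  sample_rate: number;
};

// Allowed values copied from the table above.
const IDENTIFIER_TYPES = ['custom', 'email', 'phone_number', 'unknown'];
const SAMPLE_RATES = [8000, 16000, 22000];

// Returns a list of problems; an empty list means every constraint holds.
function validateRequiredFields(fields: Partial<RequiredFields>): string[] {
  const problems: string[] = [];
  const stringKeys: (keyof RequiredFields)[] = [
    'user_identifier',
    'org_id',
    'workspace_id',
    'app_id',
    'interaction_type',
  ];
  for (const key of stringKeys) {
    const value = fields[key];
    if (typeof value !== 'string' || value.length === 0) {
      problems.push(`${key} is required`);
    }
  }
  if (!IDENTIFIER_TYPES.includes(fields.user_identifier_type ?? '')) {
    problems.push(
      `user_identifier_type must be one of: ${IDENTIFIER_TYPES.join(', ')}`,
    );
  }
  if (!SAMPLE_RATES.includes(fields.sample_rate ?? -1)) {
    problems.push(`sample_rate must be one of: ${SAMPLE_RATES.join(', ')}`);
  }
  return problems;
}
```

Returning a list instead of throwing on the first failure lets a caller report every misconfigured field at once.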
#### Optional Fields
| Field | Type | Description |
| --- | --- | --- |
| version | number | App version (uses latest if not provided) |
| agent_variables | Record\
| initial_language_name | SarvamToolLanguageName | Starting language |
| initial_state_name | string | Starting state name |
| initial_bot_message | string | First message from agent |
`BrowserAudioInterface` handles microphone capture and speaker playback in browser environments.
```typescript
// Import from the /browser entry point, as recommended for web applications
import { BrowserAudioInterface } from 'sarvam-conv-ai-sdk/browser';
const audioInterface = new BrowserAudioInterface();
```
Features:
- Automatic microphone access and audio capture
- Real-time audio streaming at 16kHz
- Automatic speaker playback
- Handles user interruptions
Requirements:
- HTTPS connection (required for microphone access)
- Modern browser with WebAudio API support
- User permission for microphone access
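The first two requirements can be detected up front, before any audio interface is created. The following sketch uses standard browser globals (`isSecureContext`, `navigator.mediaDevices.getUserMedia`, `AudioContext`); the `BrowserEnv` type and `missingAudioRequirements` function are hypothetical helpers, not part of the SDK. Microphone permission is not covered, since it is only known once the browser has prompted the user.

```typescript
// Minimal structural view of the browser globals this check reads.
// Passing the environment in keeps the function testable outside a browser.
type BrowserEnv = {
  isSecureContext?: boolean;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
  AudioContext?: unknown;
  webkitAudioContext?: unknown;
};

// Hypothetical helper: returns a description of each unmet requirement.
function missingAudioRequirements(env: BrowserEnv): string[] {
  const missing: string[] = [];
  if (!env.isSecureContext) {
    missing.push('secure context (HTTPS)');
  }
  if (typeof env.navigator?.mediaDevices?.getUserMedia !== 'function') {
    missing.push('microphone capture (navigator.mediaDevices.getUserMedia)');
  }
  if (!env.AudioContext && !env.webkitAudioContext) {
    missing.push('Web Audio API (AudioContext)');
  }
  return missing;
}
```

In a browser this could be called as `missingAudioRequirements(window as unknown as BrowserEnv)`, with the result shown to the user before starting a conversation.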
Receives streaming text chunks from the agent:
```typescript
textCallback: async (msg: ServerTextMsgType) => {
  console.log('Agent says:', msg.text);
}
```
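Since the text arrives in chunks, it can be useful to collect them into a single string outside of React state. Here is a small sketch, assuming each message carries only its own new chunk in `msg.text` (which is how the React example concatenates them); `TextMessage` and `createTranscriptBuffer` are hypothetical names, not SDK exports.

```typescript
// Assumed minimal message shape: only `text` is used in this document.
type TextMessage = { text: string };

// Hypothetical helper that collects streamed chunks in arrival order.
function createTranscriptBuffer() {
  const chunks: string[] = [];
  return {
    // Pass this function as `textCallback`
    onText: async (msg: TextMessage): Promise<void> => {
      chunks.push(msg.text);
    },
    // Full text received so far
    text: (): string => chunks.join(''),
    // Clear the buffer, for example between agent turns
    reset: (): void => {
      chunks.length = 0;
    },
  };
}
```

The buffer's `onText` would be passed as `textCallback`, and `text()` read whenever the full response is needed.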
Receives various events during conversation:
```typescript
eventCallback: async (event: ServerEventBase) => {
  switch (event.type) {
    case 'server.action.interaction_connected':
      console.log('Connected');
      break;
    case 'server.event.user_interrupt':
      console.log('User interrupted');
      break;
    case 'server.action.interaction_end':
      console.log('Conversation ended');
      break;
    case 'server.event.user_speech_start':
      console.log('User started speaking');
      break;
    case 'server.event.user_speech_end':
      console.log('User stopped speaking');
      break;
  }
}
```
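As the number of handled events grows, a lookup table of handlers can be easier to maintain than a long `switch`. The sketch below uses the five event type strings shown above; `EventLike`, `EventHandlers`, and `createEventCallback` are hypothetical names, and only the `type` field of the event is assumed.

```typescript
// Event type strings copied from the switch statement above.
type KnownEventType =
  | 'server.action.interaction_connected'
  | 'server.action.interaction_end'
  | 'server.event.user_interrupt'
  | 'server.event.user_speech_start'
  | 'server.event.user_speech_end';

// Assumed minimal event shape: only `type` is taken from the source.
type EventLike = { type: string };

type EventHandlers = Partial<
  Record<KnownEventType, (event: EventLike) => void | Promise<void>>
>;

// Hypothetical helper: builds a callback that routes each event to its
// handler and silently ignores event types without one.
function createEventCallback(
  handlers: EventHandlers,
): (event: EventLike) => Promise<void> {
  return async (event: EventLike): Promise<void> => {
    // Own-property check so inherited keys such as "toString" never match
    if (!Object.prototype.hasOwnProperty.call(handlers, event.type)) {
      return;
    }
    const handler = handlers[event.type as KnownEventType];
    if (handler) {
      await handler(event);
    }
  };
}
```

The returned function has the same shape as `eventCallback`, so it could be passed to the constructor in place of the inline `switch`.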
The SDK supports 11 Indian languages plus English:
```typescript
import { SarvamToolLanguageName } from 'sarvam-conv-ai-sdk';
// Available: BENGALI, GUJARATI, KANNADA, MALAYALAM, TAMIL,
// TELUGU, PUNJABI, ODIA, MARATHI, HINDI, ENGLISH
const config = {
  initial_language_name: SarvamToolLanguageName.HINDI,
};
```
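Resource Cleanup: Always clean up resources when the component unmounts
```typescript
useEffect(() => {
  // Braces keep the cleanup function from returning the promise,
  // since React expects cleanup functions to return nothing
  return () => {
    agentRef.current?.stop().catch(console.error);
  };
}, []);
```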
Connection Timeout: Specify timeout when waiting for connection
```typescript
const connected = await agent.waitForConnect(10); // 10 seconds
if (!connected) console.error('Connection timeout');
```
Error Handling: Wrap agent operations in try-catch blocks
```typescript
try {
  await agent.start();
  await agent.waitForConnect(10);
} catch (error) {
  console.error('Error:', error);
  await agent.stop();
}
```
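The timeout and error-handling practices can be combined into one helper so that the agent is always stopped when a session fails to come up. This is a sketch under stated assumptions: `AgentLike` is a hypothetical structural subset of the documented methods, `startOrCleanup` is not part of the SDK, and the timeout is taken to be in seconds, following the comment in the timeout example.

```typescript
// Structural subset of the documented agent methods (return types assumed).
type AgentLike = {
  start(): Promise<unknown>;
  stop(): Promise<unknown>;
  waitForConnect(timeout?: number): Promise<boolean>;
};

// Hypothetical helper: starts the agent and waits for the connection.
// The agent is stopped if start() throws or the wait times out.
async function startOrCleanup(
  agent: AgentLike,
  timeoutSeconds = 10,
): Promise<boolean> {
  let connected = false;
  try {
    await agent.start();
    connected = await agent.waitForConnect(timeoutSeconds);
  } catch (error) {
    // Best-effort cleanup; a failing stop() must not mask the original error
    await agent.stop().catch(() => undefined);
    throw error;
  }
  if (!connected) {
    // Timed out: release resources instead of leaving a half-open session
    await agent.stop();
  }
  return connected;
}
```

A caller would then only need to check the returned boolean and handle a thrown error, with cleanup already done in both failure cases.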
Secure API Keys: Use environment variables or backend proxy
```typescript
// Use environment variables
const apiKey = import.meta.env.VITE_SARVAM_API_KEY;
// Or use backend proxy
const agent = new ConversationAgent({ baseUrl: '/api/proxy/' });
```
- Web Example - See `examples/web` for a complete React + TypeScript application
- Node.js Example - See `examples/nodejs/simple-text-chat.js` for a command-line text chat
Microphone Not Working: Ensure HTTPS connection, check browser permissions, verify microphone is not in use by another app
Connection Timeout: Check network connectivity, verify API key is valid, ensure app_id exists and has a committed version
Audio Quality Issues: Verify sample rate matches configuration (8000, 16000, or 22000), ensure audio format is LINEAR16 (16-bit PCM mono)
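When supplying audio manually, a common source of format problems is passing floating-point samples where 16-bit PCM is expected. The sketch below shows the usual conversion from Web Audio style floats in the range [-1, 1] to signed 16-bit integers. `floatTo16BitPCM` is a hypothetical helper, and the exact argument type that `sendAudio` accepts is not specified in this document, so treat the `Int16Array` output as an assumption.

```typescript
// Converts float samples in [-1, 1] to 16-bit signed PCM (LINEAR16).
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input saturates instead of wrapping
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, positive values to 32767
    pcm[i] =
      clamped < 0
        ? Math.round(clamped * 0x8000)
        : Math.round(clamped * 0x7fff);
  }
  return pcm;
}
```

The input must already be mono and at the configured sample rate; this function changes only the sample format, not the rate or channel count.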
MIT