Recognition Service TypeScript/Node.js Client SDK

TypeScript SDK for real-time speech recognition via WebSocket.

Installation (a Node 18 compatible build is also published as @volley/recognition-client-sdk-node18):

```bash
npm install @volley/recognition-client-sdk
```

Quick start (builder pattern, recommended):

```typescript
import {
  createClientWithBuilder,
  RecognitionProvider,
  DeepgramModel,
  STAGES
} from '@volley/recognition-client-sdk';

// Create client with builder pattern (recommended)
const client = createClientWithBuilder(builder =>
  builder
    .stage(STAGES.STAGING) // ✨ Simple environment selection using enum
    .provider(RecognitionProvider.DEEPGRAM)
    .model(DeepgramModel.NOVA_2)
    .onTranscript(result => {
      console.log('Final:', result.finalTranscript);
      console.log('Interim:', result.pendingTranscript);
    })
    .onError(error => console.error(error))
);

// Stream audio
await client.connect();
client.sendAudio(pcm16AudioChunk); // Call repeatedly with audio chunks
await client.stopRecording();      // Wait for final transcript

// Check the actual URL being used
console.log('Connected to:', client.getUrl());
```

Alternative: configure the client directly, without the builder:

```typescript
import {
  RealTimeTwoWayWebSocketRecognitionClient,
  RecognitionProvider,
  DeepgramModel,
  Language,
  STAGES
} from '@volley/recognition-client-sdk';

const client = new RealTimeTwoWayWebSocketRecognitionClient({
  stage: STAGES.STAGING, // ✨ Recommended: Use STAGES enum for type safety
  asrRequestConfig: {
    provider: RecognitionProvider.DEEPGRAM,
    model: DeepgramModel.NOVA_2,
    language: Language.ENGLISH_US
  },
  onTranscript: (result) => console.log(result),
  onError: (error) => console.error(error)
});

// Check the actual URL being used
console.log('Connected to:', client.getUrl());
```
Recommended: Use stage parameter with STAGES enum for automatic environment configuration:
```typescript
import {
  RecognitionProvider,
  DeepgramModel,
  Language,
  STAGES
} from '@volley/recognition-client-sdk';

builder
  .stage(STAGES.STAGING)                  // STAGES.LOCAL | STAGES.DEV | STAGES.STAGING | STAGES.PRODUCTION
  .provider(RecognitionProvider.DEEPGRAM) // DEEPGRAM, GOOGLE
  .model(DeepgramModel.NOVA_2)            // Provider-specific model enum
  .language(Language.ENGLISH_US)          // Language enum
  .interimResults(true)                   // Enable partial transcripts
```
Available Stages and URLs:
| Stage | Enum | WebSocket URL |
|-------|------|---------------|
| Local | STAGES.LOCAL | ws://localhost:3101/ws/v1/recognize |
| Development | STAGES.DEV | wss://recognition-service-dev.volley-services.net/ws/v1/recognize |
| Staging | STAGES.STAGING | wss://recognition-service-staging.volley-services.net/ws/v1/recognize |
| Production | STAGES.PRODUCTION | wss://recognition-service.volley-services.net/ws/v1/recognize |
> 💡 Using the stage parameter automatically constructs the correct URL for each environment.
Automatic Connection Retry:
The SDK automatically retries failed connections with sensible defaults - no configuration needed!
Default behavior (works out of the box):
- 4 connection attempts (try once, retry 3 times if failed)
- 200ms delay between retries
- Handles temporary service unavailability (503)
- Fast failure (~600ms total on complete failure)
- Timing: Attempt 1 → FAIL → wait 200ms → Attempt 2 → FAIL → wait 200ms → Attempt 3 → FAIL → wait 200ms → Attempt 4
```typescript
import { RealTimeTwoWayWebSocketRecognitionClient, STAGES } from '@volley/recognition-client-sdk';

// ✅ Automatic retry - no config needed!
const client = new RealTimeTwoWayWebSocketRecognitionClient({
  stage: STAGES.STAGING,
  // connectionRetry works automatically with defaults
});
```
Optional: Customize retry behavior (only if needed):
```typescript
const client = new RealTimeTwoWayWebSocketRecognitionClient({
  stage: STAGES.STAGING,
  connectionRetry: {
    maxAttempts: 2, // Fewer attempts (min: 1, max: 5)
    delayMs: 500    // Longer delay between attempts
  }
});
```
> ⚠️ Note: Retry only applies to initial connection establishment. If the connection drops during audio streaming, the SDK will not auto-retry (caller must handle this).
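For callers that need to survive mid-stream drops, one option is to watch for abnormal close codes and reconnect from the application side. The sketch below is only an illustration: it assumes the same client instance can be reconnected by calling connect() again and uses a fixed 500 ms backoff, neither of which is an SDK guarantee.

```typescript
import { isNormalDisconnection } from '@volley/recognition-client-sdk';

// Hypothetical caller-side reconnect: the SDK itself does not retry dropped streams.
builder.onDisconnected(async (code) => {
  if (isNormalDisconnection(code)) return; // clean shutdown, nothing to do

  await new Promise(resolve => setTimeout(resolve, 500)); // simple fixed backoff (illustrative)
  await client.connect(); // assumes connect() can be called again on the same instance
});
```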
Advanced: Custom URL for non-standard endpoints:
```typescript
builder
  .url('wss://custom-endpoint.example.com/ws/v1/recognize') // Custom WebSocket URL
  .provider(RecognitionProvider.DEEPGRAM)
  // ... rest of config
```
> 💡 Note: If both stage and url are provided, url takes precedence.
Callbacks:

```typescript
builder
  .onTranscript(result => {})   // Handle transcription results
  .onError(error => {})         // Handle errors
  .onConnected(() => {})        // Connection established
  .onDisconnected((code) => {}) // Connection closed
  .onMetadata(meta => {})       // Timing information
```
Game context and additional options:

```typescript
builder
  .gameContext({                    // Context for better recognition
    gameId: 'session-123',
    prompt: 'Expected responses: yes, no, maybe'
  })
  .userId('user-123')               // User identification
  .platform('web')                  // Platform identifier
  .logger((level, msg, data) => {}) // Custom logging
```
Client methods:

```typescript
await client.connect();        // Establish connection
client.sendAudio(chunk);       // Send PCM16 audio
await client.stopRecording();  // End and get final transcript
client.getAudioUtteranceId();  // Get session UUID
client.getUrl();               // Get actual WebSocket URL being used
client.getState();             // Get current state
client.isConnected();          // Check connection status
```
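Putting these together, a typical streaming pass might look like the sketch below; `recorder` is a stand-in for your own audio source and is not part of the SDK.

```typescript
// End-to-end pass using the accessor methods above for logging/correlation.
await client.connect();
console.log('Utterance', client.getAudioUtteranceId(), 'streaming to', client.getUrl());

while (client.isConnected() && recorder.hasMoreAudio()) { // recorder: hypothetical audio source
  client.sendAudio(recorder.nextChunk());
}

console.log('State before stop:', client.getState());
await client.stopRecording(); // resolves once the final transcript has been delivered
```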
Transcription result (passed to onTranscript):

```typescript
{
  type: 'Transcription';                // Message type discriminator
  audioUtteranceId: string;             // Session UUID
  finalTranscript: string;              // Confirmed text (won't change)
  finalTranscriptConfidence?: number;   // Confidence 0-1 for final transcript
  pendingTranscript?: string;           // In-progress text (may change)
  pendingTranscriptConfidence?: number; // Confidence 0-1 for pending transcript
  is_finished: boolean;                 // Transcription complete (last message)
  voiceStart?: number;                  // Voice activity start time (ms from stream start)
  voiceDuration?: number;               // Voice duration (ms)
  voiceEnd?: number;                    // Voice activity end time (ms from stream start)
  startTimestamp?: number;              // Transcription start timestamp (ms)
  endTimestamp?: number;                // Transcription end timestamp (ms)
  receivedAtMs?: number;                // Server receive timestamp (ms since epoch)
  accumulatedAudioTimeMs?: number;      // Total audio duration sent (ms)
}
```
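A handler can use is_finished to detect the last message of an utterance and the timing fields for simple latency metrics, for example:

```typescript
builder.onTranscript(result => {
  if (result.pendingTranscript) {
    console.log('Interim:', result.pendingTranscript); // may still be revised
  }
  if (result.finalTranscript) {
    console.log('Final:', result.finalTranscript);     // confirmed text
  }
  if (result.is_finished) {
    // Last message for this utterance.
    console.log('Done after', result.accumulatedAudioTimeMs, 'ms of audio sent');
  }
});
```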
Supported provider models:

Deepgram:

```typescript
import { RecognitionProvider, DeepgramModel } from '@volley/recognition-client-sdk';

builder
  .provider(RecognitionProvider.DEEPGRAM)
  .model(DeepgramModel.NOVA_2); // NOVA_2, NOVA_3, FLUX_GENERAL_EN
```
Google:

```typescript
import { RecognitionProvider, GoogleModel } from '@volley/recognition-client-sdk';

builder
  .provider(RecognitionProvider.GOOGLE)
  .model(GoogleModel.LATEST_SHORT); // LATEST_SHORT, LATEST_LONG, TELEPHONY, etc.
```
Available Google models:
- LATEST_SHORT - Optimized for short audio (< 1 minute)
- LATEST_LONG - Optimized for long audio (> 1 minute)
- TELEPHONY - Optimized for phone audio
- TELEPHONY_SHORT - Short telephony audio
- MEDICAL_DICTATION - Medical dictation (premium)
- MEDICAL_CONVERSATION - Medical conversations (premium)
The SDK expects PCM16 audio:
- Format: Linear PCM (16-bit signed integers)
- Sample Rate: 16kHz recommended
- Channels: Mono
Please reach out to the AI team if there are essential reasons you need other formats.
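If your capture pipeline produces Float32 samples (as the Web Audio API does), they must be converted to 16-bit PCM before being sent. A minimal conversion sketch follows; it assumes the audio is already mono and resampled to 16 kHz, and the exact argument type sendAudio accepts (Buffer, Int16Array, etc.) is not specified here.

```typescript
// Convert Float32 samples in [-1, 1] to 16-bit signed PCM.
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp to avoid overflow
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// client.sendAudio(floatTo16BitPCM(chunk)); // hypothetical usage
```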
Error handling:

```typescript
import { isNormalDisconnection } from '@volley/recognition-client-sdk';

builder.onError(error => {
  console.error(`Error ${error.code}: ${error.message}`);
});

// Check disconnection type
builder.onDisconnected((code, reason) => {
  if (!isNormalDisconnection(code)) {
    console.error('Unexpected disconnect:', code);
  }
});
```
Troubleshooting:

WebSocket fails to connect
- Verify the recognition service is running
- Check the WebSocket URL format: ws:// or wss://
- Ensure network allows WebSocket connections
Authentication errors
- Verify audioUtteranceId is provided
- Check if service requires additional auth headers
No transcription results
- Confirm audio format is PCM16, 16kHz, mono
- Check if audio chunks are being sent (use onAudioSent callback)
- Verify audio data is not empty or corrupted
Poor transcription quality
- Try different models (e.g., NOVA_2 vs NOVA_2_GENERAL)
- Adjust language setting to match audio
- Ensure audio sample rate matches configuration
High latency
- Use smaller audio chunks (e.g., 100ms instead of 500ms); see the chunking sketch below
- Choose a model optimized for real-time (e.g., Deepgram Nova 2)
- Check network latency to service
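At 16 kHz mono with 2 bytes per sample, 100 ms of audio is 16,000 × 0.1 × 2 = 3,200 bytes, so a buffered recording can be streamed in small slices. A sketch (the exact type sendAudio expects is an assumption here):

```typescript
// 16,000 samples/s × 2 bytes/sample × 0.1 s = 3,200 bytes per 100 ms chunk
const CHUNK_BYTES = 3200;

function sendInSmallChunks(pcm16: Buffer): void {
  for (let offset = 0; offset < pcm16.length; offset += CHUNK_BYTES) {
    client.sendAudio(pcm16.subarray(offset, offset + CHUNK_BYTES)); // client from the examples above
  }
}
```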
Memory issues
- Call disconnect() when done to clean up resources
- Avoid keeping multiple client instances active
This package uses automated publishing via semantic-release with npm Trusted Publishers (OIDC).
After the first manual publish, configure npm Trusted Publishers:
1. Go to https://www.npmjs.com/package/@volley/recognition-client-sdk/access
2. Click "Add publisher" → Select "GitHub Actions"
3. Configure:
   - Organization: Volley-Inc
   - Repository: recognition-service
   - Workflow: sdk-release.yml
   - Environment: Leave empty (not required)
How it works:
- Automated releases: Push to dev branch triggers semantic-release
- Version bumping: Based on conventional commits (feat/fix/BREAKING CHANGE)
- No tokens needed: Uses OIDC authentication with npm
- Provenance: Automatic supply chain attestation
- Path filtering: Only releases when SDK or libs change
If needed for testing:
```bash
cd packages/client-sdk-ts
npm login --scope=@volley
pnpm build
npm publish --provenance --access public
```
This SDK is part of the Recognition Service monorepo. To contribute:
1. Make changes to SDK or libs
2. Test locally with pnpm test
3. Create PR to the dev branch with conventional commit messages (`feat:`, `fix:`, etc.)
4. After merge, automated workflow will publish new version to npm
Proprietary