# AIUI React SDK

Transform any React app into a voice-controllable interface.
## Installation

```bash
npm install @atik9157/aiui-react-sdk
```
### Audio Worklet Setup

**Important:** You must add two audio worklet processor files to your `public` folder for voice functionality:
#### 1. Create `public/player-processor.js`:
```javascript
/* Audio playback worklet */
class PlayerProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.queue = [];
    this.offset = 0;
    this.port.onmessage = e => this.queue.push(e.data);
  }

  process(_, outputs) {
    const out = outputs[0][0];
    let idx = 0;
    while (idx < out.length) {
      if (!this.queue.length) {
        out.fill(0, idx);
        break;
      }
      const buf = this.queue[0];
      const copy = Math.min(buf.length - this.offset, out.length - idx);
      out.set(buf.subarray(this.offset, this.offset + copy), idx);
      idx += copy;
      this.offset += copy;
      if (this.offset >= buf.length) {
        this.queue.shift();
        this.offset = 0;
      }
    }
    return true;
  }
}

registerProcessor('player-processor', PlayerProcessor);
```
#### 2. Create `public/worklet-processor.js`:
```javascript
/* Microphone worklet - captures and downsamples to 16kHz */
class MicProcessor extends AudioWorkletProcessor {
  constructor() {
    super();
    this.dstRate = 16_000;
    this.frameMs = 20;
    this.srcRate = sampleRate;
    this.ratio = this.srcRate / this.dstRate;
    this.samplesPerPacket = Math.round(this.dstRate * this.frameMs / 1_000);
    this.packet = new Int16Array(this.samplesPerPacket);
    this.pIndex = 0;
    this.acc = 0;
    this.seq = 0;
  }

  process(inputs) {
    const input = inputs[0];
    if (!input || !input[0]?.length) return true;
    const ch = input[0];
    for (let i = 0; i < ch.length; i++) {
      this.acc += 1;
      if (this.acc >= this.ratio) {
        // Clamp to [-1, 1] and convert to 16-bit PCM
        const s = Math.max(-1, Math.min(1, ch[i]));
        this.packet[this.pIndex++] = s < 0 ? s * 32768 : s * 32767;
        this.acc -= this.ratio;
        if (this.pIndex === this.packet.length) {
          this.port.postMessage(this.packet.buffer, [this.packet.buffer]);
          this.packet = new Int16Array(this.samplesPerPacket);
          this.pIndex = 0;
          this.seq++;
        }
      }
    }
    return true;
  }
}

registerProcessor("mic-processor", MicProcessor);
```
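As a sanity check on the sizing above: at a 16 kHz target rate with 20 ms frames, each posted packet holds 320 samples, i.e. 640 bytes of Int16 PCM.

```javascript
// Packet sizing used by the mic worklet above (arithmetic sketch, not SDK code).
const dstRate = 16_000; // target sample rate in Hz
const frameMs = 20;     // packet duration in milliseconds

// 16000 samples/s * 0.020 s = 320 samples per packet
const samplesPerPacket = Math.round(dstRate * frameMs / 1_000);

// Int16Array uses 2 bytes per sample, so each packet is 640 bytes on the wire.
const bytesPerPacket = samplesPerPacket * Int16Array.BYTES_PER_ELEMENT;

console.log(samplesPerPacket, bytesPerPacket); // 320 640
```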
Your project structure should look like:

```
your-app/
├── public/
│   ├── player-processor.js    ← Required for audio playback
│   ├── worklet-processor.js   ← Required for microphone input
│   └── index.html
├── src/
│   └── App.tsx
└── package.json
```
> ⚠️ Note: These worklet files must be in the `public` folder and served at the `/player-processor.js` and `/worklet-processor.js` URLs. The SDK loads them at runtime for audio processing.
## 🚀 Quick Start
### 1. Wrap Your App with the Provider
```tsx
import { AIUIProvider } from '@atik9157/aiui-react-sdk';
import type { AIUIConfig } from '@atik9157/aiui-react-sdk';

const config: AIUIConfig = {
  applicationId: 'my-awesome-app',
  serverUrl: 'wss://your-aiui-server.com',
  apiKey: 'your-api-key', // Optional
  pages: [
    {
      route: '/',
      title: 'Home',
      safeActions: ['click', 'set_value'],
    },
    {
      route: '/dashboard',
      title: 'Dashboard',
    }
  ]
};

function App() {
  return (
    <AIUIProvider config={config}>
      {/* Your app content */}
    </AIUIProvider>
  );
}
```
### 2. Use the Hook
```tsx
import { useAIUI } from '@atik9157/aiui-react-sdk';

function VoiceControlButton() {
  const { isConnected, isListening, startListening, stopListening } = useAIUI();

  return (
    <div>
      <p>Status: {isConnected ? '🟢 Connected' : '🔴 Disconnected'}</p>
      <button onClick={isListening ? stopListening : startListening}>
        {isListening ? 'Stop Listening' : 'Start Listening'}
      </button>
    </div>
  );
}
```
### 3. Give Voice Commands
Users can now say things like:
- "Click the submit button"
- "Fill in the email field with john@example.com"
- "Navigate to the dashboard page"
- "Select Marketing and Sales from the categories dropdown"
## 📋 Configuration
### AIUIConfig Options
| Property | Type | Required | Description |
|----------|------|----------|-------------|
| `applicationId` | `string` | ✅ | Unique identifier for your application |
| `serverUrl` | `string` | ✅ | WebSocket URL of your AIUI server |
| `apiKey` | `string` | ❌ | Authentication key for your server |
| `pages` | `MinimalPageConfig[]` | ✅ | Array of page configurations |
| `safetyRules` | `SafetyRules` | ❌ | Security and safety configurations |
| `privacy` | `PrivacyConfig` | ❌ | Privacy and data filtering rules |
### Page Configuration
```tsx
interface MinimalPageConfig {
  route: string;               // Page route (e.g., '/dashboard')
  title?: string;              // Page title for context
  safeActions?: string[];      // Allowed actions on this page
  dangerousActions?: string[]; // Actions requiring confirmation
}
```
### Safety Rules
Protect your users by restricting dangerous actions and sensitive areas:
```tsx
safetyRules: {
  requireConfirmation: ['delete', 'submit_payment', 'purchase'],
  blockedSelectors: ['.admin-only', '[data-sensitive]'],
  allowedDomains: ['yourapp.com', 'api.yourapp.com']
}
```
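One way to picture how `requireConfirmation` could gate an action. This is a hypothetical sketch of the concept, not the SDK's internal logic; `needsConfirmation` is an illustrative helper.

```javascript
// Hypothetical helper: does this action need user confirmation under the rules?
function needsConfirmation(action, safetyRules) {
  const list = (safetyRules && safetyRules.requireConfirmation) || [];
  return list.includes(action);
}

const safetyRules = {
  requireConfirmation: ['delete', 'submit_payment', 'purchase'],
};

console.log(needsConfirmation('delete', safetyRules)); // true
console.log(needsConfirmation('click', safetyRules));  // false
```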
### Privacy Configuration
Automatically redact sensitive information from context:
```tsx
privacy: {
  exposePasswords: false,
  exposeCreditCards: false,
  redactPatterns: ['ssn', 'social-security', 'credit-card']
}
```
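A rough sketch of what pattern-based redaction might look like. This is illustrative only (the SDK's actual matching rules are not documented here); `redactFields` is a hypothetical helper that masks values whose field names contain a pattern.

```javascript
// Illustrative redactor: masks values whose field names match redactPatterns.
function redactFields(fields, redactPatterns) {
  return Object.fromEntries(
    Object.entries(fields).map(([name, value]) => {
      const sensitive = redactPatterns.some((p) => name.toLowerCase().includes(p));
      return [name, sensitive ? '[REDACTED]' : value];
    })
  );
}

const out = redactFields(
  { email: 'john@example.com', 'credit-card-number': '4111 1111 1111 1111' },
  ['ssn', 'social-security', 'credit-card']
);
console.log(out); // { email: 'john@example.com', 'credit-card-number': '[REDACTED]' }
```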
## 🎯 Supported Actions
The SDK automatically detects and enables these actions on interactive elements:
| Action | Description | Example Voice Command |
|--------|-------------|----------------------|
| `click` | Click buttons, links, and interactive elements | "Click the submit button" |
| `set_value` | Set input/textarea values | "Set email to john@example.com" |
| `select_from_dropdown` | Select options from dropdowns | "Select Design and Marketing" |
| `toggle` | Toggle checkboxes | "Toggle the remember me checkbox" |
| `navigate` | Navigate between routes | "Navigate to dashboard" |
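To make the table concrete, here is a rough sketch of how such action commands could map onto DOM operations. This is an illustration of the concept, not the SDK's implementation; `applyAction` is a hypothetical dispatcher, and `select_from_dropdown` and `navigate` are omitted for brevity.

```javascript
// Illustrative dispatcher: apply an action command to a resolved element.
function applyAction(el, action, params = {}) {
  switch (action) {
    case 'click':
      el.click();
      return true;
    case 'set_value':
      el.value = params.value;
      return true;
    case 'toggle':
      el.checked = !el.checked;
      return true;
    default:
      return false; // unsupported in this sketch
  }
}

// Works against anything element-shaped, e.g. a stub:
const fake = { value: '', checked: false, clicked: false, click() { this.clicked = true; } };
applyAction(fake, 'set_value', { value: 'john@example.com' });
console.log(fake.value); // 'john@example.com'
```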
## 🔧 Advanced Usage
### Programmatic Actions
Execute actions programmatically without voice:
```tsx
import { useAIUI } from '@atik9157/aiui-react-sdk';

function MyComponent() {
  const { executeAction } = useAIUI();

  const handleCustomAction = async () => {
    try {
      const result = await executeAction('click', {
        semantic: 'submit button'
      });
      if (result.success) {
        console.log('Action executed successfully!');
      }
    } catch (error) {
      console.error('Action failed:', error);
    }
  };

  return <button onClick={handleCustomAction}>Run Action</button>;
}
```
### Targeting Duplicate Elements
When multiple identical elements exist (e.g., multiple "Delete" buttons), use index notation:
```tsx
// Using number notation
await executeAction('click', { semantic: 'Delete button #3' });

// Using ordinal words
await executeAction('click', { semantic: 'second delete button' });
```
Voice commands work the same way:
- "Click the third delete button"
- "Click delete button number 2"
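Conceptually, both notations reduce to a zero-based index over the matching elements. The sketch below shows one way that normalization could work; `indexFromSemantic` is a hypothetical helper, not the SDK's parser.

```javascript
// Hypothetical normalizer: extract a zero-based index from a semantic phrase.
function indexFromSemantic(semantic) {
  const ordinals = { first: 1, second: 2, third: 3, fourth: 4, fifth: 5 };

  const hash = semantic.match(/#(\d+)/); // "Delete button #3"
  if (hash) return Number(hash[1]) - 1;

  const num = semantic.match(/number (\d+)/i); // "delete button number 2"
  if (num) return Number(num[1]) - 1;

  const word = semantic.toLowerCase().match(/\b(first|second|third|fourth|fifth)\b/);
  if (word) return ordinals[word[1]] - 1; // "second delete button"

  return 0; // default to the first match
}

console.log(indexFromSemantic('Delete button #3'));       // 2
console.log(indexFromSemantic('second delete button'));   // 1
console.log(indexFromSemantic('delete button number 2')); // 1
```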
### Custom Dropdowns
For React Select or custom dropdowns, add semantic attributes to enable voice selection:
```tsx
<Select
  data-select-field="project-categories"
  data-select-options="Design|||Development|||Marketing|||Sales"
  placeholder="Select categories"
/>
```
Then users can say:
- "Select Design and Marketing from project categories"
- "Choose Development and Sales"
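Option strings use `|||` as the delimiter, presumably so labels containing commas or single pipes stay intact. Parsing is a plain string split; the matching step below is a hypothetical illustration of how spoken choices could be resolved against the list.

```javascript
// Parse the data-select-options attribute into an option list.
const raw = 'Design|||Development|||Marketing|||Sales';
const options = raw.split('|||');
console.log(options); // ['Design', 'Development', 'Marketing', 'Sales']

// Matching spoken choices against the options (illustrative, case-insensitive):
const spoken = ['design', 'marketing'];
const chosen = options.filter((o) => spoken.includes(o.toLowerCase()));
console.log(chosen); // ['Design', 'Marketing']
```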
### Reading Component Values
Read values from discovered elements:
```tsx
const { getComponentValue } = useAIUI();

const emailValue = getComponentValue('#email-input');
console.log('Email:', emailValue);
```
### Reacting to Page Changes
Track when the UI context changes:
```tsx
const { currentPage } = useAIUI();

useEffect(() => {
  console.log('Current page:', currentPage);
}, [currentPage]);
```
## 🎨 TypeScript Support
Full TypeScript definitions are included:
```tsx
import type {
  AIUIConfig,
  MinimalPageConfig,
  SafetyRules,
  PrivacyConfig,
  ActionType,
  ActionParams,
  ActionResult,
  ContextUpdate,
  DiscoveredElementState
} from '@atik9157/aiui-react-sdk';
```
## 🖥️ Development Mode
The SDK includes a development overlay showing real-time connection status:
```tsx
// Automatically shown when NODE_ENV === 'development'
// Displays:
// - AIUI connection status (🟢/🔴)
// - Microphone status (🎤/🔇)
// - Current page route
```
The overlay appears in the bottom-right corner and helps you debug connectivity issues during development.
## 📚 Examples
### E-commerce Store
```tsx
import { AIUIProvider, useAIUI } from '@atik9157/aiui-react-sdk';

const config = {
  applicationId: 'ecommerce-store',
  serverUrl: 'wss://aiui.mystore.com',
  pages: [
    { route: '/', title: 'Home' },
    { route: '/products', title: 'Products' },
    { route: '/cart', title: 'Shopping Cart' },
    { route: '/checkout', title: 'Checkout' }
  ],
  safetyRules: {
    requireConfirmation: ['place_order', 'delete_account'],
    blockedSelectors: ['.admin-panel']
  },
  privacy: {
    exposePasswords: false,
    exposeCreditCards: false
  }
};

function App() {
  return (
    <AIUIProvider config={config}>
      {/* Store components */}
    </AIUIProvider>
  );
}
```
Voice commands your users can use:
- "Add the blue shirt to cart"
- "Navigate to checkout"
- "Fill shipping address with 123 Main St"
- "Select express shipping"
### Multi-Select Dropdowns
```tsx
<Select
  data-select-field="categories"
  data-select-options="Frontend|||Backend|||DevOps|||Design|||Marketing"
  placeholder="Select categories..."
/>

<Select
  data-select-field="status"
  data-select-options="Active|||Pending|||Completed|||Archived"
  placeholder="Select status..."
/>
```
Voice commands:
- "Select Frontend and Backend from categories"
- "Choose Active from status"
- "Select Design, Marketing and DevOps"
### Contact Form
```tsx
<form>
  <input name="fullName" placeholder="Full Name" />
  <input name="email" type="email" placeholder="Email" />
  <textarea name="message" placeholder="Message" />
  <button type="submit">Submit Form</button>
</form>
```
Voice commands:
- "Set full name to John Smith"
- "Fill email with john@example.com"
- "Set message to I would like more information"
- "Click submit form"
## 🏗️ Architecture Overview
The SDK is the browser-side component that communicates with your backend server:
`
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
ā Your React Application (Client) ā
ā (@atik9157/aiui-react-sdk) ā
ā ā
ā ā Discovers interactive elements automatically ā
ā ā Streams UI context to your server ā
ā ā Executes actions from server commands ā
ā ā Handles voice audio I/O ā
āāāāāāāāāāāāāāāāāāāā¬āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
ā
ā AIUI Protocol (WebSocket)
ā ⢠Real-time context updates
ā ⢠Action commands
ā ⢠Bidirectional audio
ā
ā¼
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
ā Your AIUI Server (Backend) ā
ā (You need to implement this) ā
ā ā
ā ā Receives UI context from SDK ā
ā ā Processes voice with STT (Speech-to-Text) ā
ā ā Uses AI to understand commands and context ā
ā ā Sends action commands back to SDK ā
ā ā Generates voice responses with TTS ā
ā ā
ā Works with: OpenAI, Claude, Gemini, Ollama, etc. ā
āāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāāā
`
## 🔌 Backend Server Requirements
Your backend server needs to implement the AIUI Protocol - a WebSocket-based protocol for real-time UI control. Here's what you need:
### WebSocket Endpoints
1. `/context` endpoint - For UI context and action commands
   - Receives UI state updates from the SDK
   - Sends action commands back to the SDK
   - Query params: `applicationId` (required), `apiKey` (optional)
2. `/audio` endpoint - For voice interaction
   - Receives microphone audio: 16kHz PCM `Int16Array`
   - Sends playback audio: 24kHz PCM `Int16Array`
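Both endpoints carry raw PCM, so a server or test harness needs to convert between Web Audio's Float32 samples and the Int16 wire format. The sketch below shows both directions, using the same asymmetric scaling as the mic worklet shown earlier; it is an illustration, not SDK code.

```javascript
// Float32 [-1, 1] -> Int16 PCM (the format the SDK sends to /audio).
function floatToInt16(float32) {
  const out = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    const s = Math.max(-1, Math.min(1, float32[i])); // clamp
    out[i] = s < 0 ? s * 32768 : s * 32767;
  }
  return out;
}

// Int16 PCM -> Float32 (what a player does with /audio playback data).
function int16ToFloat(int16) {
  const out = new Float32Array(int16.length);
  for (let i = 0; i < int16.length; i++) {
    out[i] = int16[i] / (int16[i] < 0 ? 32768 : 32767);
  }
  return out;
}

const pcm = floatToInt16(Float32Array.from([0, 0.5, -1, 1]));
console.log(Array.from(pcm)); // [0, 16383, -32768, 32767]
```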
### Messages from SDK to Server
**`context_update`** - Complete UI state
```javascript
{
  type: 'context_update',
  page: { route: '/dashboard', title: 'Dashboard' },
  elements: [
    {
      semantic: 'Submit button',
      type: 'button',
      actions: ['click'],
      selector: 'button.submit:nth-of-type(1)'
    }
    // ... more elements
  ],
  viewport: { width: 1920, height: 1080 }
}
```
**`context_append`** - New elements added (e.g., modal opened)
```javascript
{
  type: 'context_append',
  elements: [/* only new elements */]
}
```
### Messages from Server to SDK
**`action`** - Command the SDK to perform an action
```javascript
{
  type: 'action',
  action: 'click',
  params: { semantic: 'submit button' }
}
```
### Minimal Server Skeleton (Node.js)
```javascript
const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', (ws, req) => {
  const path = new URL(req.url, 'ws://localhost').pathname;

  if (path === '/context') {
    ws.on('message', async (data) => {
      const msg = JSON.parse(data);

      if (msg.type === 'context_update') {
        // 1. Get the UI context (page, elements, viewport)
        const context = msg;

        // 2. Process with your AI (OpenAI, Claude, etc.)
        const action = await processWithAI(context, userCommand);

        // 3. Send the action back to the SDK
        ws.send(JSON.stringify({
          type: 'action',
          action: 'click',
          params: { semantic: 'submit button' }
        }));
      }
    });
  }
});
```
For complete protocol documentation and server examples, see the AIUI Protocol Specification.
## 🌐 Browser Compatibility
| Browser | Status | Notes |
|---------|--------|-------|
| Chrome/Edge | ✅ Full support | Recommended |
| Firefox | ✅ Full support | |
| Safari | ✅ Full support | Uses webkit prefix for AudioContext |
| Mobile browsers | ✅ Supported | Microphone permissions required |
**Minimum Requirements:**
- Modern browser with WebSocket support
- Web Audio API support
- Microphone access (for voice features)
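When in doubt, feature-detect before enabling voice. The sketch below is illustrative (not part of the SDK); the `windowLike` parameter exists only so the check is testable outside a browser, where you would pass `window`.

```javascript
// Sketch: check the minimum browser features before enabling voice.
function supportsVoiceFeatures(windowLike) {
  const hasWebSocket = typeof windowLike.WebSocket === 'function';
  // Older Safari exposed AudioContext under the webkit prefix.
  const hasAudio =
    typeof windowLike.AudioContext === 'function' ||
    typeof windowLike.webkitAudioContext === 'function';
  const hasMic = !!(windowLike.navigator &&
    windowLike.navigator.mediaDevices &&
    windowLike.navigator.mediaDevices.getUserMedia);
  return hasWebSocket && hasAudio && hasMic;
}

// Example with a stubbed environment:
const stub = {
  WebSocket: function () {},
  webkitAudioContext: function () {},
  navigator: { mediaDevices: { getUserMedia: async () => {} } },
};
console.log(supportsVoiceFeatures(stub)); // true
```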
## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request