A free, open-source library for embedding powerful AI models directly into your web applications. Run AI in the front end with complete data privacy and zero hosting costs. Compatible with major front-end frameworks such as React, Next.js, and Vue.


Run AI models directly in your users' browsers with zero server-side infrastructure.
📖 Documentation | 🎮 Playground | 🤗 Models Hub | 👨🏻‍💻 Discord Community
Axols WebAI.js is an open-source library that enables client-side AI inference directly in the browser. Built on top of Transformers.js, WebGPU, and ONNX Runtime, it eliminates the need for server-side AI model hosting and inference infrastructure.
- 💪 ONNX Model Hub: Access a curated pool of browser-optimized ONNX AI models
- 🌐 Pure Client-Side: Run AI models entirely in the browser
- 🔒 Privacy-First: Data never leaves the user's device
- 📦 Zero Backend Costs: No server infrastructure needed
- 🚀 Easy Setup: No headaches with packages - just one simple installation
- 🎯 Standardized API: Same interface across all models
- 🌊 Streaming Support: Real-time generation with streaming
- 🛠️ Framework Compatible: Works with React, Vue, Angular, Next.js, and more
```bash
npm install @axols/webai-js
```
All our Web AI models are standardized around the same four-step API lifecycle:
```javascript
import { WebAI } from '@axols/webai-js';

// Step 1: Create a WebAI instance
const webai = await WebAI.create({
  modelId: "llama-3.2-1b-instruct"
});

// Step 2: Initialize (downloads and loads the model)
await webai.init({
  mode: "auto", // Automatically selects the best configuration
  onDownloadProgress: (progress) => {
    console.log(`Download progress: ${progress.progress}%`);
  }
});

// Step 3: Generate
const result = await webai.generate({
  userInput: {
    messages: [
      {
        role: "user",
        content: "What is the history of AI?"
      },
    ],
  }
});
console.log(result);

// Step 4: Clean up when done
webai.terminate();
```
1. Create: Instantiate a WebAI object with a model ID
2. Initialize: Download (if needed) and load the model into memory
3. Generate: Run inference on user input
4. Terminate: Clean up resources when finished
Let WebAI automatically determine the best configuration based on device capabilities:
```javascript
await webai.init({
  mode: "auto",
  onDownloadProgress: (progress) => console.log(progress)
});
```
Control fallback behavior with custom priority configurations:
```javascript
await webai.init({
  mode: "auto",
  priorities: [
    { mode: "webai", precision: "q4", device: "webgpu" },
    { mode: "webai", precision: "q8", device: "webgpu" },
    { mode: "webai", precision: "q4", device: "wasm" },
    { mode: "cloud", precision: "", device: "" }
  ]
});
```
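Entries are tried in the order listed, so this example prefers 4-bit inference on WebGPU, then 8-bit on WebGPU, then 4-bit on WASM, and only falls back to a cloud endpoint if no local option works.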
For models that support streaming, provide real-time results:
```javascript
const generation = await webai.generateStream({
  userInput: "Tell me a story",
  onStream: (chunk) => {
    console.log(chunk); // Process each chunk as it arrives
  }
});
```
📚 For detailed API reference and usage examples, see the model-specific documentation
- ✅ Always wrap WebAI code in try/catch blocks (see the sketch after this list)
- ✅ Implement progress indicators during downloads
- ✅ Terminate instances when no longer needed
- ✅ Monitor device storage and memory usage
- ✅ Use streaming for better UX with long generations
- ✅ Test on target devices for performance validation
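A minimal sketch putting the first three practices together, using only the API calls shown above. `showProgress` and `showError` are hypothetical UI helpers you would supply yourself:

```javascript
import { WebAI } from '@axols/webai-js';

// Sketch only: showProgress and showError are hypothetical UI helpers.
async function runModel(prompt) {
  let webai;
  try {
    webai = await WebAI.create({ modelId: "llama-3.2-1b-instruct" });

    await webai.init({
      mode: "auto",
      // Keep the user informed while model weights download
      onDownloadProgress: (progress) => showProgress(progress.progress)
    });

    return await webai.generate({
      userInput: {
        messages: [{ role: "user", content: prompt }]
      }
    });
  } catch (error) {
    showError(error); // e.g. unsupported device, network failure, out of memory
    throw error;
  } finally {
    // Free model memory even if initialization or generation failed
    if (webai) webai.terminate();
  }
}
```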
WebAI.js works in all modern browsers that support:
- WebAssembly
- Web Workers
- WebGPU (recommended for best performance)
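If you want to gate the feature in your own UI before downloading a model, a rough capability check using standard web-platform APIs might look like this (plain feature detection, not a WebAI.js API; the "auto" init mode handles fallback for you):

```javascript
// Plain web-platform feature detection (not part of WebAI.js).
function checkBrowserSupport() {
  return {
    webAssembly: typeof WebAssembly === "object",
    webWorkers: typeof Worker !== "undefined",
    webGPU: typeof navigator !== "undefined" && "gpu" in navigator
  };
}

const support = checkBrowserSupport();
if (!support.webAssembly || !support.webWorkers) {
  console.warn("This browser cannot run WebAI.js models.");
} else if (!support.webGPU) {
  console.log("WebGPU unavailable; inference will fall back to WASM and be slower.");
}
```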
Join our growing community of developers building with WebAI.js!
- 💬 Discord - Get help, share projects, understand AI trends, and help shape the future of Web AI
- 💡 Discussions - Share ideas and feature requests
We welcome contributions! You're invited to add more AI models to our platform and contribute to the library.
👨🏻‍💻 We are currently working on our Contributing Guide. In the meantime, feel free to join our Discord to discuss how you can contribute!
🐛 For model-specific issues, bugs, or feature requests, please visit the model issues page.
Apache 2.0 - see LICENSE file for details
- Homepage
- Documentation
- Model Hub
- Playground
- Examples
- Discord Community
Built with:
- Transformers.js
- ONNX Runtime
- WebGPU
---
Made with ❤️ by Peng Zhang