Cross-platform voice-to-text dictation for Linux and macOS with GPU acceleration (NVIDIA CUDA/CoreML), Secretary Mode (60+ natural language commands), Context-Aware Meta-Learning, and pure Rust performance. Meta-package that automatically installs platfor
npm install swictationCommand-line interface for managing the Swictation voice dictation daemon.
NEW in v0.4.0: Swictation now fully supports Wayland-based Linux desktops with automated installation:
- ✅ GNOME Wayland (Ubuntu 25.10+) - Hotkeys auto-configured
- ✅ Sway - Text injection auto-setup
- ✅ GPU Acceleration - CUDA libraries auto-downloaded
- ✅ Service Management - Auto-enabled systemd service
One command installation:
``bash`
npm install -g swictation --foreground-scriptsEverything is configured automatically!
→ Complete Wayland Support Guide
- 🎤 Real-time voice transcription using Parakeet-TDT-1.1B (NVIDIA)
- 📝 Secretary Mode - 60+ natural language commands for dictation
- Say "comma" → Get ","
- Say "mr smith said quote hello quote" → Get "Mr. Smith said 'Hello'"
- Automatic capitalization, punctuation, numbers, quotes, formatting
- → Full Secretary Mode Guide
- 🔄 Smart text transformation - MidStream Rust library (~5µs latency)
- ⚡ Low latency - Pure Rust implementation with CUDA acceleration
- 🖥️ Wayland & X11 Support - Full dual display server support
- 🎯 Hotkey support - Auto-configured on GNOME, manual setup on Sway
- 📊 Real-time metrics - WPM, latency, GPU/CPU usage
- 🦀 Pure Rust daemon - Zero Python runtime dependencies
bash
# Ubuntu 24.04+
sudo apt-get install libasound2t64 # Ubuntu 22.04 and older
sudo apt-get install libasound2
`$3
- NVIDIA GPU: Any CUDA-capable GPU
- CUDA: 12.x or later
- VRAM: 4GB minimum (6GB+ recommended for 1.1B model)
- Driver: NVIDIA proprietary drivers (not nouveau)Note: GPU acceleration libraries (~2.3GB CUDA + cuDNN) are automatically downloaded during installation. These include CUDA runtime and cuDNN libraries optimized for your GPU architecture.
If GPU is not available, Swictation will run in CPU-only mode (slower transcription).
Installation
$3
Recommended: User-Local Installation (No sudo required)
#### If using nvm (Node Version Manager)
nvm already handles user-local installations automatically! No additional setup needed:
`bash
Just install (nvm manages the prefix)
npm install -g swictation --foreground-scripts
`The installation will automatically:
1. Download GPU acceleration libraries (~2.3GB, if NVIDIA GPU detected)
2. Install and configure ydotool (Wayland text injection)
3. Configure GNOME keyboard shortcuts (if on GNOME Wayland)
4. Set up and enable systemd service
5. Start the daemon service
Important: If you previously set a custom npm prefix, remove it:
`bash
npm config delete prefix
`#### If NOT using nvm
Configure npm to install packages to your home directory:
`bash
One-time setup: Configure npm to use user-local directory
mkdir -p ~/.npm-global
npm config set prefix ~/.npm-globalAdd to PATH (add this line to ~/.bashrc or ~/.zshrc)
export PATH="$HOME/.npm-global/bin:$PATH"Source your shell config to apply changes
source ~/.bashrc # or source ~/.zshrcNow install without sudo
npm install -g swictation --foreground-scripts
`$3
`bash
Install with visible output
sudo npm install -g swictation --foreground-scriptsOr standard install (output hidden by npm)
sudo npm install -g swictation
`Why user-local installation is recommended:
- ✅ No sudo required
- ✅ No permission issues
- ✅ Each user has their own installation
- ✅ Cleaner and more secure
- ✅ Works with nvm seamlessly
Note: The
--foreground-scripts flag makes the installation progress visible. The postinstall script will:
1. Download GPU acceleration libraries (~2.3GB CUDA + cuDNN, if NVIDIA GPU detected)
2. Install and configure ydotool (Wayland text injection)
3. Set up uinput kernel module and permissions
4. Configure GNOME keyboard shortcuts (if on GNOME Wayland)
5. Generate systemd service files with GPU library paths
6. Enable and start the daemon service
7. Test GPU model loading (if GPU detected)$3
You can customize the installation with environment variables:
`bash
Default: Standard install with GPU model testing (if GPU detected)
npm install -g swictation --foreground-scriptsSkip model testing for CI/headless environments (faster install)
SKIP_MODEL_TEST=1 npm install -g swictation --foreground-scripts
`Installation behavior:
- Default: Model test-loading runs automatically when GPU is detected (~30s to verify)
- SKIP_MODEL_TEST=1: Skips model testing entirely (useful for CI/automated environments)
Note: Model testing is recommended to ensure your GPU setup works correctly. It verifies that the daemon can load models with your CUDA installation.
Quick Start
GNOME Wayland (Ubuntu 25.10+): Everything is auto-configured! Just start using it:
`bash
Installation already configured everything, just start using:
swictation status # Check daemon is running
Press Super+Shift+D to toggle recording (already configured!)
`Sway/Other Compositors: Add hotkey binding to your config (see Hotkey Configuration section), then:
`bash
swictation status # Check daemon is running
Use your configured hotkey to toggle recording
`Manual Setup (if needed):
`bash
swictation setup # Configure systemd service
swictation start # Start the daemon
swictation toggle # Toggle recording via command
`Optional UI:
`bash
swictation start --ui # Launch with visual interface
`Commands
-
swictation start [--ui] - Start the daemon (and optionally the UI)
- swictation stop - Stop the daemon
- swictation status - Show service status
- swictation toggle - Toggle recording on/off
- swictation setup - Configure systemd and hotkeys
- swictation help - Show help messageDisplay Server Support
$3
- ✅ GNOME Wayland (Ubuntu 25.10+) - Keyboard shortcuts auto-configured
- ✅ Sway - Text injection auto-setup, manual hotkey configuration
- ✅ X11 - Traditional X11 support with xdotool$3
- OS: Linux x64
- Distribution: Ubuntu 24.04 LTS or newer (GLIBC 2.39+)
- ✅ Ubuntu 24.04 LTS (Noble Numbat)
- ✅ Ubuntu 25.10+ (Questing Quetzal) - Recommended for Wayland
- ✅ Debian 13+ (Trixie)
- ✅ Fedora 39+
- ❌ Ubuntu 22.04 LTS - NOT supported (GLIBC 2.35 too old)
- Node.js: 18.0.0 or higher
- Storage: ~12 GB total (AI models + GPU libraries)
- GPU: NVIDIA with 4GB+ VRAM (CUDA 12.x) or CPU-only mode$3
For optimal performance with the 1.1B model (62.8x realtime speed):
- GPU: NVIDIA GPU with 4GB+ VRAM
- CUDA: 11.8+ or 12.x
- Compute Capability: 7.0+ (Turing architecture or newer)
- ONNX Runtime: onnxruntime-gpu 1.16.0+$3
GPU acceleration libraries (CUDA + cuDNN) are automatically downloaded during npm installation. No manual CUDA installation required!$3
Automatically Installed by npm postinstall:
- ydotool (Wayland text injection) - Auto-installed on Wayland systems
- netcat (IPC socket communication) - Auto-installed
Pre-installed on most systems:
- GLIBC 2.39+ (Ubuntu 24.04+)
- libasound2 (ALSA sound library)
- systemd (service management)
- PipeWire or PulseAudio (audio capture)
For X11 systems (optional):
`bash
sudo apt install xdotool # Ubuntu/Debian
sudo pacman -S xdotool # Arch Linux
sudo dnf install xdotool # Fedora
`Configuration
Configuration file is located at
~/.config/swictation/config.toml$3
GNOME Wayland: ✅ Automatically configured during installation!
- Default hotkey:
Super+Shift+D
- Visible in: Settings → Keyboard → View and Customize Shortcuts
- No manual configuration neededSway - Add to
~/.config/sway/config:
`bash
Swictation toggle
bindsym $mod+Shift+d exec swictation toggleOptional: Push-to-talk (hold to record)
bindsym $mod+Space exec swictation ptt-press
bindsym --release $mod+Space exec swictation ptt-release
`
Then reload: swaymsg reloadX11 (i3, etc.) - Add to config:
`bash
bindsym Mod4+Shift+d exec swictation toggle
`→ See Complete Wayland Support Guide for detailed setup and troubleshooting.
Architecture
$3
Swictation consists of three main components:
1. Daemon (
swictation-daemon) - Rust service handling audio capture and transcription
2. UI (swictation-ui) - Tauri application for metrics and control
3. CLI (swictation) - Node.js command-line interface$3
Swictation uses a platform-specific package architecture similar to esbuild and swc:
`
swictation (main package)
├── @agidreams/linux-x64 (Linux x86_64 binaries)
└── @agidreams/darwin-arm64 (macOS Apple Silicon binaries)
`How it works:
1. Automatic Platform Detection: When you run
npm install -g swictation, npm automatically detects your platform (Linux x64 or macOS ARM64)2. Platform Package Installation: npm installs the main
swictation package PLUS the correct platform package for your system via optionalDependencies3. Binary Resolution: The CLI wrapper automatically finds and uses the binaries from your platform package
Example installation flow on Linux:
`bash
npm install -g swictation
npm automatically installs:
- swictation (main package with CLI)
- @agidreams/linux-x64 (platform binaries)
`What's in each package:
| Package | Contents |
|---------|----------|
|
swictation | CLI wrapper, postinstall scripts, configuration |
| @agidreams/linux-x64 | Linux ELF binaries, ONNX Runtime base library |
| @agidreams/darwin-arm64 | macOS Mach-O ARM64 binaries, ONNX Runtime with CoreML |Benefits:
- ✅ Smaller downloads - only your platform's binaries are installed
- ✅ Cleaner architecture - binaries grouped by platform
- ✅ Better CI/CD - each platform built independently
- ✅ Same simple install command -
npm install -g swictation just works$3
Swictation uses semantic versioning with a two-level approach:
1. Distribution Version (user-facing):
0.7.9
- The version you see in npm install swictation@0.7.9
- Synchronized across all three packages (main + both platform packages)
- Single version number for the entire distribution2. Component Versions (internal):
- Daemon:
0.7.5 (Rust binary version)
- UI: 0.1.0 (Tauri application version)
- ONNX Runtime: 1.22.0 (macOS), 1.23.2 (Linux GPU)Version Synchronization:
All package versions are automatically synchronized via
npm-package/versions.json:`json
{
"distribution": "0.7.9",
"components": {
"daemon": { "version": "0.7.5" },
"ui": { "version": "0.1.0" }
}
}
`Why separate versions?
- Components evolve at different rates (UI changes rarely, daemon changes often)
- Users see ONE simple version number for the distribution
- Developers track component versions internally for debugging
- Platform packages use distribution version for npm publishing
Checking versions:
`bash
swictation --version
Shows distribution version: 0.7.9
swictation-daemon --version
Shows daemon component version: 0.7.5
`Verification
After installation, verify everything is working:
`bash
Run comprehensive verification script
./node_modules/swictation/scripts/verify-installation.shOr check manually
swictation status # Check daemon status
systemctl --user status swictation-daemon.service
journalctl --user -u swictation-daemon -n 50 # View logs
`Expected on Wayland:
- ✓ Wayland detected
- ✓ GNOME/Sway detected (as applicable)
- ✓ GPU libraries installed (if NVIDIA GPU present)
- ✓ ydotool installed
- ✓ uinput module loaded
- ✓ User in 'input' group
- ✓ GNOME keyboard shortcut configured (if GNOME)
- ✓ Service enabled and running
- ✓ IPC socket responding
Troubleshooting
$3
Text not typing after transcription:
`bash
1. Verify ydotool permissions
sg input -c 'ydotool type "test"'2. Check group membership (may need logout/login)
groups | grep input3. Verify uinput permissions
ls -l /dev/uinput
Should show: crw-rw---- 1 root input
4. Re-run setup if needed
./node_modules/swictation/scripts/setup-ydotool.sh
`GNOME: Hotkey not working (types "D" instead):
`bash
Verify shortcut configuration
gsettings get org.gnome.settings-daemon.plugins.media-keys custom-keybindingsRe-configure if needed
./node_modules/swictation/scripts/setup-gnome-shortcuts.shCheck in Settings → Keyboard → View and Customize Shortcuts
Look for "Swictation Toggle"
`Slow transcription (20-30 seconds instead of <1 second):
This means CPU mode instead of GPU. Always start daemon via systemctl, not manually!
`bash
Restart via systemctl (has correct LD_LIBRARY_PATH)
systemctl --user restart swictation-daemon.serviceCheck logs for GPU detection
journalctl --user -u swictation-daemon.service | grep CUDA
Expected: "Successfully registered
CUDAExecutionProvider"
Expected: "cuDNN version: 91501"
`$3
#### "Platform package @agidreams/linux-x64 not found"
Cause: npm's optionalDependencies failed to install the platform package.
Solution:
`bash
Force reinstall with clean cache
npm install -g swictation --forceOr manually install platform package
npm install -g @agidreams/linux-x64
npm install -g swictation
`Verify installation:
`bash
Check that platform package is installed
npm list -g @agidreams/linux-x64 # Linux
npm list -g @agidreams/darwin-arm64 # macOSCheck binary locations
swictation --version
Should show platform package location
`#### "Unsupported platform: win32-x64"
Cause: Swictation currently supports Linux x64 and macOS ARM64 only.
Supported platforms:
- ✅ Linux x86_64 (Ubuntu 24.04+, Debian 13+, Fedora 39+)
- ✅ macOS Apple Silicon (ARM64)
- ❌ Windows (planned for future release)
- ❌ macOS Intel (planned for future release)
- ❌ Linux ARM64 (planned for future release)
#### "Platform package installed but binaries are missing"
Cause: Platform package build was incomplete or corrupted.
Solution:
`bash
Reinstall platform package
npm uninstall -g @agidreams/linux-x64
npm install -g @agidreams/linux-x64 --forceVerify binaries exist
ls -lh ~/.npm-global/lib/node_modules/@agidreams/linux-x64/bin/
Should show: swictation-daemon, swictation-ui
`#### "Wrong architecture - ELF 64-bit instead of Mach-O"
Cause: Wrong platform package installed (Linux package on macOS or vice versa).
Solution:
`bash
Remove wrong platform package
npm uninstall -g @agidreams/linux-x64
npm uninstall -g @agidreams/darwin-arm64Clean install with correct platform detection
npm install -g swictation --force
`$3
#### "Cannot see postinstall output"
Solution: Use the
--foreground-scripts flag:
`bash
sudo npm install -g swictation --foreground-scripts
`#### "GPU libraries download failed (404)"
Cause: GitHub release for this version doesn't exist yet.
Solution: This only happens with unreleased versions. Published npm versions will work.
#### "libasound.so.2: cannot open shared object file"
Cause: Missing ALSA library.
Solution:
`bash
Ubuntu 24.04+
sudo apt-get install libasound2t64Ubuntu 22.04 and older
sudo apt-get install libasound2
`#### "Model test-loading failed during installation"
Cause: GPU not detected or CUDA libraries missing.
Solution: This is non-fatal. Swictation will fall back to CPU mode. To verify GPU setup after installation:
`bash
Check if CUDA is available
nvidia-smiCheck CUDA libraries
ls /usr/local/cuda/lib64/
`$3
`bash
Check status
swictation statusView logs
journalctl --user -u swictation-daemon -f
`$3
`bash
List audio devices
arecord -lTest microphone
arecord -d 5 test.wav && aplay test.wav
`$3
- Wayland: See "Wayland-Specific Issues" section above
- X11: Ensure xdotool is installed
- Check logs: journalctl --user -u swictation-daemon -f$3
Only 1 audio device detected (should see 4+):
The systemd service should automatically import PulseAudio/PipeWire environment variables. Verify the service is running:
`bash
systemctl --user status swictation-daemon.service
`$3
Cause: VAD threshold too low - background noise exceeds threshold.
Solution: Adjust in
~/.config/swictation/config.toml:
`toml
vad_threshold = 0.25 # Default optimized for real-time
vad_min_silence = 0.8 # Seconds of silence before flush
`Lower threshold (0.003) causes continuous speech detection. Optimal is 0.25 for balanced sensitivity.
$3
Always reload systemd and restart the daemon:
`bash
systemctl --user daemon-reload
systemctl --user restart swictation-daemonVerify it started correctly
systemctl --user status swictation-daemon
journalctl --user -u swictation-daemon -n 50
`Platform Support
| Platform | Status | Notes |
|----------|--------|-------|
| Linux Wayland (GNOME) | ✅ Fully Supported | Auto-configured hotkeys, ydotool setup |
| Linux Wayland (Sway) | ✅ Fully Supported | Auto ydotool setup, manual hotkeys |
| Linux X11 | ✅ Supported | Traditional X11 support |
| NVIDIA GPU Acceleration | ✅ Supported | Auto-downloaded CUDA + cuDNN |
| Linux AMD GPU | 🚧 Planned | ROCm support |
| macOS (Apple Silicon/Intel) | 🚧 Planned | CoreML/Metal execution providers |
| Windows + NVIDIA | 🚧 Planned | CUDA support |
| Windows + AMD/Intel | 🚧 Planned | DirectML execution provider |
Development
Source code: https://github.com/yourusername/swictation
$3
`bash
Clone repository
git clone https://github.com/yourusername/swictation
cd swictationBuild Rust components
cd rust-crates
cargo build --releaseBuild Tauri UI
cd ../tauri-ui
npm install
npm run build
``MIT
Contributions are welcome! Please read our Contributing Guide for details.
- Wayland Support Guide - Complete Wayland setup and troubleshooting
- Secretary Mode Guide - Natural language commands reference
- Testing & Validation Guide - Display server testing procedures
- Issues: https://github.com/robertelee78/swictation/issues
- Discussions: https://github.com/robertelee78/swictation/discussions
- Source Code: https://github.com/robertelee78/swictation