A local semantic grep tool using Qwen3 embeddings via Ollama and LanceDB
A fully local semantic grep tool for code search using Qwen3 embeddings via Ollama and LanceDB.
wb-grep brings the power of semantic code search to your local machine without requiring any cloud services or API keys. Search your codebase using natural language queries, find related code by meaning rather than exact text matches, and keep your code indexed automatically as you work.
Install now: npm install -g wb-grep
---
> Note: wb-grep is a derivative work based on mgrep
> by Mixedbread AI, licensed under Apache-2.0. This project replaces
> the cloud-based embedding and vector storage with local alternatives (Ollama + LanceDB)
> while preserving the core CLI design and architecture.
---
Traditional grep is an invaluable tool, but it requires you to know exactly what you're looking for. When exploring unfamiliar codebases, debugging complex issues, or trying to understand how features are implemented, you often need to search by intent rather than exact patterns.
wb-grep solves this by:
- Understanding meaning: Search for "authentication logic" and find the actual auth implementation, even if it's called verifyCredentials or checkUserSession
- Running 100% locally: All embeddings and vector storage happen on your machine using Ollama and LanceDB—no cloud services, no API costs, no data leaving your system
- Staying up-to-date: Watch mode automatically re-indexes files as you edit them
- Being agent-friendly: Designed to work seamlessly with coding agents, providing quiet output and thoughtful defaults
wb-grep uses a three-stage pipeline:
1. Chunking: Source files are intelligently split into semantic chunks (functions, classes, logical blocks) that preserve context
2. Embedding: Each chunk is converted into a 1024-dimensional vector using the Qwen3-Embedding-0.6B model running locally via Ollama
3. Vector Search: Queries are embedded the same way, then LanceDB finds the most similar code chunks using approximate nearest neighbor search
The result is a search experience that understands what you mean, not just what you type.
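The ranking stage can be illustrated with a small brute-force sketch. It assumes chunk vectors have already been computed; the `IndexedChunk` type and the `topK` function are hypothetical names, and an exact linear scan with cosine similarity stands in for the approximate nearest neighbor search that LanceDB performs.

```typescript
// Hypothetical shape of an indexed chunk; wb-grep's real schema may differ.
interface IndexedChunk {
  file: string;
  startLine: number;
  endLine: number;
  vector: number[];
}

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exact linear scan; a stand-in for the ANN search a vector database performs.
function topK(query: number[], chunks: IndexedChunk[], k: number) {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy 3-dimensional vectors instead of real 1024-dimensional embeddings.
const chunks: IndexedChunk[] = [
  { file: "src/auth.ts", startLine: 1, endLine: 20, vector: [1, 0, 0] },
  { file: "src/db.ts", startLine: 5, endLine: 30, vector: [0, 1, 0] },
  { file: "src/session.ts", startLine: 2, endLine: 18, vector: [0.9, 0.1, 0] },
];
const results = topK([1, 0, 0], chunks, 2);
console.log(results.map((r) => `${r.chunk.file} ${r.score.toFixed(3)}`));
```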
---
| Feature | grep/ripgrep | Cloud Semantic Search | wb-grep |
|---------|--------------|----------------------|-------------|
| Exact pattern matching | ✅ | ✅ | ✅ |
| Natural language queries | ❌ | ✅ | ✅ |
| Works offline | ✅ | ❌ | ✅ |
| No API costs | ✅ | ❌ | ✅ |
| Data stays local | ✅ | ❌ | ✅ |
| Automatic re-indexing | ❌ | ✅ | ✅ |
| AST-aware chunking | ❌ | ✅ | ✅ |
Use grep for: exact symbol tracing, regex patterns, refactoring known identifiers
Use wb-grep for: code exploration, feature discovery, understanding unfamiliar codebases, natural language queries
---
1. Node.js 18+ (for running wb-grep)
2. Ollama (for local embeddings)
Recommended: Install from npm (one-liner)
```bash
npm install -g wb-grep
```
Alternative: Install from source (for development)
```bash
git clone https://github.com/wb200/wb-grep.git
cd wb-grep
npm install
npm run build
npm link  # Makes 'wb-grep' available globally
```
Verify installation:
```bash
wb-grep --version
```
Install Ollama (macOS):
```bash
brew install ollama
```
Verify Ollama is running:
```bash
curl http://localhost:11434/api/tags
```
Quick Start
```bash
# Navigate to any codebase
cd /path/to/your/project

# Index the repository (runs once, then watches for changes)
wb-grep watch

# Search using natural language
wb-grep "where is authentication handled"
wb-grep "database connection setup"
wb-grep "error handling patterns"
```
---
Factory Droid Integration
wb-grep integrates seamlessly with Factory Droid to provide semantic code search capabilities within your AI coding workflows.
Installation
```bash
# Run the install command to set up wb-grep for Droid
wb-grep install-droid
```
This command:
- Verifies Ollama connectivity and embedding model availability
- Checks if your repository is indexed
- Installs wb-grep as a Droid plugin with hooks and skills
- Enables automatic watch mode when Droid sessions start

How It Works
Once installed, wb-grep integrates with Droid via:
1. Hooks - Automatically starts/stops `wb-grep watch` at session boundaries:
   - SessionStart: Initializes `wb-grep watch` in the background
   - SessionEnd: Cleanly terminates the watch process
2. Skills - Two complementary capabilities:
   - wb-grep skill: Quick reference for using semantic search
   - advanced-grep skill: Comprehensive decision framework for choosing the right search tool (wb-grep, Grep, or ast-grep)
Usage
Within a Droid session, you can leverage semantic search in several ways:
```bash
# Droid can invoke wb-grep directly
droid> Search for the authentication middleware

# Or use the advanced-grep skill for optimal search strategy
droid> Where should I look for rate limiting?
→ advanced-grep skill recommends: wb-grep "rate limiting implementation"

# Or run semantic queries via Execute tool
droid> /exec wb-grep "session management"
```
Plugin Structure
```
~/.factory/plugins/wb-grep/
├── hooks/
│   ├── hook.json                # Hook configuration
│   ├── wb_grep_watch.py         # Start watch process
│   └── wb_grep_watch_kill.py    # Clean shutdown
├── skills/
│   └── wb-grep/
│       └── SKILL.md             # Quick reference skill
└── plugin.json                  # Plugin metadata
```
Requirements
- wb-grep binary installed globally (via `npm link`)
- Ollama running and accessible at http://localhost:11434
- Embedding model available: qwen3-embedding:0.6b
- Factory Droid CLI installed

Troubleshooting the Droid Integration
"Plugin not found"
```bash
# Verify installation
wb-grep install-droid --verify

# Reinstall if needed
wb-grep install-droid --force
```
"Hooks failing"
```bash
# Check hook logs
cat ~/.factory/hooks/debug.log | grep wb-grep

# Verify Ollama connectivity
curl http://localhost:11434/api/tags
```
"Watch mode not starting"
```bash
# Test hook manually
python3 ~/.factory/plugins/wb-grep/hooks/wb_grep_watch.py

# Check if already running
pgrep -f "wb-grep watch"
```
---
Commands
wb-grep search
Search for code using natural language queries. This is the default command, so you can omit `search`.
```bash
# Basic search
wb-grep "function that validates user input"

# Search with path filter
wb-grep "API endpoints" src/routes

# Show more results
wb-grep -m 20 "logging configuration"

# Include code snippets in output
wb-grep -c "authentication middleware"
```
| Option | Description | Default |
|--------|-------------|---------|
| `-m, --max-count` | Maximum number of results | 10 |
| `-c, --content` | Show code snippets in results | false |

Output Format:
```
./src/lib/auth.ts:45-67 (85.2%)
./src/middleware/session.ts:12-28 (73.8%)
./src/utils/jwt.ts:5-22 (68.4%)
```
The percentage indicates semantic similarity: higher means more relevant.
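For scripts or agents that consume this output, the result lines can be parsed into structured records. The following is a minimal sketch that assumes the default output (without `-c`) matches the `path:start-end (NN.N%)` shape shown above; `SearchHit`, `parseResultLine`, and `parseResults` are hypothetical names, not part of wb-grep.

```typescript
// Hypothetical parsed form of one result line.
interface SearchHit {
  path: string;
  startLine: number;
  endLine: number;
  similarity: number; // percentage, 0 to 100
}

// Matches lines such as "./src/lib/auth.ts:45-67 (85.2%)".
const RESULT_LINE = /^(.+):(\d+)-(\d+) \((\d+(?:\.\d+)?)%\)$/;

// Returns null for any line that does not look like a result.
function parseResultLine(line: string): SearchHit | null {
  const match = RESULT_LINE.exec(line.trim());
  if (!match) return null;
  return {
    path: match[1],
    startLine: Number(match[2]),
    endLine: Number(match[3]),
    similarity: Number(match[4]),
  };
}

// Parses multi-line output and keeps hits at or above a similarity threshold.
function parseResults(output: string, minSimilarity = 0): SearchHit[] {
  return output
    .split("\n")
    .map((line) => parseResultLine(line))
    .filter((hit): hit is SearchHit => hit !== null && hit.similarity >= minSimilarity);
}

const sample = [
  "./src/lib/auth.ts:45-67 (85.2%)",
  "./src/middleware/session.ts:12-28 (73.8%)",
  "./src/utils/jwt.ts:5-22 (68.4%)",
].join("\n");
const strong = parseResults(sample, 70);
console.log(strong.map((hit) => hit.path));
```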
---
wb-grep watch
Index the repository and keep it up-to-date as files change.
```bash
# Start watching (indexes first, then monitors changes)
wb-grep watch

# Dry run: show what would be indexed without actually indexing
wb-grep watch --dry-run
```
| Option | Description |
|--------|-------------|
| `-d, --dry-run` | Preview files without indexing |

What gets indexed:
- Source code files (`.ts`, `.js`, `.py`, `.go`, `.rs`, `.java`, etc.)
- Documentation (`.md`, `.mdx`, `.txt`)
- Configuration files (`.json`, `.yaml`, `.toml`, `.xml`)
- Shell scripts (`.sh`, `.bash`, `.zsh`)
- And 50+ other file types

What gets ignored:
- Files matching `.gitignore` patterns
- Files matching `.wbgrepignore` patterns (additional exclusions)
- Binary files, lock files, build outputs
- `node_modules`, `.git`, `dist`, `build` directories
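The include and exclude rules for watch mode can be approximated with a simple predicate. This is a simplified sketch, not wb-grep's actual filter: it covers only the extensions and directories named in this README, does not read `.gitignore` or `.wbgrepignore`, does not detect lock files, and assumes a size cap equal to the `maxFileSize` value from the example configuration file. `shouldIndex` is a hypothetical name.

```typescript
// Extensions named in this README; the real tool recognizes many more.
const INDEXED_EXTENSIONS = new Set([
  ".ts", ".js", ".py", ".go", ".rs", ".java",
  ".md", ".mdx", ".txt",
  ".json", ".yaml", ".toml", ".xml",
  ".sh", ".bash", ".zsh",
]);

// Directories this README lists as always ignored.
const IGNORED_DIRS = new Set(["node_modules", ".git", "dist", "build"]);

// Assumed size cap in bytes (1 MiB); an illustrative choice.
const MAX_FILE_SIZE = 1048576;

function shouldIndex(relativePath: string, sizeBytes: number): boolean {
  const segments = relativePath.split("/").filter((s) => s !== "" && s !== ".");
  const fileName = segments[segments.length - 1] ?? "";
  const dirs = segments.slice(0, -1);

  // Skip anything located under an ignored directory.
  if (dirs.some((dir) => IGNORED_DIRS.has(dir))) return false;
  // Skip files above the size cap.
  if (sizeBytes > MAX_FILE_SIZE) return false;

  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return false; // no extension, or a dotfile such as ".env"
  return INDEXED_EXTENSIONS.has(fileName.slice(dot).toLowerCase());
}

console.log(shouldIndex("src/lib/auth.ts", 2048));
console.log(shouldIndex("node_modules/left-pad/index.js", 512));
```

A predicate like this is cheap enough to run on every file system event before any embedding work is scheduled.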

---
wb-grep index
One-shot indexing without file watching. Useful for CI/CD or when you don't need continuous updates.
```bash
# Index current directory
wb-grep index

# Index a specific path
wb-grep index --path /path/to/project

# Clear existing index and rebuild from scratch
wb-grep index --clear
```
| Option | Description |
|--------|-------------|
| `-c, --clear` | Clear existing index before indexing |
| `-p, --path` | Path to index (defaults to cwd) |

---
wb-grep status
Show index statistics and system status.
```bash
# Basic status
wb-grep status

# Detailed status with file list
wb-grep status --verbose
```
| Option | Description |
|--------|-------------|
| `-v, --verbose` | Show detailed information including indexed files |

Example Output:
```
📊 wb-grep Status

Index Statistics:
Files indexed: 142
Total chunks: 1,847
Last sync: 2024-01-15T10:32:45.000Z
Vector Store:
Unique files: 142
Total vectors: 1,847
Ollama Status:
Connected: yes
Model available: yes
Model: qwen3-embedding:0.6b
URL: http://localhost:11434
```
---
wb-grep clear
Remove all indexed data and start fresh.
```bash
# Show warning (requires --force to actually clear)
wb-grep clear

# Actually clear the index
wb-grep clear --force
```
| Option | Description |
|--------|-------------|
| `-f, --force` | Required to confirm deletion |

---
Configuration
wb-grep can be configured via configuration files or environment variables.
Configuration File
Create one of these files in your project root:
- `.wbgreprc`
- `.wbgreprc.json`
- `wbgrep.config.json`

Example `.wbgreprc.json`:
```json
{
  "ollama": {
    "baseURL": "http://localhost:11434",
    "model": "qwen3-embedding:0.6b",
    "timeout": 30000,
    "retries": 3
  },
  "indexing": {
    "batchSize": 10,
    "maxFileSize": 1048576,
    "concurrency": 8
  },
  "search": {
    "maxResults": 10,
    "showContent": false
  },
  "ignore": [
    "*.generated.ts",
    "vendor/**"
  ]
}
```
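A partial configuration file is easier to reason about when it is merged over a full set of defaults. The sketch below mirrors the fields of the example file; the `WbGrepConfig` and `PartialConfig` types and the `mergeConfig` function are hypothetical, and the default values for `maxFileSize` and `ignore` are assumptions taken from the example rather than documented defaults.

```typescript
// Hypothetical types mirroring the example file; not taken from wb-grep's source.
interface WbGrepConfig {
  ollama: { baseURL: string; model: string; timeout: number; retries: number };
  indexing: { batchSize: number; maxFileSize: number; concurrency: number };
  search: { maxResults: number; showContent: boolean };
  ignore: string[];
}

interface PartialConfig {
  ollama?: Partial<WbGrepConfig["ollama"]>;
  indexing?: Partial<WbGrepConfig["indexing"]>;
  search?: Partial<WbGrepConfig["search"]>;
  ignore?: string[];
}

// Values copied from the example file; maxFileSize and ignore are assumed defaults.
const DEFAULTS: WbGrepConfig = {
  ollama: {
    baseURL: "http://localhost:11434",
    model: "qwen3-embedding:0.6b",
    timeout: 30000,
    retries: 3,
  },
  indexing: { batchSize: 10, maxFileSize: 1048576, concurrency: 8 },
  search: { maxResults: 10, showContent: false },
  ignore: [],
};

// Section-by-section merge: user values win, missing keys fall back to defaults.
function mergeConfig(user: PartialConfig): WbGrepConfig {
  return {
    ollama: { ...DEFAULTS.ollama, ...user.ollama },
    indexing: { ...DEFAULTS.indexing, ...user.indexing },
    search: { ...DEFAULTS.search, ...user.search },
    ignore: user.ignore ?? DEFAULTS.ignore,
  };
}

const fileContents = '{"search": {"maxResults": 25}, "ignore": ["vendor/**"]}';
const config = mergeConfig(JSON.parse(fileContents) as PartialConfig);
console.log(config.search.maxResults, config.ollama.model);
```

Merging per section rather than with a single top-level spread keeps a file that sets only `search.maxResults` from wiping out `search.showContent`.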
Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `WBGREP_OLLAMA_URL` | Ollama server URL | http://localhost:11434 |
| `WBGREP_OLLAMA_MODEL` | Embedding model name | qwen3-embedding:0.6b |
| `WBGREP_OLLAMA_TIMEOUT` | Request timeout (ms) | 30000 |
| `WBGREP_OLLAMA_RETRIES` | Number of retries | 3 |
| `WBGREP_MAX_COUNT` | Default max results | 10 |
| `WBGREP_CONTENT` | Show content by default | false |
| `WBGREP_BATCH_SIZE` | Indexing batch size | 10 |
| `WBGREP_CONCURRENCY` | Embedding concurrency | 8 |
| `WBGREP_LOG_LEVEL` | Log level (debug/info/warn/error) | info |

Example:
```bash
export WBGREP_MAX_COUNT=25
export WBGREP_CONTENT=true
wb-grep "authentication"
```
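Environment variables arrive as strings, so numeric and boolean settings need parsing with a fallback. The following sketch shows one way to do that for a few of the variables in the table; the helper names are hypothetical, and the accepted boolean spellings (`true` and `1`) are an assumption, since the README does not specify them.

```typescript
type Env = Record<string, string | undefined>;

// Parse an integer variable, falling back when it is unset or invalid.
function envInt(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number.parseInt(raw, 10);
  return Number.isNaN(value) ? fallback : value;
}

// Treat "true" and "1" as true; this accepted set is an assumption.
function envBool(env: Env, name: string, fallback: boolean): boolean {
  const raw = env[name];
  if (raw === undefined) return fallback;
  return raw === "true" || raw === "1";
}

// Reads a subset of the documented variables with their documented defaults.
function searchSettingsFromEnv(env: Env) {
  return {
    maxResults: envInt(env, "WBGREP_MAX_COUNT", 10),
    showContent: envBool(env, "WBGREP_CONTENT", false),
    ollamaUrl: env["WBGREP_OLLAMA_URL"] ?? "http://localhost:11434",
    timeoutMs: envInt(env, "WBGREP_OLLAMA_TIMEOUT", 30000),
  };
}

const settings = searchSettingsFromEnv({
  WBGREP_MAX_COUNT: "25",
  WBGREP_CONTENT: "true",
});
console.log(settings);
```

In a real process, `process.env` would be passed in place of the literal object; taking the environment as a parameter keeps the parsing easy to test.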
Ignore File
Create a `.wbgrepignore` file in your project root to exclude additional files:
```gitignore
# Exclude generated files
*.generated.ts
*.g.dart

# Exclude specific directories
legacy/**
experiments/**

# Exclude large data files
*.csv
*.parquet
```
The syntax follows `.gitignore` conventions.

---
Examples
Exploring a New Codebase
```bash
# Get an overview of the architecture
wb-grep "main entry point"
wb-grep "application initialization"
wb-grep "routing configuration"

# Find specific functionality
wb-grep "user authentication flow"
wb-grep "database migrations"
wb-grep "API rate limiting"

# Understand patterns
wb-grep "error handling patterns"
wb-grep "logging implementation"
wb-grep "dependency injection"
```
Debugging
```bash
# Find error-related code
wb-grep "where errors are thrown"
wb-grep "exception handling for network requests"

# Trace data flow
wb-grep "where user data is saved"
wb-grep "session storage implementation"
```
Code Review
```bash
# Find security-sensitive code
wb-grep "password hashing"
wb-grep "SQL query construction"
wb-grep "file upload handling"

# Check for patterns
wb-grep "deprecated API usage"
wb-grep "TODO comments about security"
```
Searching Specific Directories
```bash
# Search only in specific directories
wb-grep "validation logic" src/validators
wb-grep "React hooks" src/components
wb-grep "test utilities" tests/

# Search across multiple areas
wb-grep "configuration parsing" src/config
```
Showing Code Snippets
```bash
# Get code snippets with results
wb-grep -c "middleware chain"

# Output:
./src/middleware/index.ts:15-32 (89.3%)
  export function createMiddlewareChain(middlewares: Middleware[]) {
    return async (ctx: Context, next: NextFunction) => {
      // Start at -1 so the first dispatch(0) call is not rejected
      let index = -1;
      const dispatch = async (i: number): Promise<void> => {
        if (i <= index) throw new Error('next() called multiple times');
        index = i;
        const fn = middlewares[i];
        if (!fn) return next();
        await fn(ctx, () => dispatch(i + 1));
      };
      return dispatch(0);
    };
  }
```
---
Technical Details
Embedding Model
wb-grep uses Qwen3-Embedding-0.6B, a compact but powerful embedding model:
- Dimensions: 1024
- Context Length: 32K tokens
- Size: ~600MB
- Languages: Multilingual support
The model runs locally via Ollama, ensuring your code never leaves your machine.
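To make the embedding step concrete, the sketch below builds a request for Ollama's `/api/embed` endpoint and validates the response shape. The helper names are hypothetical and this is not wb-grep's actual client; it assumes the endpoint accepts `{ model, input }` and returns an `embeddings` array with one vector per input, and it checks the 1024 dimensions stated above.

```typescript
interface EmbedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the HTTP request for a batch of texts (pure function, no network).
function buildEmbedRequest(
  texts: string[],
  baseURL = "http://localhost:11434",
  model = "qwen3-embedding:0.6b",
): EmbedRequest {
  return {
    url: `${baseURL.replace(/\/+$/, "")}/api/embed`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, input: texts }),
    },
  };
}

// Validate the response body and check every vector has the expected size.
function parseEmbedResponse(body: unknown, expectedCount: number, dimensions = 1024): number[][] {
  if (typeof body !== "object" || body === null) {
    throw new Error("response body is not an object");
  }
  const embeddings = (body as { embeddings?: unknown }).embeddings;
  if (!Array.isArray(embeddings) || embeddings.length !== expectedCount) {
    throw new Error("unexpected number of embeddings in response");
  }
  for (const vector of embeddings) {
    if (!Array.isArray(vector) || vector.length !== dimensions) {
      throw new Error(`expected ${dimensions}-dimensional vectors`);
    }
  }
  return embeddings as number[][];
}

// Network call, shown for completeness; requires a running Ollama server.
async function embed(texts: string[]): Promise<number[][]> {
  const { url, init } = buildEmbedRequest(texts);
  const response = await fetch(url, init);
  if (!response.ok) throw new Error(`Ollama returned HTTP ${response.status}`);
  return parseEmbedResponse(await response.json(), texts.length);
}
```

Checking the vector length up front surfaces a mismatched model early, before vectors of the wrong size reach the vector store.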
Vector Store
LanceDB provides the vector database:
- Embedded database (no server required)
- Fast approximate nearest neighbor search
- Efficient storage with columnar format
- Supports millions of vectors

Index data is stored in `.wb-grep/` in your project root.

Chunking
Files are intelligently split into chunks that:
- Preserve function/class boundaries where possible
- Keep related code together
- Respect a maximum chunk size (~2000 characters)
- Include context (imports, surrounding code)
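The size limit can be illustrated with a deliberately simple line-based chunker. This sketch is not wb-grep's chunking logic: it ignores function and class boundaries and adds no surrounding context, and it only shows how a ~2000 character cap and line ranges might be tracked. `Chunk` and `chunkByLines` are hypothetical names.

```typescript
interface Chunk {
  startLine: number; // 1-based, inclusive
  endLine: number;   // 1-based, inclusive
  text: string;
}

// Greedy line-based splitting: lines are never broken, so a single line
// longer than maxChars becomes its own oversized chunk.
function chunkByLines(source: string, maxChars = 2000): Chunk[] {
  const lines = source.split("\n");
  const chunks: Chunk[] = [];
  let buffer: string[] = [];
  let size = 0;
  let startLine = 1;

  const flush = (endLine: number): void => {
    if (buffer.length === 0) return;
    chunks.push({ startLine, endLine, text: buffer.join("\n") });
    buffer = [];
    size = 0;
  };

  lines.forEach((line, index) => {
    const lineNumber = index + 1;
    // Start a new chunk when this line would push the buffer past the limit.
    if (buffer.length > 0 && size + 1 + line.length > maxChars) {
      flush(lineNumber - 1);
    }
    if (buffer.length === 0) {
      startLine = lineNumber;
      size = line.length;
    } else {
      size += 1 + line.length; // +1 for the joining newline
    }
    buffer.push(line);
  });
  flush(lines.length);
  return chunks;
}

const demoSource = Array.from({ length: 5 }, () => "aaaaaaaaaa").join("\n");
const demoChunks = chunkByLines(demoSource, 25);
console.log(demoChunks.map((c) => `${c.startLine}-${c.endLine}`));
```

The recorded line ranges are what make results of the form `path:start-end` possible once a chunk is matched.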
Storage Layout
```
your-project/
├── .wb-grep/
│   ├── vectors/        # LanceDB vector store
│   └── state.json      # Index metadata
├── .wbgrepignore       # Custom ignore patterns
└── .wbgreprc.json      # Configuration (optional)
```
---
Troubleshooting
Ollama Not Running
```bash
# Make sure Ollama is running
ollama serve

# Check if it's accessible
curl http://localhost:11434/api/tags
```
Embedding Model Not Found
```bash
# Pull the embedding model
ollama pull qwen3-embedding:0.6b

# Verify it's installed
ollama list
```
No Search Results
```bash
# Check if files are indexed
wb-grep status

# If no files indexed, run watch or index
wb-grep watch
```
Stale Index
```bash
# Rebuild the index from scratch
wb-grep index --clear
```
Slow Indexing or Timeouts
```bash
# Reduce concurrency if Ollama is overwhelmed
export WBGREP_CONCURRENCY=4

# Or increase timeout for slow systems
export WBGREP_OLLAMA_TIMEOUT=60000
```
---
Comparison with mgrep
wb-grep is inspired by mgrep, a cloud-based semantic grep tool by Mixedbread. The key differences:
| Feature | mgrep | wb-grep |
|---------|-------|---------|
| Embedding Provider | Mixedbread Cloud | Local Ollama |
| Vector Storage | Mixedbread Cloud | Local LanceDB |
| Authentication | Required | None |
| API Costs | Pay per use | Free |
| Data Privacy | Cloud-based | 100% local |
| Model | Mixedbread proprietary | Qwen3-Embedding-0.6B |
| Multimodal | Images, PDFs | Code/text only |
Choose mgrep if you want cloud convenience, multimodal search, and don't mind API costs.
Choose wb-grep if you need fully local operation, data privacy, or want to avoid recurring costs.
---
Contributing & Development
Installation
Install from npm:
```bash
npm install -g wb-grep
```
Or clone from source:
```bash
git clone https://github.com/wb200/wb-grep.git
cd wb-grep
npm install
npm run build
npm link
```
Development Commands
```bash
# Install dependencies
npm install

# Build
npm run build

# Development (watch mode)
npm run dev

# Lint and format
npm run lint
npm run format

# Type check
npm run typecheck

# Run tests (if configured)
npm test
```
Publishing
```bash
# Update version
npm version patch  # or minor/major

# Publish to npm
npm publish

# Push tags to GitHub
git push origin --tags
```
Project Structure
```
wb-grep/
├── src/
│   ├── index.ts              # CLI entry point
│   ├── commands/
│   │   ├── search.ts         # Search command
│   │   ├── watch.ts          # Watch command
│   │   ├── index-cmd.ts      # Index command
│   │   ├── status.ts         # Status command
│   │   └── clear.ts          # Clear command
│   └── lib/
│       ├── embeddings.ts     # Ollama embedding client
│       ├── vector-store.ts   # LanceDB wrapper
│       ├── chunker.ts        # Code chunking logic
│       ├── indexer.ts        # Indexing orchestration
│       ├── index-state.ts    # State management
│       ├── file.ts           # File system utilities
│       ├── config.ts         # Configuration loading
│       ├── constants.ts      # Shared constants
│       └── logger.ts         # Logging utilities
├── dist/                     # Compiled output
├── package.json
├── tsconfig.json
└── biome.json
```
---
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
wb-grep is a derivative work of mgrep by Mixedbread AI,
which is also licensed under Apache-2.0.
---
wb-grep is based on mgrep by Mixedbread AI.
The original mgrep provides cloud-based semantic code search using Mixedbread's embedding API.
This derivative work adapts the core architecture for fully local operation.
- Original Project: mixedbread-ai/mgrep
- Original License: Apache-2.0
- Original Authors: Mixedbread AI team
- Ollama - Local LLM inference
- LanceDB - Embedded vector database
- Qwen - Embedding model