Portable, composable code index — build with tree-sitter, query via MCP
`npm install codeix` · codeix.dev

Your AI agent finds the right function on the first try: no scanning, no guessing, no wasted tokens.
```
codeix                    # start MCP server, watch for changes
codeix build              # parse source files, write .codeindex
codeix serve --no-watch   # serve without file watching
```
AI coding agents spend most of their token budget finding code before they can work on it. They grep, read files, grep again, backtrack. On a large codebase the agent might burn thousands of tokens just locating the right function — or worse, miss it entirely and hallucinate.
Codeix gives the agent a pre-built map of your codebase. One structured query returns the symbol name, file, line range, signature, and parent — no scanning, no guessing.
| Problem | What happens today |
|---|---|
| No structure | grep finds text matches, not symbols. The agent can't distinguish a function definition from a comment mentioning it. |
| Slow re-parsing | Python-based indexers re-parse everything on startup. On large codebases, you wait. |
| Not shareable | Indexes are local caches — ephemeral, per-machine. A new developer or CI runner starts from scratch. |
| No composition | Monorepo with 10 packages? Dependencies with useful APIs? No way to query across boundaries. |
| Prose is invisible | TODOs, docstrings, error messages — searchable by grep but not selectively. You can't search only comments without also matching code. |
- Committed to git: the index is a `.codeindex` directory you commit with your code. Clone the repo and the index is already there. No re-indexing.
- Shareable: library authors can ship `.codeindex` in their npm/PyPI/crates.io package. Consumers get instant navigation of dependencies.
- Composable: the MCP server auto-discovers dependency indexes and mounts them. Query your code and your dependencies in one place.
- Structured for LLMs: symbols have kinds, signatures, parent relationships, and line ranges. The agent gets exactly what it needs in one tool call instead of piecing it together from raw text.
- Prose search: `search_texts` targets comments, docstrings, and string literals specifically. Find TODOs, find the error message a user reported, find what a function's docstring says, all without noise from code.
- Fast: builds in seconds, queries in milliseconds. Rust + tree-sitter + in-memory SQLite FTS5 under the hood.
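To make the "in-memory SQLite FTS5" point concrete, here is a minimal sketch of ranked full-text search over symbol names using Python's standard `sqlite3` module. It assumes the Python build ships SQLite with FTS5 enabled. The table layout, the sample rows, and the helper names `build_search_db` and `search` are hypothetical illustrations, not codeix's actual schema or code.

```python
import sqlite3

# Hypothetical sample rows (name, kind, file); not codeix's real schema.
ROWS = [
    ("Config", "class", "src/main.py"),
    ("Config.__init__", "method", "src/main.py"),
    ("main", "function", "src/main.py"),
    ("load_settings", "function", "src/settings.py"),
]


def build_search_db(rows):
    """Create an in-memory FTS5 table and load the rows into it."""
    conn = sqlite3.connect(":memory:")
    # `file` is stored but not tokenized, so path fragments never match.
    conn.execute(
        "CREATE VIRTUAL TABLE symbols USING fts5(name, kind, file UNINDEXED)"
    )
    conn.executemany(
        "INSERT INTO symbols (name, kind, file) VALUES (?, ?, ?)", rows
    )
    return conn


def search(conn, query):
    """Return matching rows, best match first (lower bm25 score is better)."""
    cur = conn.execute(
        "SELECT name, kind, file FROM symbols "
        "WHERE symbols MATCH ? ORDER BY bm25(symbols)",
        (query,),
    )
    return cur.fetchall()


conn = build_search_db(ROWS)
results = search(conn, "conf*")  # prefix query: tokens starting with "conf"
for name, kind, file in results:
    print(f"{kind:8} {name}  ({file})")
```

Because the whole table lives in memory, there is nothing to clean up on disk; the index can be rebuilt from the JSONL files each time the server starts.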
An open, portable format for structured code indexing. Plain JSONL files you commit alongside your code: git-friendly diffs, human-readable with grep and jq, no binary blobs.
```
.codeindex/
  index.json      # manifest: version, name, languages
  files.jsonl     # one line per source file (path, lang, hash, line count)
  symbols.jsonl   # one line per symbol (functions, classes, imports, with signatures)
  texts.jsonl     # one line per comment, docstring, string literal
```
Any tool that can parse JSON can consume a .codeindex. Codeix builds it using tree-sitter, and AI agents query it through MCP (Model Context Protocol).
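As an illustration of how little is needed to consume the format, the sketch below loads a `.codeindex` directory with nothing but the Python standard library. The four file names come from the layout above. The helper names `read_jsonl` and `load_index` are hypothetical, and the code makes no assumption about the fields inside each record.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Parse a JSONL file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def load_index(index_dir):
    """Load the manifest and the three JSONL files of a .codeindex directory."""
    root = Path(index_dir)
    return {
        "manifest": json.loads((root / "index.json").read_text(encoding="utf-8")),
        "files": read_jsonl(root / "files.jsonl"),
        "symbols": read_jsonl(root / "symbols.jsonl"),
        "texts": read_jsonl(root / "texts.jsonl"),
    }
```

Calling `load_index(".codeindex")` from a repository root would return plain dictionaries and lists that any script can filter or print.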
Example `symbols.jsonl`:
```jsonl
{"file":"src/main.py","name":"os","kind":"import","line":[1,1]}
{"file":"src/main.py","name":"Config","kind":"class","line":[22,45]}
{"file":"src/main.py","name":"Config.__init__","kind":"method","line":[23,30],"parent":"Config","sig":"def __init__(self, path: str, debug: bool = False)"}
{"file":"src/main.py","name":"main","kind":"function","line":[48,60],"sig":"def main(args: list[str]) -> int"}
```
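The records above carry enough structure to answer navigation questions without touching the source file. The following sketch parses those four example records and shows two lookups: the children of a class, and the innermost symbol covering a given line. The helper names `children_of` and `symbol_at` are hypothetical and are not part of codeix.

```python
import json

# The four example records from symbols.jsonl, one JSON object per line.
SYMBOLS_JSONL = """
{"file":"src/main.py","name":"os","kind":"import","line":[1,1]}
{"file":"src/main.py","name":"Config","kind":"class","line":[22,45]}
{"file":"src/main.py","name":"Config.__init__","kind":"method","line":[23,30],"parent":"Config","sig":"def __init__(self, path: str, debug: bool = False)"}
{"file":"src/main.py","name":"main","kind":"function","line":[48,60],"sig":"def main(args: list[str]) -> int"}
"""

symbols = [json.loads(line) for line in SYMBOLS_JSONL.splitlines() if line.strip()]


def children_of(symbols, parent):
    """Return the symbols whose "parent" field names the given symbol."""
    return [s for s in symbols if s.get("parent") == parent]


def symbol_at(symbols, file, line_no):
    """Return the innermost symbol in `file` whose line range contains line_no."""
    hits = [
        s for s in symbols
        if s["file"] == file and s["line"][0] <= line_no <= s["line"][1]
    ]
    # The smallest enclosing range is the innermost symbol; None if no hit.
    return min(hits, key=lambda s: s["line"][1] - s["line"][0], default=None)


for child in children_of(symbols, "Config"):
    print(child["name"], child["line"], child.get("sig"))

print(symbol_at(symbols, "src/main.py", 25)["name"])
```

Line 25 falls inside both `Config` (22 to 45) and `Config.__init__` (23 to 30), so the lookup picks the method because its range is narrower.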
Include .codeindex in your package and every developer who depends on you gets instant navigation of your API — no setup, no re-indexing.
Works with Git repos, npm, PyPI, and crates.io.
Seven tools, zero setup. The agent queries immediately — no init, no config, no refresh.
| Tool | What it does |
|---|---|
| `list_projects` | List all indexed projects |
| `search_symbols` | Fuzzy search across all symbols (FTS5, BM25-ranked), optionally filtered by project |
| `search_files` | Find files by name, path, or language, optionally filtered by project |
| `search_texts` | Full-text search on comments, docstrings, strings, optionally filtered by project |
| `get_file_symbols` | List all symbols in a file |
| `get_symbol_children` | Get children of a class/module |
| `get_imports` | List imports for a file |
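For readers unfamiliar with MCP, a tool invocation is a JSON-RPC 2.0 request with the method `tools/call`, carrying the tool name and an arguments object. The sketch below builds such a request for `search_symbols`. The argument names `query` and `project`, the value `my-app`, and the helper `tool_call_request` are assumptions made for illustration; the actual parameter names codeix expects are not stated here.

```python
import json


def tool_call_request(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# "query" and "project" are hypothetical argument names, not confirmed ones.
request = tool_call_request(
    1, "search_symbols", {"query": "Config", "project": "my-app"}
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends this message for the agent; the sketch only shows the shape of what goes over the wire.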
Launch codeix from any directory. It walks downward and treats every directory containing .git/ as a separate project — each gets its own .codeindex.
Works uniformly for single repos, monorepos, sibling repos, and git submodules. No config needed.
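The discovery rule described above can be sketched in a few lines. This is an approximation under stated assumptions, not codeix's implementation: the function name `discover_projects` is hypothetical, and the sketch accepts `.git` as either a directory or a file because git submodules typically use a `.git` file that points at the parent repository.

```python
import os


def discover_projects(root):
    """Walk downward from root and return every directory that contains .git."""
    projects = []
    for dirpath, dirnames, filenames in os.walk(root):
        # A regular repo has a .git directory; a submodule usually has a .git file.
        if ".git" in dirnames or ".git" in filenames:
            projects.append(dirpath)
        # Never descend into git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
    return sorted(projects)
```

The walk keeps descending after it finds a project, so nested repositories and submodules are reported as separate projects, which matches the monorepo and submodule cases mentioned above.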
Tree-sitter grammars, feature-gated at compile time:
| Language | Feature flag | Default | Extensions |
|---|---|---|---|
| Python | `lang-python` | yes | .py .pyi .pyw |
| Rust | `lang-rust` | yes | .rs |
| JavaScript | `lang-javascript` | yes | .js .mjs .cjs .jsx |
| TypeScript | `lang-typescript` | yes | .ts .mts .cts .tsx |
| Go | `lang-go` | yes | .go |
| Java | `lang-java` | yes | .java |
| C | `lang-c` | yes | .c .h |
| C++ | `lang-cpp` | yes | .cpp .cc .cxx .hpp .hxx |
| Ruby | `lang-ruby` | yes | .rb .rake .gemspec |
| C# | `lang-csharp` | yes | .cs |
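The extension column above amounts to a lookup table. As a rough sketch, the mapping can be expressed as follows; the extensions are copied from the table, while the language keys and the helper name `language_for` are hypothetical and the matching is case-sensitive by assumption.

```python
from pathlib import Path

# Extensions per language, copied from the table above.
EXTENSIONS = {
    "python": [".py", ".pyi", ".pyw"],
    "rust": [".rs"],
    "javascript": [".js", ".mjs", ".cjs", ".jsx"],
    "typescript": [".ts", ".mts", ".cts", ".tsx"],
    "go": [".go"],
    "java": [".java"],
    "c": [".c", ".h"],
    "cpp": [".cpp", ".cc", ".cxx", ".hpp", ".hxx"],
    "ruby": [".rb", ".rake", ".gemspec"],
    "csharp": [".cs"],
}

# Invert to extension -> language for constant-time lookup.
EXT_TO_LANG = {ext: lang for lang, exts in EXTENSIONS.items() for ext in exts}


def language_for(path):
    """Return the language key for a path, or None if the extension is unknown."""
    return EXT_TO_LANG.get(Path(path).suffix)


print(language_for("src/main.py"))
print(language_for("include/util.hpp"))
print(language_for("README.md"))
```

Note that `.h` maps to C in the table, so a C++ project with `.h` headers would have them parsed with the C grammar under this mapping.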
HTML, Vue, Svelte, and Astro files are preprocessed to extract embedded