Code Intelligence MCP Server - Smart context for your LLM coding agent
`npm install @iceinvein/code-intelligence-mcp`

> Semantic search and code navigation for LLM agents.



---
This server indexes your codebase locally to provide fast, semantic, and structure-aware code navigation to tools like OpenCode, Trae, and Cursor.
Unlike basic text search, this server builds a local knowledge graph to understand your code.
* Advanced Hybrid Search: Combines Tantivy (keyword BM25) + LanceDB (semantic vector) + Jina Code embeddings (a 768-dimension code-specific model) with Reciprocal Rank Fusion (RRF).
* Cross-Encoder Reranking: Always-on ORT-based reranker for precise result ranking.
* Smart Context Assembly: Token-aware budgeting with query-aware truncation that keeps the most relevant lines within context limits.
* PageRank Scoring: Graph-based symbol importance scoring that identifies central, heavily-used components.
* Learns from Feedback: Optional learning system that adapts to user selections over time.
* Production First: Ranking heuristics prioritize implementation code over tests and glue code (e.g., re-export-only index.ts files).
* Multi-Repo Support: Index and search across multiple repositories/monorepos simultaneously.
* Fast & Local: Written in Rust. Uses Metal GPU acceleration on macOS. Parallel indexing with persistent caching.
---
Runs directly via npx without requiring a local Rust toolchain.
Add to your opencode.json (or global config):
```json
{
  "mcp": {
    "code-intelligence": {
      "type": "local",
      "command": ["npx", "-y", "@iceinvein/code-intelligence-mcp"],
      "enabled": true
    }
  }
}
```
The server will automatically download the AI model (~300MB) and index your project in the background.
---
Available tools for the agent (22 tools total):
| Tool                     | Description |
| :----------------------- | :---------- |
| search_code              | Primary search. Finds code by meaning ("how does auth work?") or structure ("class User"). Supports query decomposition (e.g., "authentication and authorization"). |
| get_definition           | Retrieves the full definition of a specific symbol, with disambiguation support. |
| find_references          | Finds all usages of a function, class, or variable. |
| get_call_hierarchy       | Shows upstream callers and downstream callees. |
| get_type_graph           | Explores inheritance (extends/implements) and type aliases. |
| explore_dependency_graph | Explores module-level dependencies upstream or downstream. |
| get_file_symbols         | Lists all symbols defined in a specific file. |
| get_usage_examples       | Returns real-world examples of how a symbol is used in the codebase. |
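For example, an agent invokes `search_code` through a standard MCP `tools/call` request. Below is a minimal sketch built with Rust's serde_json crate; the argument names `query` and `limit` are assumptions for illustration, and the authoritative schema is whatever the server advertises via `tools/list`.

```rust
use serde_json::json;

fn main() {
    // Hypothetical tools/call payload for search_code.
    // Argument names ("query", "limit") are illustrative assumptions;
    // check the schema returned by tools/list for the real shape.
    let request = json!({
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "search_code",
            "arguments": {
                "query": "how does auth work?",
                "limit": 10
            }
        }
    });
    println!("{}", serde_json::to_string_pretty(&request).unwrap());
}
```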
| Tool                   | Description |
| :--------------------- | :---------- |
| explain_search         | Returns a detailed scoring breakdown to understand why results ranked as they did. |
| find_similar_code      | Finds code semantically similar to a given symbol or code snippet. |
| trace_data_flow        | Traces variable reads and writes through the codebase to understand data flow. |
| find_affected_code     | Finds code that would be affected if a symbol changes (reverse dependencies). |
| get_similarity_cluster | Returns symbols in the same semantic similarity cluster as a given symbol. |
| summarize_file         | Generates a summary of file contents, including symbol counts, structure, and key exports. |
| get_module_summary     | Lists all exported symbols from a module/file with their signatures. |
| Tool                  | Description |
| :-------------------- | :---------- |
| search_todos          | Searches for TODO and FIXME comments to track technical debt. |
| find_tests_for_symbol | Finds test files that test a given symbol or source file. |
| search_decorators     | Searches for TypeScript/JavaScript decorators (@Component, @Controller, @Get, @Post, etc.). |
| Tool             | Description |
| :--------------- | :---------- |
| hydrate_symbols  | Hydrates full context for a set of symbol IDs. |
| report_selection | Records user selection feedback for learning (call when the user selects a result). |
| refresh_index    | Manually triggers a re-index of the codebase. |
| get_index_stats  | Returns index statistics (files, symbols, edges, last updated). |
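When learning is enabled, the intended loop is: call `search_code`, let the user pick a result, then call `report_selection` with that choice so future rankings improve. A minimal sketch of the feedback call follows; the `symbol_id` field name and value are assumptions, and in practice you would pass whatever identifier `search_code` returned for the chosen result.

```rust
use serde_json::json;

fn main() {
    // Hypothetical report_selection payload sent after the user picks a result.
    // The "symbol_id" field name is an assumption; use the ID returned by
    // search_code for the selected result.
    let feedback = json!({
        "jsonrpc": "2.0",
        "id": 2,
        "method": "tools/call",
        "params": {
            "name": "report_selection",
            "arguments": {
                "query": "how does auth work?",
                "symbol_id": "src/auth/session.ts::validateSession"
            }
        }
    });
    println!("{}", serde_json::to_string_pretty(&feedback).unwrap());
}
```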
---
The server supports semantic navigation and symbol extraction for the following languages:
* Rust
* TypeScript / TSX
* JavaScript
* Python
* Go
* Java
* C
* C++
---
The ranking engine optimizes results for relevance using the following signals:
1. PageRank Symbol Importance: Graph-based scoring that identifies central, heavily-used components (similar to Google's PageRank).
2. Cross-Encoder Reranking: Always-on ORT-based reranker applies deep learning to fine-tune result order.
3. Reciprocal Rank Fusion (RRF): Combines keyword, vector, and graph search results using rank-based fusion, which is robust to score-scale differences between retrievers (see the sketch after this list).
4. Query Decomposition: Complex queries ("X and Y") are automatically split into sub-queries for better coverage.
5. Token-Aware Truncation: Context assembly keeps query-relevant lines within token budgets using BM25-style relevance scoring.
6. Directory Semantics: Implementation directories (src, lib, app) are boosted, while build artifacts (dist, build) and node_modules are penalized.
7. Test Penalty: Test files (*.test.ts, __tests__) are ranked lower by default, but are boosted if the query intent implies testing.
8. Glue Code Filtering: Re-export files (e.g., index.ts) are deprioritized in favor of the actual implementation.
9. JSDoc Boost: Symbols with documentation receive a ranking boost, and examples are included in search results.
10. Learning from Feedback (optional): Tracks user selections to personalize future search results.
11. Package-Aware Scoring (multi-repo): Boosts results from the same package when working in monorepos.
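As a rough illustration of the fusion step (item 3 above), here is a minimal Reciprocal Rank Fusion sketch. The constant k = 60 is the commonly used default from the RRF literature, not necessarily the value this server uses, and the result lists are made up.

```rust
use std::collections::HashMap;

/// Minimal Reciprocal Rank Fusion: each ranked list contributes
/// 1 / (k + rank) per document (rank counted from 1); scores are
/// summed across lists and the combined list is sorted by total score.
fn rrf(ranked_lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in ranked_lists {
        for (i, id) in list.iter().enumerate() {
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + i as f64 + 1.0);
        }
    }
    let mut out: Vec<_> = scores.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out
}

fn main() {
    // Keyword (BM25) and vector result lists for the same query.
    let keyword = vec!["auth.rs::login", "auth.rs::verify", "user.rs::User"];
    let vector = vec!["session.rs::validate", "auth.rs::login", "auth.rs::verify"];
    for (id, score) in rrf(&[keyword, vector], 60.0) {
        println!("{id}: {score:.4}");
    }
}
```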
The system detects query intent and adjusts ranking accordingly:
| Query Pattern | Intent | Effect |
| ----------------- | ------------------------- | --------------------------------------- |
| "struct User" | Definition | Boosts type definitions (1.5x) |
| "who calls login" | Callers | Triggers graph lookup |
| "verify login" | Testing | Boosts test files |
| "User schema" | Schema/Model | Boosts schema/model files (50-75x) |
| "auth and authz" | Multi-query decomposition | Splits into sub-queries, merges via RRF |
For a deep dive into the system's design, see System Architecture.
---
Works without configuration by default. You can customize behavior via environment variables:
Core indexing:

```json
"env": {
  "BASE_DIR": "/path/to/repo",                   // Required: repository root
  "WATCH_MODE": "true",                          // Watch for file changes (default: true)
  "INDEX_PATTERNS": "**/*.ts,**/*.go",           // File patterns to index
  "EXCLUDE_PATTERNS": "**/node_modules/**",      // Patterns to exclude from indexing
  "REPO_ROOTS": "/path/to/repo1,/path/to/repo2"  // Multi-repo support
}
```
Embeddings:

```json
"env": {
  "EMBEDDINGS_BACKEND": "jinacode",  // jinacode (default), fastembed, hash
  "EMBEDDINGS_DEVICE": "cpu",        // cpu or metal (macOS GPU)
  "EMBEDDING_BATCH_SIZE": "32"
}
```
Context assembly:

```json
"env": {
  "MAX_CONTEXT_TOKENS": "8192",    // Token budget for context (default: 8192)
  "TOKEN_ENCODING": "o200k_base",  // tiktoken encoding model
  "MAX_CONTEXT_BYTES": "200000"    // Legacy byte-based limit (fallback)
}
```
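As a rough sketch of what query-aware truncation under a token budget means: score each line by how many query terms it contains, keep the highest-scoring lines until the budget is spent, then restore source order. Token counting is approximated with a whitespace word count here, whereas the server itself uses a tiktoken encoding such as o200k_base.

```rust
/// Rough sketch of query-aware truncation under a token budget.
/// Lines are scored by how many query terms they contain, kept greedily
/// until the budget runs out, then re-sorted into original order.
fn truncate_to_budget(code: &str, query: &str, max_tokens: usize) -> String {
    let terms: Vec<String> = query.split_whitespace().map(|t| t.to_lowercase()).collect();
    let mut scored: Vec<(usize, usize, &str)> = code
        .lines()
        .enumerate()
        .map(|(i, line)| {
            let lower = line.to_lowercase();
            let score = terms.iter().filter(|t| lower.contains(t.as_str())).count();
            (score, i, line)
        })
        .collect();
    // Prefer the most query-relevant lines first.
    scored.sort_by(|a, b| b.0.cmp(&a.0));

    let mut used = 0;
    let mut kept: Vec<(usize, &str)> = Vec::new();
    for (_, i, line) in scored {
        let cost = line.split_whitespace().count().max(1); // crude token estimate
        if used + cost > max_tokens {
            continue;
        }
        used += cost;
        kept.push((i, line));
    }
    // Restore original line order so the snippet still reads top-to-bottom.
    kept.sort_by_key(|(i, _)| *i);
    kept.into_iter().map(|(_, l)| l).collect::<Vec<_>>().join("\n")
}

fn main() {
    let code = "fn login(user: &User) -> Token {\n    audit_log(user);\n    issue_token(user)\n}";
    // With a 7-"token" budget, only the most query-relevant lines survive.
    println!("{}", truncate_to_budget(code, "login token", 7));
}
```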
Ranking:

```json
"env": {
  "RANK_EXPORTED_BOOST": "0.1",      // Boost for exported symbols
  "RANK_TEST_PENALTY": "0.1",        // Penalty for test files
  "RANK_POPULARITY_WEIGHT": "0.05",  // PageRank influence
  "RRF_ENABLED": "true",             // Enable Reciprocal Rank Fusion
  "HYBRID_ALPHA": "0.7"              // Vector vs. keyword weight (0-1)
}
```
Learning:

```json
"env": {
  "LEARNING_ENABLED": "false",             // Enable selection tracking (default: false)
  "LEARNING_SELECTION_BOOST": "0.1",       // Boost for previously selected symbols
  "LEARNING_FILE_AFFINITY_BOOST": "0.05"   // Boost for frequently accessed files
}
```
Performance and metrics:

```json
"env": {
  "PARALLEL_WORKERS": "1",            // Indexing parallelism (default: 1 for SQLite)
  "EMBEDDING_CACHE_ENABLED": "true",  // Persistent embedding cache
  "PAGERANK_ITERATIONS": "20",        // PageRank computation iterations
  "METRICS_ENABLED": "true",          // Prometheus metrics
  "METRICS_PORT": "9090"
}
```
Query expansion:

```json
"env": {
  "SYNONYM_EXPANSION_ENABLED": "true",  // Expand "auth" → "authentication"
  "ACRONYM_EXPANSION_ENABLED": "true"   // Expand "db" → "database"
}
```
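Conceptually, expansion just adds extra terms to the query before retrieval so that both keyword and vector search see the alternative forms. A toy sketch follows; the mappings are illustrative, and the server's actual synonym and acronym tables are internal.

```rust
use std::collections::HashMap;

// Toy query expansion: append known synonym/acronym expansions to the
// query. The mappings below are illustrative, not the server's tables.
fn expand_query(query: &str, expansions: &HashMap<&str, &str>) -> String {
    let mut terms: Vec<String> = query.split_whitespace().map(str::to_string).collect();
    let extra: Vec<String> = terms
        .iter()
        .filter_map(|t| expansions.get(t.to_lowercase().as_str()).map(|e| e.to_string()))
        .collect();
    terms.extend(extra);
    terms.join(" ")
}

fn main() {
    let expansions = HashMap::from([("auth", "authentication"), ("db", "database")]);
    assert_eq!(
        expand_query("auth db layer", &expansions),
        "auth db layer authentication database"
    );
    println!("{}", expand_query("auth db layer", &expansions));
}
```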
---
```mermaid
flowchart LR
Client[MCP Client] <==> Tools
subgraph Server [Code Intelligence Server]
direction TB
Tools[Tool Router]
subgraph Indexer [Indexing Pipeline]
direction TB
Scan[File Scan] --> Parse[Tree-Sitter]
Parse --> Extract[Symbol Extraction]
Extract --> PageRank[PageRank Compute]
Extract --> Embed[Jina Code Embeddings]
Extract --> JSDoc[JSDoc/Decorator/TODO Extract]
end
subgraph Storage [Storage Engine]
direction TB
SQLite[(SQLite)]
Tantivy[(Tantivy)]
Lance[(LanceDB)]
Cache[(Embedding Cache)]
end
subgraph Retrieval [Retrieval Engine]
direction TB
QueryExpand[Query Expansion]
Hybrid[Hybrid Search RRF]
Rerank[Cross-Encoder Reranker]
Signals[Ranking Signals]
Context[Token-Aware Assembly]
end
%% Data Flow
Tools -- Index --> Scan
PageRank --> SQLite
Embed --> Lance
Embed --> Cache
JSDoc --> SQLite
Tools -- Query --> QueryExpand
QueryExpand --> Hybrid
Hybrid --> Rerank
Rerank --> Signals
Signals --> Context
Context --> Tools
end
```
---
1. Prerequisites: Rust (stable), protobuf
2. Build: `cargo build --release`
3. Run: `./scripts/start_mcp.sh`
4. Test: `cargo test`, or `EMBEDDINGS_BACKEND=hash cargo test` (faster, skips model download)
For faster development iteration, use the hash embedding backend which skips model downloads:
```bash
EMBEDDINGS_BACKEND=hash BASE_DIR=/path/to/repo ./target/release/code-intelligence-mcp-server
```
Project layout:

```
src/
├── indexer/     # File scanning, parsing, symbol extraction
├── storage/     # SQLite, Tantivy, LanceDB layers
├── retrieval/   # Hybrid search, ranking, context assembly
├── graph/       # PageRank, call hierarchy, type graphs
├── handlers/    # MCP tool handlers
├── server/      # MCP protocol routing
├── tools/       # Tool definitions
├── embeddings/  # Jina Code model wrapper
├── reranker/    # Cross-encoder ORT implementation
├── metrics/     # Prometheus metrics
└── config.rs    # Environment-based configuration
```
License: MIT