Semantic code search server for MCP, powered by vector embeddings and Qdrant
A Model Context Protocol (MCP) server for semantic code search using vector embeddings. Index your codebase and search with natural language queries.

Install with `npm install deepmatch-mcp`.
- Semantic Code Search: Find code by meaning, not just keywords
- Multiple Embedding Providers: OpenAI, Ollama, Gemini, OpenAI-compatible APIs
- Real-time File Watching: Automatically re-index on file changes
- Multi-repository Support: Index multiple directories simultaneously
- Smart Filtering: Respects .gitignore, skips binary files and common build directories
- MCP Protocol: Works with any MCP-compatible client (Claude Desktop, etc.)
```
┌─────────────────────────────────────────────────────────────────┐
│ deepmatch-mcp │
├─────────────────────────────────────────────────────────────────┤
│ CLI Entrypoint (src/cli.ts) │
│ - Parses config from CLI flags and environment variables │
│ - Orchestrates startup: scan → index → watch → serve │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
│ │ Config │ │ Providers │ │ Vector Store │ │
│ │ (Zod) │ │ (Embedders) │ │ (Qdrant) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────────┘ │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
│ │ Scanner │ │ Chunker │ │ Index Manager │ │
│ │ (Directory) │ │ (Line-based)│ │ (Batch Processing) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────────┘ │
│ │
│ ┌─────────────┐ ┌───────────────────────────────────────────┐ │
│ │ Watcher │ │ MCP Server │ │
│ │ (Chokidar) │ │ (stdio transport, 'search' tool) │ │
│ └─────────────┘ └───────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
```
| Module | Path | Description |
|--------|------|-------------|
| Config | src/config/ | CLI/ENV parsing with Zod validation |
| Providers | src/providers/ | Embedding providers (OpenAI, Ollama, Gemini, OpenAI-compatible) |
| Store | src/store/ | Qdrant vector database wrapper |
| Chunker | src/chunker/ | Line-based text chunking with configurable limits |
| Scanner | src/scanner/ | Directory traversal with .gitignore support |
| Indexer | src/indexer/ | Batch embedding and vector upsert orchestration |
| Watcher | src/watcher/ | File change detection with debouncing |
| MCP | src/mcp/ | MCP stdio server with search tool |
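
At startup these modules compose into a single pipeline: the scanner walks each repository, the chunker splits files into line-based chunks, the embedding provider encodes chunks in batches, and the store upserts the resulting vectors into Qdrant. The sketch below illustrates that flow with hypothetical function and type names; the real APIs live under `src/` and may differ.

```typescript
// Illustrative only: the index-time flow (scan → chunk → embed → upsert) with hypothetical types.
interface Chunk {
  filePath: string;
  startLine: number;
  endLine: number;
  text: string;
}

type EmbedFn = (texts: string[]) => Promise<number[][]>; // one vector per input text
type UpsertFn = (chunks: Chunk[], vectors: number[][]) => Promise<void>;

async function indexFiles(
  files: string[],
  chunkFile: (path: string) => Promise<Chunk[]>,
  embed: EmbedFn,
  upsert: UpsertFn,
  batchSize = 60, // matches the default --batch-size
): Promise<void> {
  const pending: Chunk[] = [];
  for (const file of files) {
    pending.push(...(await chunkFile(file)));
    // Flush full batches so each embedding request stays bounded.
    while (pending.length >= batchSize) {
      const batch = pending.splice(0, batchSize);
      await upsert(batch, await embed(batch.map((c) => c.text)));
    }
  }
  if (pending.length > 0) {
    await upsert(pending, await embed(pending.map((c) => c.text)));
  }
}
```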
```bash
# Install dependencies
npm install
```

## Usage

### Prerequisites
1. Qdrant - Vector database (default: http://localhost:6333). A quick connectivity check is sketched after this list.
```bash
# Using Docker
docker run -p 6333:6333 qdrant/qdrant
```

2. Embedding Provider - One of:
- OpenAI API key
- Ollama running locally
- Gemini API key
- Any OpenAI-compatible API
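
If you are unsure whether Qdrant is up, a quick standalone check with the official JavaScript client (`@qdrant/js-client-rest`, used here only for the check; it is not needed to run deepmatch-mcp) looks like this:

```typescript
import { QdrantClient } from "@qdrant/js-client-rest";

// Standalone sanity check: list collections on the default local Qdrant instance.
const client = new QdrantClient({ url: "http://localhost:6333" });
const { collections } = await client.getCollections();
console.log(`Qdrant reachable, ${collections.length} collection(s) present`);
```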
### Command Line

```bash
npx deepmatch-mcp [options]

Options:
--path Repository path to index (repeatable)
--provider Embedding provider: openai|ollama|gemini|openai-compatible
--model Embedding model name
--embedding-dim Embedding dimension (auto-detected if not set)
--batch-size Batch size for embeddings (default: 60)
--max-files Maximum files to index (default: 50000)
--qdrant-url Qdrant server URL (default: http://localhost:6333)
--qdrant-key Qdrant API key
--openai-key OpenAI API key
--ollama-url Ollama server URL
--gemini-key Gemini API key
--openai-compat-base-url OpenAI-compatible base URL
--openai-compat-key OpenAI-compatible API key
```

### Environment Variables
All CLI options can be set via environment variables:
| Variable | CLI Flag |
|----------|----------|
| DEEPMATCH_PATHS | --path (comma-separated) |
| DEEPMATCH_PROVIDER | --provider |
| DEEPMATCH_MODEL | --model |
| DEEPMATCH_EMBEDDING_DIM | --embedding-dim |
| DEEPMATCH_BATCH_SIZE | --batch-size |
| DEEPMATCH_MAX_FILES | --max-files |
| DEEPMATCH_QDRANT_URL | --qdrant-url |
| DEEPMATCH_QDRANT_API_KEY | --qdrant-key |
| DEEPMATCH_OPENAI_API_KEY | --openai-key |
| DEEPMATCH_OLLAMA_URL | --ollama-url |
| DEEPMATCH_GEMINI_API_KEY | --gemini-key |
| DEEPMATCH_OPENAI_COMPAT_BASE_URL | --openai-compat-base-url |
| DEEPMATCH_OPENAI_COMPAT_API_KEY | --openai-compat-key |

CLI flags take precedence over environment variables.
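
As a rough illustration of that precedence (the actual parsing lives in `src/config/` and is validated with Zod), each setting can be thought of as resolving flag first, then environment variable, then default:

```typescript
// Illustrative only: resolve a single setting with CLI flag > environment variable > default.
function resolveSetting(
  flagValue: string | undefined,
  envName: string,
  defaultValue: string,
): string {
  return flagValue ?? process.env[envName] ?? defaultValue;
}

// e.g. the Qdrant URL when no --qdrant-url flag was passed:
const qdrantUrl = resolveSetting(undefined, "DEEPMATCH_QDRANT_URL", "http://localhost:6333");
```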
### Examples
With OpenAI:
```bash
npx deepmatch-mcp \
--path /path/to/your/repo \
--provider openai \
--openai-key sk-xxx
```

With Ollama:

```bash
npx deepmatch-mcp \
--path /path/to/repo1 \
--path /path/to/repo2 \
--provider ollama \
--ollama-url http://localhost:11434 \
--model nomic-embed-text
```

With environment variables:

```bash
export DEEPMATCH_PATHS="/path/to/repo"
export DEEPMATCH_PROVIDER="openai"
export DEEPMATCH_OPENAI_API_KEY="sk-xxx"
npx deepmatch-mcp
```

### MCP Client Configuration
For Claude Desktop, add to your MCP settings:
```json
{
"mcpServers": {
"deepmatch": {
"command": "npx",
"args": ["deepmatch-mcp", "--path", "/path/to/repo", "--provider", "openai"],
"env": {
"DEEPMATCH_OPENAI_API_KEY": "sk-xxx"
}
}
}
}
```

## MCP Tools
### search
Search for code using semantic similarity.
Input Schema:
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| query | string | Yes | Natural language search query |
| limit | number | No | Max results (1-50, default: 10) |
| paths | string[] | No | Filter to specific repository paths |
| minScore | number | No | Minimum similarity score (0-1) |

Output:
```json
{
"total_count": 5,
"items": [
{
"filePath": "/repo/src/auth.ts",
"repoPath": "/repo",
"startLine": 10,
"endLine": 25,
"codeChunk": "function authenticate(token: string) {...}",
"score": 0.92
}
]
}
```
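
Outside of Claude Desktop, any MCP client can spawn the server and call the search tool over stdio. A minimal sketch using the `@modelcontextprotocol/sdk` client follows; import paths and option shapes reflect the SDK as published at the time of writing and may need adjusting for your version.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn deepmatch-mcp as a child process and speak MCP over stdio.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["deepmatch-mcp", "--path", "/path/to/repo", "--provider", "openai"],
  env: { DEEPMATCH_OPENAI_API_KEY: "sk-xxx" },
});

const client = new Client({ name: "search-demo", version: "0.1.0" });
await client.connect(transport);

// Invoke the 'search' tool with a natural-language query.
const result = await client.callTool({
  name: "search",
  arguments: { query: "where are auth tokens validated?", limit: 5 },
});
console.log(JSON.stringify(result, null, 2));

await client.close();
```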
## Local Development

### Setup
```bash
# Clone and install
git clone https://github.com/657KB/deepmatch-mcp
cd deepmatch-mcp
npm install
```

### Development Commands
```bash
# Run tests (TDD)
npm test

# Run tests in watch mode
npx vitest

# Build TypeScript
npm run build

# Test the CLI
node dist/cli.js --help
```

### Project Structure
```
src/
├── cli.ts # Main entry point
├── config/
│ ├── schema.ts # Zod schemas and defaults
│ ├── index.ts # CLI/ENV parsing
│ └── config.test.ts
├── providers/
│ ├── types.ts # IEmbedder interface
│ ├── embedders.ts # Provider implementations
│ ├── index.ts
│ └── embedders.test.ts
├── store/
│ ├── types.ts # IVectorStore interface
│ ├── qdrant.ts # Qdrant implementation
│ ├── index.ts
│ └── qdrant.test.ts
├── chunker/
│ ├── extensions.ts # Supported file extensions
│ ├── chunker.ts # Line-based chunking
│ ├── index.ts
│ └── chunker.test.ts
├── scanner/
│ ├── scanner.ts # Directory traversal
│ ├── index.ts
│ └── scanner.test.ts
├── indexer/
│ ├── index-manager.ts # Batch indexing orchestration
│ ├── index.ts
│ └── index-manager.test.ts
├── watcher/
│ ├── file-watcher.ts # Chokidar file watching
│ ├── index.ts
│ └── file-watcher.test.ts
└── mcp/
├── server.ts # MCP server + search tool
├── index.ts
└── server.test.ts
```
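
The `types.ts` files above define the provider and store abstractions. Their exact shapes are in the source; a rough approximation, inferred from the module descriptions and the search output fields, might look like:

```typescript
// Approximate shapes only; see src/providers/types.ts and src/store/types.ts for the real definitions.
interface IEmbedder {
  // Embed a batch of texts, returning one vector per input.
  embed(texts: string[]): Promise<number[][]>;
}

interface SearchHit {
  filePath: string;
  repoPath: string;
  startLine: number;
  endLine: number;
  codeChunk: string;
  score: number;
}

interface IVectorStore {
  ensureCollection(dimension: number): Promise<void>;
  upsert(points: Array<{ id: string; vector: number[]; payload: Omit<SearchHit, "score"> }>): Promise<void>;
  search(vector: number[], limit: number, minScore?: number): Promise<SearchHit[]>;
}
```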
### Running Tests

```bash
# Run all tests
npm test

# Run a specific test file
npx vitest src/chunker/chunker.test.ts

# Run with coverage
npx vitest --coverage
```

## Configuration
| Parameter | Default | Description |
|-----------|---------|-------------|
| batchSize | 60 | Embedding batch size |
| maxFiles | 50,000 | Maximum files to index |
| chunkMin | 50 | Minimum chunk size (chars) |
| chunkMax | 1,000 | Maximum chunk size (chars) |
| chunkMaxTolerance | 1.15 | Tolerance factor for max size |
| chunkRebalanceMin | 200 | Minimum remainder to trigger rebalance |
| qdrantUrl | http://localhost:6333 | Qdrant server URL |
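
To make the chunk-size parameters concrete, the sketch below shows one plausible way they could interact when a chunk exceeds `chunkMax`; it is an illustration only, not the algorithm in `src/chunker/chunker.ts`.

```typescript
// Illustrative only: how the size limits could interact for an oversized chunk.
const chunkMax = 1000;          // soft cap on chunk size (characters)
const chunkMaxTolerance = 1.15; // allow up to 15% overshoot before splitting
const chunkRebalanceMin = 200;  // remainder must be at least this big to trigger a rebalance

function splitIfTooLong(text: string): string[] {
  // Small overshoots are kept as a single chunk rather than producing a tiny tail.
  if (text.length <= chunkMax * chunkMaxTolerance) return [text];
  const remainder = text.length - chunkMax;
  if (remainder >= chunkRebalanceMin) {
    // Remainder is worth keeping: rebalance into two roughly equal chunks.
    const mid = Math.ceil(text.length / 2);
    return [text.slice(0, mid), text.slice(mid)];
  }
  // Otherwise cut at chunkMax and carry the short remainder separately.
  return [text.slice(0, chunkMax), text.slice(chunkMax)];
}
```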
## File Filtering

### Supported File Types
TypeScript, JavaScript, Python, Java, C/C++, C#, Go, Rust, Ruby, PHP, Swift, Kotlin, Scala, Lua, R, Perl, Shell, SQL, HTML, CSS, JSON, YAML, XML, Markdown, Vue, Svelte
### Excluded Directories
node_modules, dist, build, target, .git, hidden directories, __pycache__, venv, .next, .nuxt, coverage, vendor, Pods, .gradle, .idea, .vscode

### Other Rules
- Files larger than 1MB are skipped
- `.gitignore` rules are respected (stacked for nested directories)

## License

MIT