AI-powered PR duplicate detection system using embeddings, Bloom filters, and multi-signal scoring
npm install prsense
```
┌─────────────┐
│   New PR    │
└──────┬──────┘
       │
       ▼
┌─────────────────┐
│  Bloom Filter   │ ──── Early rejection
│  (Fast Path)    │
└──────┬──────────┘
       │ might contain
       ▼
┌─────────────────┐
│   Embedding     │ ──── Text + Diff embeddings
│   Pipeline      │
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│   ANN Search    │ ──── Candidate retrieval (k=20)
│   (Retriever)   │
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│  Multi-Signal   │ ──── Text + Diff + File Overlap
│     Ranker      │      (Configurable weights)
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│    Decision     │ ──── HIGH (≥0.9), MEDIUM (≥0.82), LOW
│     Engine      │
└──────┬──────────┘
```
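The Bloom filter stage in the diagram can answer "definitely not seen" cheaply, so the costlier embedding and ANN stages run only when the filter reports "might contain". The following is a minimal sketch of that fast path, not PRSense's actual implementation: the `BloomFilter` class, its default sizes, the hashing scheme, and the choice of changed file paths as keys are all assumptions made for illustration.

```typescript
// Minimal Bloom filter sketch. The class, defaults, and hashing scheme are
// hypothetical and are not taken from the PRSense source.
class BloomFilter {
  private readonly bits: Uint8Array
  private readonly size: number
  private readonly hashCount: number

  constructor(size: number = 1024, hashCount: number = 3) {
    this.size = size
    this.hashCount = hashCount
    this.bits = new Uint8Array(size)
  }

  // Simple polynomial string hash, kept within unsigned 32-bit range.
  private static polyHash(value: string, base: number): number {
    let h = 0
    for (let i = 0; i < value.length; i++) {
      h = (h * base + value.charCodeAt(i)) >>> 0
    }
    return h
  }

  // Double hashing: derive `hashCount` bit positions from two base hashes.
  private indexes(value: string): number[] {
    const h1 = BloomFilter.polyHash(value, 31)
    const h2 = (BloomFilter.polyHash(value, 131) | 1) >>> 0
    const result: number[] = []
    for (let i = 0; i < this.hashCount; i++) {
      result.push((h1 + i * h2) % this.size)
    }
    return result
  }

  add(value: string): void {
    for (const idx of this.indexes(value)) this.bits[idx] = 1
  }

  // false means "definitely never added"; true means "might have been added".
  mightContain(value: string): boolean {
    return this.indexes(value).every((idx) => this.bits[idx] === 1)
  }
}

// Assumed usage: index the file paths of known PRs, then test a new PR's files.
const seen = new BloomFilter()
seen.add('auth/login.ts')

const newPrFiles = ['auth/login.ts', 'README.md']
// Only continue to the embedding pipeline if some file might have been seen.
const worthChecking = newPrFiles.some((f) => seen.mightContain(f))
```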
---
Project Structure
```
prsenses-labs/
├── package.json            # Monorepo Root
├── PRSenses/               # Core Library & API
│   ├── src/                # Detection Logic
│   └── action/             # GitHub Action
├── prsense-vscode/         # VS Code Extension
└── prsense-analytics/      # Analytics Dashboard
```
---
Quick Start
```bash
git clone https://github.com/prsense-labs/prsense
cd prsense
npm install     # Installs dependencies
npm run build   # Builds all packages
```
> [!TIP]
> Zero setup mode: After building, the CLI works immediately using local ONNX embeddings. No API key required.
1. GitHub Action (CI/CD)
```yaml
- uses: prsense-labs/prsense@v1
  with:
    github-token: ${{ secrets.GITHUB_TOKEN }}
    embedding-provider: 'onnx'
```
2. VS Code Extension (Local Dev)
- Detects duplicates while you type.
- Extension Guide
3. CLI Tool (Manual Check)
```bash
npm run cli check my-pr.json
```
4. GitHub Bot (Webhook)
- Deploy to Vercel in 2 minutes.
- Deployment Guide
5. Library (NodeJS Integration)
```typescript
import { PRSenseDetector } from 'prsense'

// `detector` is a configured PRSenseDetector instance; `prData` is the PR to check
const result = await detector.check(prData)
```
6. Microservice (Docker/API)
- Run as a standalone REST API.
- Production Setup
---
Usage Example
```typescript
import { PRSenseDetector, createPostgresStorage } from './prsense.js'
import { createOpenAIEmbedder } from './embedders/openai.js'

// 1. Initialize
const storage = await createPostgresStorage().init()
const detector = new PRSenseDetector({
  embedder: createOpenAIEmbedder(),
  storage,
  enableCache: true
})

// 2. Check for duplicates
const result = await detector.checkDetailed({
  prId: 123,
  title: 'Fix login bug',
  description: 'Handle empty passwords',
  files: ['auth/login.ts']
})

// 3. Handle Result
if (result.type === 'DUPLICATE') {
  console.log(`Duplicate of PR #${result.originalPr}`)
  console.log(`Confidence: ${result.confidence}`)
  console.log('Breakdown:', result.breakdown)
  // Breakdown: { textSimilarity: 0.95, diffSimilarity: 0.88, ... }
}
```
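
To illustrate how a breakdown like the one above could turn into a confidence level, here is a rough sketch of a weighted multi-signal score mapped onto the HIGH (≥0.9) and MEDIUM (≥0.82) thresholds from the pipeline diagram. Only those two thresholds come from this README. The `fileOverlap` field, the Jaccard overlap measure, the default weights, and every function name are assumptions for illustration, and PRSense's real ranker may combine signals differently.

```typescript
// Hypothetical shapes; only textSimilarity and diffSimilarity appear in the README.
interface SignalBreakdown {
  textSimilarity: number // similarity of text embeddings, in [0, 1]
  diffSimilarity: number // similarity of diff embeddings, in [0, 1]
  fileOverlap: number    // overlap of changed files, in [0, 1] (assumed field)
}

interface SignalWeights {
  text: number
  diff: number
  files: number
}

type ConfidenceLevel = 'HIGH' | 'MEDIUM' | 'LOW'

// Illustrative defaults; the README only says the weights are configurable.
const DEFAULT_WEIGHTS: SignalWeights = { text: 0.5, diff: 0.3, files: 0.2 }

function unique(xs: string[]): string[] {
  return xs.filter((x, i) => xs.indexOf(x) === i)
}

// Jaccard overlap of two file lists: |A ∩ B| / |A ∪ B| (an assumed measure).
function fileOverlap(a: string[], b: string[]): number {
  const ua = unique(a)
  const ub = unique(b)
  const shared = ua.filter((f) => ub.indexOf(f) !== -1).length
  const union = ua.length + ub.length - shared
  return union === 0 ? 0 : shared / union
}

// Weighted average of the signals, normalized by the total weight.
function combineSignals(
  s: SignalBreakdown,
  w: SignalWeights = DEFAULT_WEIGHTS
): number {
  const total = w.text + w.diff + w.files
  return (
    (w.text * s.textSimilarity +
      w.diff * s.diffSimilarity +
      w.files * s.fileOverlap) /
    total
  )
}

// Thresholds taken from the pipeline diagram.
function decide(score: number): ConfidenceLevel {
  if (score >= 0.9) return 'HIGH'
  if (score >= 0.82) return 'MEDIUM'
  return 'LOW'
}

const score = combineSignals({
  textSimilarity: 0.95,
  diffSimilarity: 0.88,
  fileOverlap: fileOverlap(['auth/login.ts'], ['auth/login.ts'])
})
console.log(decide(score))
```

Normalizing by the total weight keeps the score in [0, 1] even when the weights do not sum to 1, so the same thresholds still apply after the weights are changed.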