Semantic duplicate pattern detection for AI-generated code. Finds semantically similar but syntactically different implementations that waste AI context tokens and confuse models.

Install with `npm install @aiready/pattern-detect`.
```
USER
  |
  v
CLI (orchestrator)
  |
  v
HUB (core)
  |
  +-- PATTERN DETECT (YOU ARE HERE) - Ready
  +-- CONTEXT ANALYZER - Ready
  +-- CONSISTENCY - Ready
  +-- DOC DRIFT - Soon
```
Currently Supported (64% market coverage):
- ✅ TypeScript (`.ts`, `.tsx`) - AST-based pattern extraction
- ✅ JavaScript (`.js`, `.jsx`) - AST-based pattern extraction
- ✅ Python (`.py`) - Function/class pattern extraction, similarity scoring

Roadmap:
- Java (Q3 2026) - Method/class patterns, Spring annotations
- Go (Q4 2026) - Function patterns, interface implementations
- Rust (Q4 2026) - Function/trait patterns, macro detection
- C# (Q1 2027) - Method/class patterns, LINQ queries

Zero config, works out of the box:

```bash
# Run without installation (recommended)
npx @aiready/pattern-detect ./src
```

Input: Path to your source code directory

```bash
aiready-patterns ./src
```

Output: Terminal report + optional JSON file (saved to the `.aiready/` directory)

```
Duplicate Pattern Analysis
━━━━━━━━━━━━━━━━━━━━━━━━━━━
Files analyzed: 47
Duplicate patterns: 12 files with 23 issues
Wasted tokens: 8,450

CRITICAL (6 files)
  src/handlers/users.ts - 4 duplicates (1,200 tokens)
  src/handlers/posts.ts - 3 duplicates (950 tokens)
```

- ✅ Auto-excludes test files (`**/*.test.*`, `**/*.spec.*`, `**/__tests__/**`)
- ✅ Auto-excludes build outputs (`dist/`, `build/`, `.next/`)
- ✅ Auto-excludes dependencies (`node_modules/`)
- ✅ Adaptive threshold: Adjusts similarity detection based on codebase size
- ✅ Pattern classification: Automatically categorizes duplicates (API handlers, validators, etc.)

> Override defaults with `--include-tests` or `--exclude` as needed

## What It Does

AI tools generate similar code in different ways because they lack awareness of your codebase patterns. This tool:
- Semantic detection: Finds functionally similar code (not just copy-paste) using Jaccard similarity on AST tokens
- Pattern classification: Groups duplicates by type (API handlers, validators, utilities, etc.)
- Token cost analysis: Shows wasted AI context budget
- Refactoring guidance: Suggests specific fixes per pattern type

The tool uses Jaccard similarity to compare code semantically:
1. Parses TypeScript/JavaScript files into Abstract Syntax Trees (AST)
2. Extracts semantic tokens (identifiers, operators, keywords) from each function
3. Calculates Jaccard similarity between token sets: |A ∩ B| / |A ∪ B|
4. Groups similar functions above the similarity threshold

This approach catches duplicates even when variable names or minor logic differ.
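
The Jaccard step can be sketched in a few lines of TypeScript. This is a minimal illustration, not the package's actual implementation: `tokenize` is a hypothetical regex-based stand-in for the AST token extraction described above, and `jaccard` simply applies the set formula to two token sets. The two sample functions are made up for the example.

```typescript
// Hypothetical stand-in for AST token extraction: splits source text into
// identifiers, numbers, and single-character operators. The tool is described
// as extracting tokens from an AST, which this sketch does not do.
function tokenize(source: string): Set<string> {
  const matches: string[] = source.match(/[A-Za-z_$][\w$]*|\d+|[^\s\w]/g) ?? [];
  return new Set<string>(matches);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined here as 0 for two empty sets.
function jaccard(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared++;
  });
  const unionSize = a.size + b.size - shared;
  return unionSize === 0 ? 0 : shared / unionSize;
}

// Two made-up handlers that differ only in the entity they fetch.
const getUser = "function getUser(id) { return db.users.find(id); }";
const getPost = "function getPost(id) { return db.posts.find(id); }";

const score = jaccard(tokenize(getUser), tokenize(getPost));
console.log(score.toFixed(2)); // 11 shared tokens out of 15 distinct ones
```

Because the comparison works on sets, repeated tokens count once, so two functions with the same vocabulary but different lengths can still score highly.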

```
Files analyzed: 47
Duplicate patterns found: 23
Token cost (wasted): 8,450

api-handler   12 patterns
validator      8 patterns
utility        3 patterns

1. 87% api-handler
   src/api/users.ts:15 ↔ src/api/posts.ts:22
   432 tokens wasted
   → Create generic handler function
```

## Key Options

```bash
# Basic usage
aiready patterns ./src

# Focus on obvious duplicates
aiready patterns ./src --similarity 0.9

# Include smaller patterns
aiready patterns ./src --min-lines 3

# Export results (saved to .aiready/ by default)
aiready patterns ./src --output json

# Or specify custom path
aiready patterns ./src --output json --output-file custom-report.json
```

> Output Files: By default, all output files are saved to the `.aiready/` directory in your project root. You can override this with `--output-file`.

## Tuning Guide

| Parameter | Default | Effect | Use When |
|-----------|---------|--------|----------|
| `--similarity` | 0.4 | Similarity threshold (0-1) | Want more/less sensitive detection |
| `--min-lines` | 5 | Minimum lines per pattern | Include/exclude small functions |
| `--min-shared-tokens` | 8 | Tokens that must match | Control comparison strictness |
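
One way to picture how these three parameters interact: the line and shared-token minimums act as cheap gates, and only pairs that pass them are compared against the similarity threshold. The sketch below is an assumption about that flow, not the tool's source. The `Block` and `Thresholds` types, `isDuplicate`, and the sample file names are hypothetical; the default values are the ones listed in the table.

```typescript
// Hypothetical shape for an extracted code block (not the package's real type).
interface Block {
  file: string;
  lines: number;
  tokens: Set<string>;
}

interface Thresholds {
  minSimilarity: number;   // --similarity
  minLines: number;        // --min-lines
  minSharedTokens: number; // --min-shared-tokens
}

// Defaults as documented in the parameter table.
const DEFAULTS: Thresholds = { minSimilarity: 0.4, minLines: 5, minSharedTokens: 8 };

function countShared(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared++;
  });
  return shared;
}

// Assumed flow: size gate, then overlap gate, then the Jaccard threshold.
function isDuplicate(a: Block, b: Block, opts: Thresholds = DEFAULTS): boolean {
  if (a.lines < opts.minLines || b.lines < opts.minLines) return false;
  const shared = countShared(a.tokens, b.tokens);
  if (shared < opts.minSharedTokens) return false;
  const union = a.tokens.size + b.tokens.size - shared;
  return union > 0 && shared / union >= opts.minSimilarity;
}

// Illustrative data: 8 shared tokens out of 12 distinct ones (similarity 0.67).
const words = (s: string): Set<string> => new Set<string>(s.split(" "));
const usersHandler: Block = { file: "users.ts", lines: 12, tokens: words("t1 t2 t3 t4 t5 t6 t7 t8 t9 t10") };
const postsHandler: Block = { file: "posts.ts", lines: 11, tokens: words("t1 t2 t3 t4 t5 t6 t7 t8 u1 u2") };

console.log(isDuplicate(usersHandler, postsHandler));                                       // true
console.log(isDuplicate(usersHandler, postsHandler, { ...DEFAULTS, minSimilarity: 0.8 }));  // false
console.log(isDuplicate(usersHandler, postsHandler, { ...DEFAULTS, minSharedTokens: 12 })); // false
```

Under this reading, raising `--min-lines` or `--min-shared-tokens` rejects pairs before any similarity ratio is computed.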

Want more results? (catch subtle duplicates)

```bash
# Lower similarity threshold
aiready patterns ./src --similarity 0.3

# Include smaller functions
aiready patterns ./src --min-lines 3

# Both together
aiready patterns ./src --similarity 0.3 --min-lines 3
```

Want fewer but higher quality results? (focus on obvious duplicates)

```bash
# Higher similarity threshold
aiready patterns ./src --similarity 0.8

# Larger patterns only
aiready patterns ./src --min-lines 10
```

Analysis too slow? (optimize for speed)

```bash
# Focus on substantial functions
aiready patterns ./src --min-lines 10

# Reduce comparison candidates
aiready patterns ./src --min-shared-tokens 12
```

| Adjustment | More Results | Faster | Higher Quality | Tradeoff |
|------------|--------------|--------|----------------|----------|
| Lower `--similarity` | ✅ | ❌ | ❌ | More false positives |
| Lower `--min-lines` | ✅ | ❌ | ❌ | Includes trivial duplicates |
| Higher `--similarity` | ❌ | ✅ | ✅ | Misses subtle duplicates |
| Higher `--min-lines` | ❌ | ✅ | ✅ | Misses small but important patterns |
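
To make the "More Results" column concrete, the short sketch below counts how many candidate pairs would be reported at different thresholds. The scores are invented for illustration and `reportedAt` is a hypothetical helper; nothing here comes from the tool's output.

```typescript
// Hypothetical similarity scores for seven candidate pairs (illustrative data only).
const scores: number[] = [0.95, 0.87, 0.62, 0.45, 0.41, 0.33, 0.31];

// Count how many candidate pairs meet or exceed a given similarity threshold.
function reportedAt(threshold: number, pairScores: number[]): number {
  return pairScores.filter((s) => s >= threshold).length;
}

for (const threshold of [0.3, 0.4, 0.8]) {
  console.log(`--similarity ${threshold}: ${reportedAt(threshold, scores)} pairs`);
}
```

With this data, a threshold of 0.3 reports all seven pairs, 0.4 reports five, and 0.8 reports only the two closest matches.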

First run (broad discovery):

```bash
aiready patterns ./src  # Default settings
```

Focus on critical issues (production ready):

```bash
aiready patterns ./src --similarity 0.8 --min-lines 8
```

Catch everything (comprehensive audit):

```bash
aiready patterns ./src --similarity 0.3 --min-lines 3
```

Performance optimization (large codebases):

```bash
aiready patterns ./src --min-lines 10 --min-shared-tokens 10
```

## Configuration File

Create an `aiready.json` or `aiready.config.json` file in your project root:

```json
{
  "scan": {
    "include": ["src/**/*.{ts,tsx,js,jsx}"],
    "exclude": ["**/*.test.*", "**/dist/**"]
  },
  "tools": {
    "pattern-detect": {
      "minSimilarity": 0.6,
      "minLines": 8,
      "maxResults": 20,
      "minSharedTokens": 10,
      "maxCandidatesPerBlock": 100
    }
  },
  "output": {
    "format": "console",
    "file": ".aiready/pattern-report.json"
  }
}
```

Configuration Options:

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `minSimilarity` | number | 0.4 | Similarity threshold (0-1) |
| `minLines` | number | 5 | Minimum lines to consider |
| `maxResults` | number | 10 | Max results to display in console |
| `minSharedTokens` | number | 8 | Min tokens that must match |
| `maxCandidatesPerBlock` | number | 100 | Performance tuning limit |
| `approx` | boolean | true | Use approximate candidate selection |
| `severity` | string | 'all' | Filter: 'critical', 'high', 'medium', 'all' |

Use the unified CLI for all AIReady tools:

```bash
npm install -g @aiready/cli

# Pattern detection
aiready patterns ./src

# Context analysis (token costs, fragmentation)
aiready context ./src

# Consistency checking (naming, patterns)
aiready consistency ./src

# Full codebase analysis
aiready scan ./src
```

Related packages:
- @aiready/cli - Unified CLI with all tools
- @aiready/context-analyzer - Context window cost analysis
- @aiready/consistency - Consistency checking
Try AIReady tools online and optimize your codebase: getaiready.dev
---
Made by the AIReady team | Website