detector of copy/paste in files
npm install jscpd





> Copy/paste detector for programming source code, supports 150+ formats.
Copy/paste is a common source of technical debt in many projects. jscpd finds duplicated blocks in code written in more than 150 programming languages and digital document formats.
jscpd implements the Rabin-Karp algorithm to search for duplications.
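To make the idea concrete, here is a minimal sketch of Rabin-Karp-style duplicate detection over windows of tokens. It is not jscpd's actual implementation: the function names `findDuplicateWindows` and `sameWindow`, the hash base, and the modulus are all assumptions chosen for illustration.

```typescript
// Illustrative sketch only, not jscpd's real code.
// Returns pairs of start indexes whose windows of `windowSize` tokens are identical.
// A rolling hash lets each window's hash be derived from the previous one in O(1).
function findDuplicateWindows(tokens: string[], windowSize: number): Array<[number, number]> {
  const BASE = 257;        // assumed hash base
  const MOD = 1000000007;  // assumed prime modulus
  const result: Array<[number, number]> = [];
  if (windowSize <= 0 || tokens.length < windowSize) return result;

  // Map each distinct token to a small integer so it can be hashed numerically.
  const ids = new Map<string, number>();
  const codes = tokens.map((t) => {
    if (!ids.has(t)) ids.set(t, ids.size + 1);
    return ids.get(t)!;
  });

  // BASE^(windowSize - 1) mod MOD, used to remove the outgoing token.
  let highPow = 1;
  for (let i = 1; i < windowSize; i++) highPow = (highPow * BASE) % MOD;

  // Hash of the first window.
  let hash = 0;
  for (let i = 0; i < windowSize; i++) hash = (hash * BASE + codes[i]) % MOD;

  const seen = new Map<number, number[]>(); // hash -> start indexes seen so far
  for (let start = 0; ; start++) {
    const candidates = seen.get(hash) ?? [];
    for (const prev of candidates) {
      // Equal hashes may be collisions, so compare the tokens themselves.
      if (sameWindow(tokens, prev, start, windowSize)) result.push([prev, start]);
    }
    candidates.push(start);
    seen.set(hash, candidates);

    const next = start + windowSize;
    if (next >= codes.length) break;
    // Roll the hash: drop codes[start], append codes[next].
    hash = (hash - ((codes[start] * highPow) % MOD) + MOD) % MOD;
    hash = (hash * BASE + codes[next]) % MOD;
  }
  return result;
}

function sameWindow(tokens: string[], a: number, b: number, size: number): boolean {
  for (let i = 0; i < size; i++) {
    if (tokens[a + i] !== tokens[b + i]) return false;
  }
  return true;
}
```

For the token sequence `a b c x a b c` and a window of 3 tokens, the sketch reports a single pair of start indexes, 0 and 4.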
- Features
- Getting started
- Installation
- Usage
- JSCPD Server
- Options
- Config File
- Ignored Blocks
- Reporters
- HTML
- Badge
- PMD CPD XML
- JSON
- API
- Changelog
- Who uses jscpd
- Contributors
- Backers
- Sponsors
- License
Getting started
Installation
```bash
$ npm install -g jscpd
```
Usage
```bash
$ npx jscpd /path/to/source
```
or
```bash
$ jscpd /path/to/code
```
or
```bash
$ jscpd --pattern "src/**/*.js"
```

JSCPD Server
If you need a standalone application that exposes an API for detecting code duplication, you can use jscpd-server.
It lets you integrate duplication detection into your services or tools through an HTTP API.
Options
Pattern
Glob pattern for the files to scan.
- Cli options: --pattern, -p
- Type: string
- Default: `"**/*"`

Example:
```bash
$ jscpd --pattern "**/*.js"
```

Min Tokens
Minimal block size of code in tokens. Blocks with fewer than min-tokens tokens are skipped.
- Cli options: --min-tokens, -k
- Type: number
- Default: 50

This option is called minTokens in the config file.

Min Lines
Minimal block size of code in lines. Blocks with fewer than min-lines lines are skipped.
- Cli options: --min-lines, -l
- Type: number
- Default: 5
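The two size limits above can be pictured as a simple filter over candidate blocks. This is a sketch under assumptions: the `Block` shape and the `isLargeEnough` helper are hypothetical names, not part of jscpd's API. Only the defaults (50 tokens, 5 lines) come from the option descriptions.

```typescript
// Hypothetical shape of a candidate block; not a jscpd type.
interface Block {
  tokens: number; // number of tokens in the block
  lines: number;  // number of lines the block spans
}

interface SizeLimits {
  minTokens: number;
  minLines: number;
}

// Defaults taken from the option descriptions above.
const DEFAULT_LIMITS: SizeLimits = { minTokens: 50, minLines: 5 };

// A block is kept only if it reaches both minimums; smaller blocks are skipped.
function isLargeEnough(block: Block, limits: SizeLimits = DEFAULT_LIMITS): boolean {
  return block.tokens >= limits.minTokens && block.lines >= limits.minLines;
}
```

Raising either limit trades recall for less noise: short, incidental repetitions stop being reported.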
Max Lines
Maximum file size in lines. Files with more than max-lines lines are skipped.
- Cli options: --max-lines, -x
- Type: number
- Default: 1000
Max Size
Maximum file size in bytes. Files larger than max-size are skipped.
- Cli options: --max-size, -z
- Type: string
- Default: 100kb
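Because max-size is given as a string such as 100kb, a tool has to turn it into a byte count before comparing file sizes. A minimal sketch follows. `parseSize` and `isTooBig` are hypothetical helpers, and treating 1 kb as 1024 bytes is an assumption rather than documented jscpd behaviour.

```typescript
// Hypothetical helper: converts strings like "100kb" or "2mb" to bytes.
// Assumption: units are binary (1 kb = 1024 bytes).
function parseSize(size: string): number {
  const match = /^(\d+(?:\.\d+)?)\s*(b|kb|mb|gb)?$/i.exec(size.trim());
  if (!match) throw new Error(`Unrecognised size: ${size}`);
  const value = parseFloat(match[1]);
  const unit = (match[2] ?? "b").toLowerCase();
  const factors: Record<string, number> = { b: 1, kb: 1024, mb: 1024 ** 2, gb: 1024 ** 3 };
  return Math.floor(value * factors[unit]);
}

// A file is skipped when it is bigger than the limit.
function isTooBig(fileSizeInBytes: number, maxSize: string = "100kb"): boolean {
  return fileSizeInBytes > parseSize(maxSize);
}
```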
Threshold
The threshold for the duplication level. If the current level of duplication is higher than the threshold, jscpd exits with an error.
- Cli options: --threshold, -t
- Type: number
- Default: null
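The threshold check can be sketched as a single comparison. `exceedsThreshold` is a hypothetical helper, and treating the threshold as a percentage of duplicated lines is an assumption based on the percentage values jscpd prints in its reports.

```typescript
// Hypothetical helper: decides whether a run should fail.
// `percentage` is the measured duplication level; `threshold` is the configured limit,
// or null when no threshold is set (the default).
function exceedsThreshold(percentage: number, threshold: number | null): boolean {
  if (threshold === null) return false; // no threshold configured: never fail on level
  return percentage > threshold;
}
```

With a threshold of 0, any non-zero duplication level counts as exceeding it under this reading.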
Config
The path to the configuration file. The config must be in JSON format. The config file supports the same options as the CLI.
- Cli options: --config, -c
- Type: path
- Default: null
Ignore
Glob patterns for files to exclude from analysis. Separate multiple globs with commas.

Example:
```bash
$ jscpd --ignore "**/*.min.js,**/*.map" /path/to/files
```
- Cli options: --ignore, -i
- Type: string
- Default: null
Reporters
The list of reporters. Reporters output information about clones and the duplication detection process.

Available reporters:
- console - report clones to the console;
- consoleFull - report clones to the console with blocks of code;
- json - output a jscpd-report.json file with the clones report in JSON format;
- xml - output a jscpd-report.xml file with the clones report in XML format;
- csv - output a jscpd-report.csv file with the clones report in CSV format;
- markdown - output a jscpd-report.md file with the clones report in Markdown format;
- html - generate an HTML report in the html/ folder;
- sarif - generate a report in SARIF format (https://github.com/oasis-tcs/sarif-spec) and save it to the jscpd-sarif.json file;
- verbose - output a lot of debug information to the console;

> Note: A reporter can be developed manually, see the @jscpd/finder package.

- Cli options: --reporters, -r
- Type: string
- Default: console
Output
The path to the directory for reports. JSON and XML reports will be saved there.
- Cli options: --output, -o
- Type: path
- Default: ./report/

Mode
The mode of detection quality.
- strict - use all types of symbols as tokens, skip only blocks marked as ignored.
- mild - skip blocks marked as ignored, new lines, and empty symbols.
- weak - skip blocks marked as ignored, new lines, empty symbols, and comments.

> Note: A mode can be developed manually, see the API section.

- Cli options: --mode, -m
- Type: string
- Default: mild
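One way to picture the three modes is as predicates that decide which tokens to keep. The sketch below is illustrative: the `Token` shape, its `type` values, and the `ignored` flag are assumptions made for this example and may not match jscpd's real token model.

```typescript
// Hypothetical token model for illustration only.
type TokenType = "code" | "comment" | "new_line" | "empty";

interface Token {
  type: TokenType;
  value: string;
  ignored?: boolean; // true when the token sits inside a block marked as ignored
}

// A mode returns true when the token should take part in detection.
type Mode = (token: Token) => boolean;

// strict: keep every kind of token, skip only ignored blocks.
const strict: Mode = (token) => !token.ignored;

// mild: additionally skip new lines and empty symbols.
const mild: Mode = (token) =>
  strict(token) && token.type !== "new_line" && token.type !== "empty";

// weak: additionally skip comments.
const weak: Mode = (token) => mild(token) && token.type !== "comment";

const modes: Record<string, Mode> = { strict, mild, weak };

function applyMode(tokens: Token[], name: string = "mild"): Token[] {
  const mode = modes[name];
  if (!mode) throw new Error(`Unknown mode: ${name}`);
  return tokens.filter(mode);
}
```

Each mode builds on the previous one, so a custom mode in this style only has to add its own extra condition.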
Format
The list of formats in which to detect duplications. Over 150 formats are available.

Example:
```bash
$ jscpd --format "php,javascript,markup,css" /path/to/files
```
- Cli options: --format, -f
- Type: string
- Default: {all formats}
Blame
Get information about the authors and dates of duplicated blocks from git.
- Cli options: --blame, -b
- Type: boolean
- Default: false
Silent
Don't write a lot of information to the console.

Example:
```
$ jscpd /path/to/source --silent
Duplications detection: Found 60 exact clones with 3414(46.81%) duplicated lines in 100 (31 formats) files.
Execution Time: 1381.759ms
```
- Cli options: --silent, -s
- Type: boolean
- Default: false
Absolute
Use absolute paths in reports.
- Cli options: --absolute, -a
- Type: boolean
- Default: false
Ignore Case
Ignore the case of symbols in code (experimental).
- Cli options: --ignoreCase
- Type: boolean
- Default: false

No Symlinks
Do not follow symlinks.
- Cli options: --noSymlinks, -n
- Type: boolean
- Default: false

Skip Local
Use this option to detect duplications across different folders only. For --skipLocal to work correctly, provide a list of paths with more than one item.

Example:
```bash
jscpd --skipLocal /path/to/folder1/ /path/to/folder2/
```
This detects clones between separate folders only; clones within the same folder are skipped.
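The effect of --skipLocal can be sketched as a filter over detected clones: a clone is kept only when its two fragments live under different root folders. The `ClonePair` shape and both helper names are hypothetical, and jscpd's internal logic may differ.

```typescript
// Hypothetical minimal clone shape: paths of the two duplicated fragments.
interface ClonePair {
  firstFile: string;
  secondFile: string;
}

// Returns the first root folder that contains the file, or undefined.
function rootOf(file: string, roots: string[]): string | undefined {
  return roots.find((root) => {
    const prefix = root.endsWith("/") ? root : root + "/";
    return file.startsWith(prefix);
  });
}

// Keeps only clones whose fragments belong to different root folders.
function skipLocalClones(clones: ClonePair[], roots: string[]): ClonePair[] {
  return clones.filter((clone) => {
    const a = rootOf(clone.firstFile, roots);
    const b = rootOf(clone.secondFile, roots);
    return a !== undefined && b !== undefined && a !== b;
  });
}
```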
- Cli options: --skipLocal
- Type: boolean
- Default: false

Formats Extensions
Define the list of formats with their file extensions. Over 150 formats are available.

In the following example jscpd will analyze `*.es` and `*.es6` files as javascript and `*.dt` files as dart:
```bash
$ jscpd --formats-exts "javascript:es,es6;dart:dt" /path/to/code
```
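The value passed to --formats-exts in the example above follows a `format:ext,ext;format:ext` layout. A small parser sketch shows how such a string maps to a lookup table. `parseFormatsExts` is a hypothetical helper written for this illustration, not jscpd's own function.

```typescript
// Hypothetical parser for strings such as "javascript:es,es6;dart:dt".
// Returns a map from format name to its list of file extensions.
function parseFormatsExts(value: string): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const entry of value.split(";")) {
    if (entry.trim() === "") continue; // tolerate empty entries and trailing separators
    const [format, exts = ""] = entry.split(":");
    result[format.trim()] = exts
      .split(",")
      .map((ext) => ext.trim())
      .filter((ext) => ext.length > 0);
  }
  return result;
}
```

Since `;` is also a command separator in most shells, the whole value should be quoted on the command line.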
> Note: formats defined in this option redefine the default configuration; you should define all the formats you need manually, or create two configurations for running jscpd.

- Cli options: --formats-exts
- Type: string
- Default: null

Store
Stores are used to collect information about code. By default, all information is collected in memory.

Available stores:
- leveldb - stores all data in files. Recommended for big repositories. Install @jscpd/leveldb-store before using it;

> Note: A store can be developed manually, see the @jscpd/finder package and @jscpd/leveldb-store as an example.

- Cli options: --store
- Type: string
- Default: null

Ignore Pattern
Ignore code blocks matching the regexp patterns.
- Cli options: --ignore-pattern
- Type: string
- Default: null

Example:
```
$ jscpd /path/to/source --ignore-pattern "import.*from\s*'.*'"
```
Excludes import statements from the calculation.

Config File
Put a .jscpd.json file in the root of the project:
```json
{
  "threshold": 0,
  "reporters": ["html", "console", "badge"],
  "ignore": ["**/__snapshots__/**"],
  "absolute": true
}
```
You can also use a jscpd section in package.json:
```json
{
  ...
  "jscpd": {
    "threshold": 0.1,
    "reporters": ["html", "console", "badge"],
    "ignore": ["**/__snapshots__/**"],
    "absolute": true,
    "gitignore": true
  }
  ...
}
```

Exit code
By default, the tool exits with code 0 even when code duplications are detected. This behaviour can be changed by specifying a custom exit code for error states.

Example:
```bash
jscpd --exitCode 1 .
```
- Cli options: --exitCode
- Type: number
- Default: 0
Ignored Blocks
Mark blocks in code as ignored:
```javascript
/* jscpd:ignore-start */
import lodash from 'lodash';
import React from 'react';
import {User} from './models';
import {UserService} from './services';
/* jscpd:ignore-end */
```

Reporters
HTML
Badge
More info: jscpd-badge-reporter
PMD CPD XML
JSON
```json
{
  "duplicates": [{
    "format": "javascript",
    "lines": 27,
    "fragment": "...code fragment... ",
    "tokens": 0,
    "firstFile": {
      "name": "tests/fixtures/javascript/file2.js",
      "start": 1,
      "end": 27,
      "startLoc": {
        "line": 1,
        "column": 1
      },
      "endLoc": {
        "line": 27,
        "column": 2
      }
    },
    "secondFile": {
      "name": "tests/fixtures/javascript/file1.js",
      "start": 1,
      "end": 24,
      "startLoc": {
        "line": 1,
        "column": 1
      },
      "endLoc": {
        "line": 24,
        "column": 2
      }
    }
  }],
  "statistic": {
    "detectionDate": "2018-11-09T15:32:02.397Z",
    "formats": {
      "javascript": {
        "sources": {
          "/path/to/file": {
            "lines": 24,
            "sources": 1,
            "clones": 1,
            "duplicatedLines": 26,
            "percentage": 45.33,
            "newDuplicatedLines": 0,
            "newClones": 0
          }
        },
        "total": {
          "lines": 297,
          "sources": 1,
          "clones": 1,
          "duplicatedLines": 26,
          "percentage": 45.33,
          "newDuplicatedLines": 0,
          "newClones": 0
        }
      }
    },
    "total": {
      "lines": 297,
      "sources": 6,
      "clones": 5,
      "duplicatedLines": 26,
      "percentage": 45.33,
      "newDuplicatedLines": 0,
      "newClones": 0
    }
  }
}
```
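A JSON report like the one above is easy to post-process. The sketch below declares only the fields it reads, all of which appear in the sample report, and turns them into summary lines. The `JscpdReport` and `ReportFile` type names and the `summarize` helper are hypothetical and are not exported by jscpd.

```typescript
// Partial, hypothetical typing of the JSON report: only the fields used below.
interface ReportFile {
  name: string;
  start: number;
  end: number;
}

interface JscpdReport {
  duplicates: Array<{
    format: string;
    lines: number;
    firstFile: ReportFile;
    secondFile: ReportFile;
  }>;
  statistic: {
    total: { lines: number; clones: number; duplicatedLines: number; percentage: number };
  };
}

// Builds one headline with the totals, then one line per duplicate.
function summarize(report: JscpdReport): string[] {
  const { total } = report.statistic;
  const lines = [
    `${total.clones} clones, ${total.duplicatedLines} duplicated lines of ${total.lines} (${total.percentage}%)`,
  ];
  for (const d of report.duplicates) {
    lines.push(
      `[${d.format}] ${d.firstFile.name}:${d.firstFile.start}-${d.firstFile.end} ~ ` +
        `${d.secondFile.name}:${d.secondFile.start}-${d.secondFile.end} (${d.lines} lines)`
    );
  }
  return lines;
}
```

In practice the report object would come from `JSON.parse` applied to the contents of the jscpd-report.json file, cast to `JscpdReport`.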
API
To integrate copy/paste detection into your application, you can use the programming API:

jscpd Promise API
```typescript
import {IClone} from '@jscpd/core';
import {jscpd} from 'jscpd';

// jscpd takes an argv-style array and resolves with the detected clones.
const clones: Promise<IClone[]> = jscpd(process.argv);
```

jscpd async/await API
```typescript
import {IClone} from '@jscpd/core';
import {jscpd} from 'jscpd';

(async () => {
  // The two leading empty strings fill the slots that process.argv uses for the node binary and script path.
  const clones: IClone[] = await jscpd(['', '', __dirname + '/../fixtures', '-m', 'weak', '--silent']);
  console.log(clones);
})();
```

detectClones API
```typescript
import {detectClones} from "jscpd";

(async () => {
  const clones = await detectClones({
    path: [
      __dirname + '/../fixtures'
    ],
    silent: true
  });
  console.log(clones);
})()
```

detectClones with persist store
```typescript
import {detectClones} from "jscpd";
import {IMapFrame, MemoryStore} from "@jscpd/core";

(async () => {
  // The same store instance is passed to both runs, so the second run sees data from the first.
  const store = new MemoryStore<IMapFrame>();

  await detectClones({
    path: [
      __dirname + '/../fixtures'
    ],
  }, store);

  await detectClones({
    path: [
      __dirname + '/../fixtures'
    ],
    silent: true
  }, store);
})()
```

For deep customisation of the detection process you can build your own tool:
If you are going to detect clones in the file system, you can use @jscpd/finder to build a powerful detector.
To detect clones in a browser or another non-Node.js environment, you can build your own solution based on @jscpd/core.

Contributors
This project exists thanks to all the people who contribute.

Backers
Thank you to all our backers! 🙏

Sponsors
Support this project by becoming a sponsor. Your logo will show up here with a link to your website.

License
MIT © Andrey Kucherenko