Showing 1-20 of 20 packages
Calculate the simhash value for a list of tokens
Vocabulary-based SimHash implementation for similarity detection
A JavaScript implementation of Charikar's hash for identification of similar documents.
Command-line tool that compares two text files using simhash
Simhash implementation for detecting near-duplicate text using various hash functions like SipHash, MD5, and SHA256
Simhash implementation for detecting near-duplicate text using various hash functions like SipHash, MD5, and SHA256
Command-line tool that compares two text files using simhash
SimHash implementation for detecting near-duplicate text using SipHash-2- function
SimHash text clustering with OutRank outlier removal and Variation of Information analysis.
Command-line tool that compares two text files using simhash
JavaScript implementation of the `simhash` algorithm, which is widely used by Google for massive web pages
Embedding Locality IDentifier - encode embeddings into sortable string IDs for vector search without vector stores, plus fast string similarity algorithms
A powerful toolkit for data structures and algorithms in TypeScript, designed for optimal performance and versatility. The toolkit provides implementations of various data structures and algorithms, with a focus on search and sort operations, caching, and
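A powerful, highly customizable echo cave plugin. Supports rich media types, duplicate-content detection, AI analysis, manual review, user nicknames, data migration, and dual local/S3 file storage backends.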
An HTTP wrapper around the article-parser module
Detect reusable/duplicate React Native code (components, hooks, styles, utils) and suggest refactors. Ships as a CLI + Node API.
This JS library contains my experiments around Google Chrome's Privacy Sandbox.
LightSearch
A library for comparing the similarity of two strings
SimHash fingerprinting for fuzzy text deduplication - native C++ for Node.js
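
Most of the packages listed above describe the same core idea: compute a SimHash fingerprint from a list of tokens, then compare fingerprints to find near-duplicate text. As a rough illustration only, that idea can be sketched in Python as below. This is not the API of any listed package; the function names `simhash` and `hamming_distance`, the choice of MD5 as the per-token hash, and the 64-bit width are all assumptions made for the example.

```python
import hashlib


def simhash(tokens, bits=64):
    """Return a `bits`-wide SimHash fingerprint for an iterable of string tokens.

    Assumption for this sketch: each token is hashed with MD5 and truncated
    to `bits` bits, so `bits` must be a multiple of 8 and at most 128.
    """
    counts = [0] * bits
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        # Each token votes +1 on bit positions it sets and -1 on the rest.
        for i in range(bits):
            counts[i] += 1 if (h >> i) & 1 else -1

    # A fingerprint bit is 1 when the votes for that position are positive.
    fingerprint = 0
    for i, votes in enumerate(counts):
        if votes > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    doc_a = "the quick brown fox jumps over the lazy dog".split()
    doc_b = "the quick brown fox jumped over the lazy dog".split()
    print(hamming_distance(simhash(doc_a), simhash(doc_b)))
```

Texts that share most of their tokens tend to produce fingerprints with a small Hamming distance, which is what lets these tools flag near-duplicates by comparing short integers instead of full documents. Real implementations usually also weight tokens (for example by frequency) and may use a different hash function, such as the SipHash, MD5, or SHA256 options mentioned in the listing.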