Advanced Unicode-aware string matching library for TypeScript and JavaScript
Powerful fuzzy string matching that's insensitive to diacritics, special characters, symbols and case. Perfect for search functionality, data deduplication, and user input tolerance.
- 🌍 Unicode-aware - Handles diacritics, accents, and special characters
- 🔤 Case insensitive - Matches regardless of character case
- 🎯 Fuzzy matching - Flexible pattern matching with configurable rules
- 📝 TypeScript support - Full type definitions included
- 🚀 Modern ES modules - Works with both CommonJS and ES modules
```bash
npm install shoetest
```

```javascript
// CommonJS
const shoetest = require('shoetest');

// ES Modules (use this instead of the require above)
// import shoetest from 'shoetest';

// Basic fuzzy matching
shoetest.test('cafe', 'café'); // → true
shoetest.test('hello', 'HELLO'); // → true
shoetest.test('naif', 'naïf'); // → true
```
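For intuition about why these calls return `true`, here is a minimal sketch of accent- and case-insensitive comparison using only built-in JavaScript. It is not how shoetest is implemented: `fold` and `looseIncludes` are hypothetical helpers, and this approach only covers characters that decompose under Unicode NFD (it would not map `æ` to `ae`, for example).

```javascript
// Illustration only: plain JavaScript, not shoetest's implementation.
// fold() decomposes accented letters (NFD), drops the combining marks,
// and lowercases the result.
function fold(text) {
  return text
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    .toLowerCase();
}

// looseIncludes() reports whether the folded pattern occurs in the folded text.
function looseIncludes(pattern, text) {
  return fold(text).includes(fold(pattern));
}

console.log(looseIncludes('cafe', 'café'));   // true
console.log(looseIncludes('hello', 'HELLO')); // true
console.log(looseIncludes('naif', 'naïf'));   // true
```

Characters without a canonical decomposition need an explicit lookup table, which is where a curated reference list becomes useful.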
```javascript
const text1 = 'heļlṏ, wɵrḻɖ!';
const text2 = 'Algæ Britannicæ';
const text3 = 'The Crème de la Crème de la Crème!';

// Test if pattern exists
shoetest.test('hello world', text1); // → true
shoetest.test('hello world', [text1, text2, text3]); // → true

// Extract matching substrings
shoetest.match('Helló (wơrLd)', text1); // → ['heļlṏ, wɵrḻɖ']
shoetest.match('algae britannicae', text2); // → ['Algæ Britannicæ']
shoetest.match('creme', text3); // → ['Crème', 'Crème', 'Crème']
```
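One way to picture how a plain pattern can find accented text is a regular expression in which each pattern character is widened into a class of look-alike characters. The sketch below is an assumption for illustration, not shoetest's actual code: the `VARIANTS` table, `escapeRegExp`, and `buildLooseRegExp` are hypothetical, and the table covers only a handful of letters.

```javascript
// Hypothetical, deliberately tiny variant table (illustration only).
const VARIANTS = {
  a: 'aàáâä',
  c: 'cç',
  e: 'eèéêë',
  i: 'iìíîï',
  o: 'oòóôö',
};

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(ch) {
  return ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Widen each pattern character into a character class when variants are known.
function buildLooseRegExp(pattern) {
  const source = Array.from(pattern.toLowerCase())
    .map((ch) => (VARIANTS[ch] ? `[${VARIANTS[ch].normalize('NFC')}]` : escapeRegExp(ch)))
    .join('');
  return new RegExp(source, 'gi'); // global and case-insensitive
}

const sampleText = 'The Crème de la Crème'.normalize('NFC');
console.log(sampleText.match(buildLooseRegExp('creme'))); // [ 'Crème', 'Crème' ]
```

Because the match is taken from the original text, the returned substrings keep their accents and case, which is the behavior the `match` examples show.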
### Text replacement

```javascript
// Replace with capture groups ($1 is the matched text, so the output is unchanged here)
shoetest.replace('creme', '$1', text3);
// → 'The Crème de la Crème de la Crème!'

// Batch replacement
shoetest.replace('creme', 'Crème fraîche', [text1, text2, text3]);
// → ['heļlṏ, wɵrḻɖ!', 'Algæ Britannicæ', 'The Crème fraîche de la Crème fraîche de la Crème fraîche!']
```
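The `$1` placeholder behaves like a capture-group reference in JavaScript's built-in `String.prototype.replace`. The sketch below uses only the native API to show the idea. How shoetest builds its pattern internally is not shown in this README, so the hand-written regular expression here is an assumption standing in for a fuzzy pattern.

```javascript
// Native JavaScript only. The capture group keeps the text exactly as it
// appeared in the input, so wrapping it preserves the original accents and case.
const sentence = 'The Cr\u00e8me de la Cr\u00e8me'; // \u00e8 is "è"
const accentTolerant = /(cr[e\u00e8]me)/gi; // hand-written stand-in for a fuzzy pattern

console.log(sentence.replace(accentTolerant, '$1') === sentence); // true
console.log(sentence.replace(accentTolerant, '[$1]'));  // The [Crème] de la [Crème]
console.log(sentence.replace(accentTolerant, 'cream')); // The cream de la cream
```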
### Simplify and complexify

```javascript
// Simplify text (remove diacritics)
shoetest.simplify('Ƀuffalỗ buḟḟaḻở Ḅuƒfalo ḅuffȃlỗ buffalȏ bǖffaḻồ Ƀⓤffalo buƒfalɵ');
// → 'Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo'

// Complexify text (add random diacritics)
shoetest.complexify('This is Mars!');
// → 'Thíṣ ịṥ Mârs!' (varies each time)
```
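The two helpers are roughly inverse operations: simplifying a complexified string should give back the plain text. The sketch below illustrates that relationship with a tiny hypothetical table. `TABLE`, `simplifySketch`, and `complexifySketch` are invented for this example and are not shoetest's API or data.

```javascript
// Hypothetical mini reference table: plain letter -> decorated variants.
const TABLE = {
  a: ['à', 'á', 'â'],
  i: ['í', 'ï'],
  s: ['ś', 'š'],
};

// Reverse lookup: decorated variant -> plain letter.
const REVERSE = new Map();
for (const [plain, variants] of Object.entries(TABLE)) {
  for (const v of variants) REVERSE.set(v.normalize('NFC'), plain);
}

// Replace every known variant with its plain letter.
function simplifySketch(text) {
  return Array.from(text.normalize('NFC'))
    .map((ch) => REVERSE.get(ch) ?? ch)
    .join('');
}

// Replace each known plain letter with a randomly chosen variant.
function complexifySketch(text) {
  return Array.from(text)
    .map((ch) => {
      const variants = TABLE[ch];
      if (!variants) return ch;
      return variants[Math.floor(Math.random() * variants.length)].normalize('NFC');
    })
    .join('');
}

const noisy = complexifySketch('this is mars!');
console.log(noisy);                 // varies each run
console.log(simplifySketch(noisy)); // this is mars!
```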
### Advanced options

```javascript
// Precise matching with word boundaries
const preciseOptions = {
  charCase: true,   // Case-sensitive matching
  strict: false,    // Allow similar characters (s → $)
  diacritics: true, // Match accents exactly
  symbols: false,   // Ignore punctuation differences
  begin: '\\b',     // Word boundary at start
  end: '\\b'        // Word boundary at end
};

// Accents are written out because diacritics: true requires them to match
shoetest.test('crème de la crème', text3, preciseOptions); // → false (case mismatch)
shoetest.test('Crème de la Crème', text3, preciseOptions); // → true

// Boundaries vs begin/end comparison
shoetest.test('Alg', text2); // → true (partial match)
shoetest.test('Alg', text2, { begin: '\\b', end: '\\b' }); // → false (requires word boundary)
shoetest.test('hello world', 'helloworld', { boundaries: false }); // → true (ignores whitespace)
```

## 🔧 API Reference
### shoetest.test(pattern, text, [options])

Tests if a pattern exists in the target text(s).

Parameters:
- `pattern` (string) - Search pattern to look for
- `text` (string | string[]) - Target text(s) to search within
- `options` (object, optional) - Matching configuration

Returns: `boolean | undefined`

### shoetest.match(pattern, text, [options])

Extracts all matching substrings from text(s).

Parameters:
- `pattern` (string) - Search pattern to find
- `text` (string | string[]) - Target text(s) to search within
- `options` (object, optional) - Matching configuration

Returns: `string[] | undefined`

### shoetest.replace(pattern, replacement, text, [options])

Replaces pattern matches with new content.

Parameters:
- `pattern` (string) - Search pattern to replace
- `replacement` (string) - Replacement text (supports `$1`, `$2`, etc.)
- `text` (string | string[]) - Target text(s) to modify
- `options` (object, optional) - Matching configuration

Returns: `string | string[] | undefined`
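The usage examples earlier in this README show `test` returning `true` when at least one element of an array matches, and `replace` returning a string for a string input and an array for an array input. A minimal sketch of that input handling follows, using a native regular expression in place of shoetest's fuzzy pattern. `toArray`, `testAny`, and `replaceEach` are hypothetical names, not part of the library.

```javascript
// Normalize "string | string[]" into an array so both shapes share one code path.
function toArray(text) {
  return Array.isArray(text) ? text : [text];
}

// True when at least one text contains a match.
function testAny(regex, text) {
  return toArray(text).some((t) => t.search(regex) !== -1);
}

// Mirrors the input shape: a string in gives a string out, an array gives an array.
function replaceEach(regex, replacement, text) {
  const results = toArray(text).map((t) => t.replace(regex, replacement));
  return Array.isArray(text) ? results : results[0];
}

const cremePattern = /creme/gi;
console.log(testAny(cremePattern, ['hello', 'Creme brulee']));     // true
console.log(replaceEach(cremePattern, 'cream', 'Creme'));           // cream
console.log(replaceEach(cremePattern, 'cream', ['Creme', 'none'])); // [ 'cream', 'none' ]
```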
### shoetest.simplify(text)

Removes diacritics and accents from text.

Parameters:
- `text` (string) - Text to simplify

Returns: `string | undefined`

### shoetest.complexify(text)

Adds random diacritics and character variations.

Parameters:
- `text` (string) - Text to add variations to

Returns: `string | undefined`

### Regular expression builder

Creates a fuzzy matching regular expression.

Parameters:
- `pattern` (string) - Pattern to create regex for
- `options` (object, optional) - Matching configuration

Returns: `RegExp | undefined`

## ⚙️ Configuration Options
### strict

Default: `true`

Use strict character matching. When `false`, similar-looking characters are allowed to match, such as `s` with `$` or `e` with `€`.

### diacritics

Default: `false`

Match diacritics exactly. When `true`, accents must match precisely.

### charCase

Default: `false`

Match character case exactly. When `true`, enables case-sensitive matching.

### symbols

Default: `false`

Match symbols exactly. When `true`, punctuation must match.

### Whitespace matching

Default: `false`

Match whitespace exactly. When `true`, spaces must be identical.

### boundaries

Default: `true`

Controls whitespace handling in patterns. When `true` (default), whitespace in the pattern must match whitespace in the target text. When `false`, whitespace in the pattern is ignored, allowing matches across word boundaries.

Important: This option does NOT use regex word boundaries (`\b`). For true word boundaries, use the `begin` and `end` options with `\\b`.
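The doubled backslash matters because `begin` and `end` are given as JavaScript strings: `'\b'` is a backspace character, while `'\\b'` is the two-character sequence that a regular expression reads as a word boundary. The short sketch below uses native JavaScript only. It does not call shoetest, and the way the strings are joined into a `RegExp` is an assumption for illustration, not the library's code.

```javascript
// '\b' in a string literal is a single backspace character (code 8).
console.log('\b'.length, '\b'.charCodeAt(0)); // 1 8

// '\\b' is a backslash followed by "b": the regex word-boundary token.
console.log('\\b'.length); // 2

const begin = '\\b';
const end = '\\b';

// Without boundaries, "Alg" matches inside a longer word.
console.log(new RegExp('Alg').test('Algae Britannicae')); // true

// With boundaries on both sides, it must stand alone as a word.
console.log(new RegExp(begin + 'Alg' + end).test('Algae Britannicae')); // false
console.log(new RegExp(begin + 'Alg' + end).test('Alg Britannicae'));   // true
```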
### begin

Default: `''`

Custom regular expression pattern to prepend. Escape when necessary (e.g., `\\b`).

### end

Default: `''`

Custom regular expression pattern to append. Escape when necessary (e.g., `\\b`).

## 🧪 Testing
```bash
npm test                # Run all tests
npm run test:watch      # Run tests in watch mode
npm run test:coverage   # Generate coverage report
```

## 🛠️ Development
```bash
npm run build       # Build the library
npm run dev         # Build in watch mode
npm run clean       # Clean dist and coverage directories
npm run lint        # Run ESLint
npm run lint:fix    # Run ESLint and fix issues
npm run typecheck   # Run TypeScript type checking
npm run validate    # Run typecheck, lint, and tests
```

License: MIT
Pull requests are welcome! If you add functionality, please add unit tests to cover it.
If you wish to update the reference list, only add special characters that translate to three or fewer basic Latin characters.
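As a rough illustration of that rule, a contributor could check a proposed entry before adding it. The helper below is hypothetical and not part of the repository. It treats "basic Latin" as the printable ASCII range, which is an assumption.

```javascript
// Hypothetical check: a translation must be 1 to 3 printable ASCII characters.
function isValidTranslation(translation) {
  return /^[\x20-\x7E]{1,3}$/.test(translation);
}

console.log(isValidTranslation('ae'));   // true  (two characters)
console.log(isValidTranslation('ffl'));  // true  (three characters)
console.log(isValidTranslation('abcd')); // false (too long)
console.log(isValidTranslation('\u00e9')); // false ("é" is outside basic Latin)
```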