Ultra-fast content moderation with SymSpell fuzzy matching. Multi-language profanity detection with 100x performance improvement. 3,600+ entries across 17 languages.
Installation
`bash
npm install content-shield
`
`bash
yarn add content-shield
`
`bash
pnpm add content-shield
`
Quick Start
Basic Usage
`typescript
import { detect, filter, isClean, configure } from 'content-shield'
import { EN } from 'content-shield/languages/en'
// Configure once with language data
await configure({ languageData: { en: EN } })
// Quick naughty word check
const isProfane = !(await isClean('Your text here'))
// Get detailed analysis
const result = await detect('Your text here')
console.log(result.hasProfanity, result.matches)
// Filter naughty content
const cleanText = await filter('Your text here')
`
Advanced Usage
`typescript
import {
  ContentShieldDetector,
  SeverityLevel,
  ProfanityCategory,
  FilterMode
} from 'content-shield'
import { EN, ES, FR } from 'content-shield/languages'

// Create a custom multi-language detector
const detector = new ContentShieldDetector({
  languages: ['en', 'es', 'fr'],
  languageData: { en: EN, es: ES, fr: FR },
  minSeverity: SeverityLevel.MODERATE,
  categories: [
    ProfanityCategory.HATE_SPEECH,
    ProfanityCategory.VIOLENCE
  ],
  fuzzyMatching: true,
  fuzzyThreshold: 0.8
})

// Analyze text
const text = 'Text to analyze'
const analysis = await detector.analyze(text)

// Filter with different modes (comments show what a matched word becomes)
const censored = await detector.filter(text, FilterMode.CENSOR) // e.g. "f***"
const removed = await detector.filter(text, FilterMode.REMOVE) // ""
const replaced = await detector.filter(text, FilterMode.REPLACE) // "[filtered]"
`
Bundle Size
ContentShield is optimized for tree-shaking - only the languages you import are included in your bundle!
- English only: ~407KB (code + EN data)
- 3 languages: ~671KB (code + 3 languages)
- All 17 languages: ~2.3MB (code + all data)
This means your users only download what they need. Import Spanish? Only Spanish data is bundled. Perfect for keeping those bundle sizes clean (unlike the words we're detecting)! 📦✨
Common Use Cases
Chat Message Moderation
`typescript
import { detect, filter, FilterMode, configure } from 'content-shield'
import { EN } from 'content-shield/languages/en'
// Configure once at app startup
await configure({ languageData: { en: EN } })
async function moderateMessage(message: string) {
  const result = await detect(message)
  if (result.hasProfanity) {
    // Uh oh, someone's been naughty!
    console.log(`Naughty words detected: ${result.totalMatches} matches`)
    // Return filtered message
    return await filter(message, FilterMode.CENSOR)
  }
  return message
}
`
Username Validation
`typescript
import { isClean, configure } from 'content-shield'
import { EN } from 'content-shield/languages/en'
// Configure once at app startup
await configure({ languageData: { en: EN } })
async function validateUsername(username: string) {
  const clean = await isClean(username)
  if (!clean) {
    throw new Error('Username contains naughty content')
  }
  return username
}
`
Severity-Based Content Filtering
`typescript
import { ContentShieldDetector, SeverityLevel } from 'content-shield'
import { EN, ES } from 'content-shield/languages'
const detector = new ContentShieldDetector({
  minSeverity: SeverityLevel.HIGH, // Only catch the naughtiest words
  languages: ['en', 'es'],
  languageData: { en: EN, es: ES }
})

async function filterUserContent(content: string) {
  const result = await detector.analyze(content)
  if (result.maxSeverity >= SeverityLevel.SEVERE) {
    // Block content entirely
    return null
  } else if (result.hasProfanity) {
    // Filter but allow
    return await detector.filter(content)
  }
  return content
}
`
API Reference
Top-Level Functions
- detect(text: string): Promise resolving to a detailed result (hasProfanity, matches, totalMatches) - Analyze text for naughty words
- filter(text: string, mode?: FilterMode): Promise<string> - Filter naughty words from text
- isClean(text: string): Promise<boolean> - Check if text is squeaky clean
ContentShieldDetector
`typescript
const detector = new ContentShieldDetector(config)
await detector.analyze(text, options) // Full analysis
await detector.isProfane(text) // Boolean check
await detector.filter(text, mode) // Filter text
detector.updateConfig(newConfig) // Update configuration
detector.getConfig() // Get current config
`
DetectorConfig
`typescript
interface DetectorConfig {
  languages: LanguageCode[]        // Languages to detect
  minSeverity: SeverityLevel       // Minimum severity to flag
  categories: ProfanityCategory[]  // Categories to detect
  fuzzyMatching: boolean           // Enable fuzzy matching
  fuzzyThreshold: number           // Fuzzy matching threshold (0-1)
  customWords: CustomWord[]        // Additional words to detect
  whitelist: string[]              // Words to never flag
  detectAlternateScripts: boolean  // Detect in other alphabets
  normalizeText: boolean           // Normalize before detection
  replacementChar: string          // Character for censoring
  preserveStructure: boolean       // Keep word structure when censoring
}
`
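To give a feel for what a 0-1 fuzzyThreshold could mean, here is a minimal sketch that scores two words by normalized Levenshtein similarity and compares the score against a threshold. This is an assumption for illustration only: ContentShield's actual scoring is not documented here, and the helpers levenshtein, similarity, and isFuzzyMatch are hypothetical names, not part of the package.

```typescript
// Hypothetical helpers for illustration; not part of the content-shield API.

// Classic dynamic-programming Levenshtein distance (insert, delete, substitute).
function levenshtein(a: string, b: string): number {
  let prev: number[] = []
  for (let j = 0; j <= b.length; j++) prev.push(j)
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i]
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
    }
    prev = curr
  }
  return prev[b.length]
}

// Similarity in [0, 1]: 1 means identical strings.
function similarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length)
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest
}

// A candidate counts as a match when its similarity reaches the threshold.
function isFuzzyMatch(candidate: string, entry: string, threshold: number): boolean {
  return similarity(candidate.toLowerCase(), entry.toLowerCase()) >= threshold
}

// 'fr4ck' is one substitution away from 'frack' (similarity 0.8).
const strict = isFuzzyMatch('fr4ck', 'frack', 0.9)  // false
const lenient = isFuzzyMatch('fr4ck', 'frack', 0.7) // true
```

Under this reading, a higher threshold means fewer false positives but more missed obfuscations, which is the trade-off the option exposes.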
SeverityLevel
`typescript
enum SeverityLevel {
  LOW = 1,      // Mildly naughty
  MODERATE = 2, // Pretty naughty
  HIGH = 3,     // Very naughty
  SEVERE = 4    // Extremely naughty (the naughtiest!)
}
`
ProfanityCategory
`typescript
enum ProfanityCategory {
  GENERAL = 'general',
  SEXUAL = 'sexual',
  VIOLENCE = 'violence',
  HATE_SPEECH = 'hate_speech',
  DISCRIMINATION = 'discrimination',
  SUBSTANCE_ABUSE = 'substance_abuse',
  RELIGIOUS = 'religious',
  POLITICAL = 'political'
}
`
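Because SeverityLevel is a numeric enum, a minSeverity setting can be read as a simple >= comparison, while categories acts as a set filter. The sketch below shows one way the two could combine. The SketchMatch shape and the selectMatches helper are hypothetical, and treating an empty category list as "all categories" is an assumption, not documented behavior.

```typescript
// Local copies of the enums (trimmed) so the sketch is self-contained.
enum SeverityLevel { LOW = 1, MODERATE = 2, HIGH = 3, SEVERE = 4 }
enum ProfanityCategory {
  GENERAL = 'general',
  VIOLENCE = 'violence',
  HATE_SPEECH = 'hate_speech'
}

// Hypothetical match shape; the library's real result type may differ.
interface SketchMatch {
  word: string
  severity: SeverityLevel
  categories: ProfanityCategory[]
}

// Keep matches at or above minSeverity that share at least one wanted category.
// An empty category list is treated as "all categories" (an assumption).
function selectMatches(
  matches: SketchMatch[],
  minSeverity: SeverityLevel,
  categories: ProfanityCategory[]
): SketchMatch[] {
  return matches.filter(m =>
    m.severity >= minSeverity &&
    (categories.length === 0 || m.categories.some(c => categories.indexOf(c) !== -1))
  )
}

const sample: SketchMatch[] = [
  { word: 'mild', severity: SeverityLevel.LOW, categories: [ProfanityCategory.GENERAL] },
  { word: 'threat', severity: SeverityLevel.HIGH, categories: [ProfanityCategory.VIOLENCE] },
  { word: 'slur', severity: SeverityLevel.SEVERE, categories: [ProfanityCategory.HATE_SPEECH] }
]

// Only the HIGH-severity violence entry passes both filters.
const flagged = selectMatches(sample, SeverityLevel.MODERATE, [ProfanityCategory.VIOLENCE])
```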
Language Support
ContentShield speaks 17 languages fluently (and knows all the naughty words in each):
- 🇺🇸 English (en) - 714 entries
- 🇯🇵 Japanese (ja) - 247 entries
- 🇰🇷 Korean (ko) - 240 entries
- 🇨🇳 Chinese (zh) - 230 entries
- 🇳🇱 Dutch (nl) - 230 entries
- 🇫🇷 French (fr) - 229 entries
- 🇮🇹 Italian (it) - 229 entries
- 🇩🇪 German (de) - 226 entries
- 🇪🇸 Spanish (es) - 221 entries
- 🇵🇹 Portuguese (pt) - 218 entries
- 🇷🇺 Russian (ru) - 215 entries
- 🇵🇱 Polish (pl) - 204 entries
- 🇹🇷 Turkish (tr) - 203 entries
- 🇮🇱 Hebrew (he) - 200 entries
- 🇸🇪 Swedish (sv) - 179 entries
- 🇸🇦 Arabic (ar) - 105 entries
- 🇮🇳 Hindi (hi) - 101 entries
Total: 3,600+ naughty words across all languages 🚫
Use 'auto' for automatic language detection, or specify exact languages for better performance.
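One reason explicit languages can be faster is that automatic detection has to inspect the text before any matching starts. As a rough sketch of that extra step, the hypothetical helper below guesses candidate languages from Unicode script ranges. It is not ContentShield's detection logic, and it cannot tell Latin-script languages (English, Spanish, French, and so on) apart.

```typescript
// Hypothetical helper for illustration; not part of the content-shield API.
// Each entry is [firstCodeUnit, lastCodeUnit, languageCode].
const SCRIPT_RANGES: Array<[number, number, string]> = [
  [0x3040, 0x30ff, 'ja'], // Hiragana and Katakana
  [0xac00, 0xd7a3, 'ko'], // Hangul syllables
  [0x4e00, 0x9fff, 'zh'], // CJK unified ideographs (also used in Japanese)
  [0x0400, 0x04ff, 'ru'], // Cyrillic
  [0x0590, 0x05ff, 'he'], // Hebrew
  [0x0600, 0x06ff, 'ar'], // Arabic
  [0x0900, 0x097f, 'hi']  // Devanagari
]

// Collect each language whose script appears at least once in the text.
function guessLanguagesByScript(text: string): string[] {
  const found: string[] = []
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i)
    for (const [first, last, lang] of SCRIPT_RANGES) {
      if (code >= first && code <= last && found.indexOf(lang) === -1) {
        found.push(lang)
      }
    }
  }
  return found
}

const korean = guessLanguagesByScript('안녕하세요') // ['ko']
const latin = guessLanguagesByScript('hello')      // [] (script alone is not enough)
```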
Custom Words & Whitelisting
`typescript
import { ContentShieldDetector, SeverityLevel, ProfanityCategory } from 'content-shield'
import { EN } from 'content-shield/languages/en'
const detector = new ContentShieldDetector({
  languages: ['en'],
  languageData: { en: EN },
  customWords: [
    {
      word: 'frack', // Custom sci-fi profanity
      language: 'en',
      severity: SeverityLevel.MODERATE,
      categories: [ProfanityCategory.GENERAL],
      variations: ['fr@ck', 'fr4ck'],
      caseSensitive: false
    }
  ],
  whitelist: ['hello', 'world'] // Never flag these words
})
`
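To make the interplay between variations, caseSensitive, and whitelist concrete, here is a simplified sketch of how such a check might work on whitespace-separated tokens. The SketchCustomWord shape and the findCustomMatches helper are hypothetical, and the rule that the whitelist wins over custom words is an assumption based on the "never flag" comment above.

```typescript
// Hypothetical, simplified shapes; not the library's real types.
interface SketchCustomWord {
  word: string
  variations: string[]
  caseSensitive: boolean
}

// Return the tokens that equal a custom word or one of its variations,
// skipping anything on the whitelist. Punctuation handling is omitted.
function findCustomMatches(
  text: string,
  customWords: SketchCustomWord[],
  whitelist: string[]
): string[] {
  const allowed = whitelist.map(w => w.toLowerCase())
  const hits: string[] = []
  for (const token of text.split(/\s+/)) {
    if (token === '' || allowed.indexOf(token.toLowerCase()) !== -1) continue
    for (const entry of customWords) {
      const forms = [entry.word].concat(entry.variations)
      const matched = forms.some(form =>
        entry.caseSensitive ? form === token : form.toLowerCase() === token.toLowerCase()
      )
      if (matched) {
        hits.push(token)
        break
      }
    }
  }
  return hits
}

const frackEntry: SketchCustomWord = {
  word: 'frack',
  variations: ['fr@ck', 'fr4ck'],
  caseSensitive: false
}

// The obfuscated, upper-case variation is caught; whitelisted words are skipped.
const customHits = findCustomMatches('hello FR@CK world', [frackEntry], ['hello', 'world']) // ['FR@CK']
```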
Filter Modes
Choose how to handle those naughty words:
`typescript
enum FilterMode {
  CENSOR = 'censor',          // Replace with the censor character (e.g. f***)
  REMOVE = 'remove',          // Remove entirely (poof!)
  REPLACE = 'replace',        // Replace with [filtered]
  DETECT_ONLY = 'detect_only' // Just detect, don't modify
}
`
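The four modes can be pictured as four ways of rewriting a matched word. The sketch below applies each mode to a single known word. The applyMode helper is hypothetical, and keeping the first letter and using '*' in CENSOR mode are assumptions drawn from the example comment above, not confirmed library behavior.

```typescript
// Local copy of the enum so the sketch is self-contained.
enum FilterMode {
  CENSOR = 'censor',
  REMOVE = 'remove',
  REPLACE = 'replace',
  DETECT_ONLY = 'detect_only'
}

// Hypothetical helper: rewrite every occurrence of `word` according to the mode.
function applyMode(text: string, word: string, mode: FilterMode): string {
  if (mode === FilterMode.DETECT_ONLY || word === '') return text
  let replacement = '' // REMOVE: drop the word entirely
  if (mode === FilterMode.CENSOR) {
    // Keep the first letter, mask the remaining word.length - 1 characters.
    replacement = word.charAt(0) + new Array(word.length).join('*')
  } else if (mode === FilterMode.REPLACE) {
    replacement = '[filtered]'
  }
  return text.split(word).join(replacement)
}

const input = 'what the frack'
const censoredText = applyMode(input, 'frack', FilterMode.CENSOR)        // 'what the f****'
const removedText = applyMode(input, 'frack', FilterMode.REMOVE)         // 'what the '
const replacedText = applyMode(input, 'frack', FilterMode.REPLACE)       // 'what the [filtered]'
const untouchedText = applyMode(input, 'frack', FilterMode.DETECT_ONLY)  // 'what the frack'
```

Note that plain substring replacement would also rewrite innocent words that merely contain the match, so a real filter needs word-boundary awareness.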
Language-Specific Detectors
`typescript
import {
  createEnglishDetector,
  createSpanishDetector,
  createMultiLanguageDetector
} from 'content-shield'
import { EN, ES, FR } from 'content-shield/languages'

const englishOnly = createEnglishDetector({ languageData: { en: EN } })
const spanishOnly = createSpanishDetector({ languageData: { es: ES } })
const multiLang = createMultiLanguageDetector(
  ['en', 'es', 'fr'],
  { languageData: { en: EN, es: ES, fr: FR } }
)
`
Performance
ContentShield delivers exceptional performance:
- Detection Speed: ~14,000 words/second
- Large Text Matching: ~15ms for 10,000 words
- Batch Processing: 555,556 words/second throughput
- Memory Efficiency: ~226 bytes per word in trie structure
- Fuzzy Matching: Configurable threshold for speed vs accuracy
- Language Detection: Fast language identification
- Tree-shaking: Import only what you need
- Caching: Intelligent caching with sub-millisecond cached lookups
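For readers curious why SymSpell-style fuzzy matching is fast, the core idea is symmetric deletion: precompute every "one character deleted" variant of each dictionary entry, then at lookup time generate the same variants for the input and do plain hash lookups. The sketch below shows that idea for edit distance 1. It is an illustration only, not ContentShield's implementation, and all function names are hypothetical.

```typescript
// Hypothetical sketch of the symmetric-delete idea; not content-shield's code.
type DeleteIndex = { [variant: string]: string[] }

// The word itself plus every string made by deleting exactly one character.
function deletesWithinOne(word: string): string[] {
  const out: string[] = [word]
  for (let i = 0; i < word.length; i++) {
    const shorter = word.slice(0, i) + word.slice(i + 1)
    if (out.indexOf(shorter) === -1) out.push(shorter)
  }
  return out
}

// Precompute: map each delete variant back to the entries that produce it.
function buildDeleteIndex(dictionary: string[]): DeleteIndex {
  const index: DeleteIndex = Object.create(null)
  for (const entry of dictionary) {
    for (const variant of deletesWithinOne(entry)) {
      if (!index[variant]) index[variant] = []
      index[variant].push(entry)
    }
  }
  return index
}

// Lookup: collect every entry that shares a delete variant with the input.
// These are candidates; a full implementation verifies each with a real
// edit-distance check before reporting a match.
function lookupCandidates(index: DeleteIndex, input: string): string[] {
  const found: string[] = []
  for (const variant of deletesWithinOne(input)) {
    for (const entry of index[variant] || []) {
      if (found.indexOf(entry) === -1) found.push(entry)
    }
  }
  return found
}

const deleteIndex = buildDeleteIndex(['frack', 'darn'])
const missingLetter = lookupCandidates(deleteIndex, 'frak')  // ['frack']
const swappedLetter = lookupCandidates(deleteIndex, 'fr4ck') // ['frack']
const unrelated = lookupCandidates(deleteIndex, 'hello')     // []
```

The lookup never generates insertions or substitutions, only deletions, which keeps the number of variants per input small.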
Development
`bash
# Install dependencies
npm install
# Build the library
npm run build
# Run tests
npm test
# Run with coverage
npm run test:coverage
# Lint code
npm run lint
# Format code
npm run format
`
Contributing
Contributions are welcome! This project benefits from community input. Here's how you can help:
Ways to Contribute
- Add new languages - Help expand multi-language support
- Improve detection accuracy - Suggest new words or fix false positives
- Performance optimizations - Make the library even faster
- Documentation - Improve examples, guides, and API docs
- Bug reports - Found an issue? Let us know!
- Feature requests - Have an idea? Open a discussion!
Pull Request Workflow
1. Fork the repository
2. Create a feature branch (git checkout -b feature/amazing-feature)
3. Make your changes
4. Add tests for new functionality
5. Run the test suite (npm test)
6. Commit your changes (git commit -m 'Add amazing feature')
7. Push to the branch (git push origin feature/amazing-feature)
8. Submit a pull request
Adding a New Language
Adding a new language? Great! Here's what we need:
- Profanity entries in /data/languages/{code}/profanity.json