A lightweight package to detect and filter profanity, especially Indian bad words.
It supports `alwaysAllow` and `alwaysBlock` word lists.

## Installation

```sh
npm install cleanword
```
---
## Usage

### JavaScript
```js
const { cleanText } = require('cleanword');

const options = {
  language: ['english', 'hindi'], // word lists to check
  grawlixChar: '@',               // character used to mask censored words
  alwaysAllow: ['kutto'],         // never censored, even if listed as abusive
  alwaysBlock: ['test', 'what'],  // always censored, even if not abusive
};

const cleaned = cleanText('This is a test sentence with kutto and what.', options);
console.log(cleaned); // This is a @@@@ sentence with kutto and @@@@.
```

### TypeScript
```ts
import { cleanText } from 'cleanword';

// Local type describing the options object used in this example.
interface CleanTextOptions {
  language: string[];
  grawlixChar: string;
  alwaysAllow: string[];
  alwaysBlock: string[];
}

const options: CleanTextOptions = {
  language: ['english', 'hindi'],
  grawlixChar: '@',
  alwaysAllow: ['kutto'],
  alwaysBlock: ['test', 'what'],
};

const cleaned: string = cleanText('This is a test sentence with kutto and what.', options);
console.log(cleaned); // This is a @@@@ sentence with kutto and @@@@.
```

### API
#### cleanText(text, options)

- `text` (string): The input string to clean.
- `options` (object, optional):
  - `language` (string | string[]): Language(s) to check (default: `'hindi'`).
  - `grawlixChar` (string): Character to use for censorship (default: `'*'`).
  - `alwaysAllow` (string[]): Words that should never be censored, even if abusive.
  - `alwaysBlock` (string[]): Words that should always be censored, even if not abusive.
  - `customAbuseSet` (Set): Custom set of abusive words (for advanced use/testing).

Returns: The cleaned string with abusive words replaced by the grawlix character.
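
To make the interaction between these options concrete, here is a minimal standalone sketch. It is not cleanword's implementation: `maskWords` is a hypothetical function, the rule that `alwaysAllow` takes priority over every other list is an assumption, and so is masking one grawlix character per letter. The `customAbuseSet` here stands in for the package's built-in word lists.

```javascript
// Hypothetical sketch, NOT cleanword's actual code.
// It only illustrates how the documented options could interact.
function maskWords(text, options = {}) {
  const {
    grawlixChar = '*',          // documented default
    alwaysAllow = [],
    alwaysBlock = [],
    customAbuseSet = new Set(), // stand-in for the built-in word lists
  } = options;

  const allow = new Set(alwaysAllow.map((w) => w.toLowerCase()));
  const block = new Set(alwaysBlock.map((w) => w.toLowerCase()));

  // Visit each run of letters, combining marks, and digits;
  // spaces and punctuation pass through untouched.
  return text.replace(/[\p{L}\p{M}\p{N}]+/gu, (word) => {
    const key = word.toLowerCase();
    if (allow.has(key)) return word; // assumption: alwaysAllow wins
    if (block.has(key) || customAbuseSet.has(key)) {
      // assumption: one grawlix character per UTF-16 code unit of the word
      return grawlixChar.repeat(word.length);
    }
    return word;
  });
}

console.log(
  maskWords('This is a test sentence with kutto and what.', {
    grawlixChar: '@',
    alwaysAllow: ['kutto'],
    alwaysBlock: ['test', 'what'],
    customAbuseSet: new Set(['kutto']),
  })
);
// This is a @@@@ sentence with kutto and @@@@.
```

In this sketch `kutto` survives even though it appears in the abuse set, because the allow list is checked first.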

## Config Options

| Option | Type | Description |
|----------------|----------------|-------------|
| language | string/string[] | Languages to check (e.g. 'hindi', 'english', 'bengali', 'urdu') |
| grawlixChar | string | Character to use for censorship (default: '*') |
| alwaysAllow | string[] | Words to never censor |
| alwaysBlock | string[] | Words to always censor |
| customAbuseSet | Set | Custom abusive word set (advanced/testing) |
---
## Supported Languages

- Hindi
- English
- Assamese
- Bengali
- Bhojpuri
- Marathi
- Chhattisgarhi
- Gujarati
- Haryanvi
- Kannada
- Kashmiri
- Konkani
- Ladakhi
- Malayalam
- Manipuri
- Marwari
- Nepali
- Odia
- Punjabi
- Rajasthani
- Tamil
- Telugu
- Urdu

You can specify one or more languages using the `language` option. Example:

```js
cleanText('some text', { language: ['hindi', 'english'] });
```
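
If language names come from user input, it may help to normalize them before passing them on. The sketch below assumes the accepted values are the lowercase forms of the names listed above (the README only shows `'hindi'`, `'english'`, `'bengali'`, and `'urdu'` explicitly). `normalizeLanguages` and `SUPPORTED_LANGUAGES` are hypothetical names, not part of cleanword.

```javascript
// Assumption: option values are the lowercase forms of the listed names.
const SUPPORTED_LANGUAGES = new Set([
  'hindi', 'english', 'assamese', 'bengali', 'bhojpuri', 'marathi',
  'chhattisgarhi', 'gujarati', 'haryanvi', 'kannada', 'kashmiri', 'konkani',
  'ladakhi', 'malayalam', 'manipuri', 'marwari', 'nepali', 'odia',
  'punjabi', 'rajasthani', 'tamil', 'telugu', 'urdu',
]);

// Hypothetical helper (not part of cleanword): accepts a string or an array,
// as the `language` option does, and keeps only recognised, de-duplicated names.
function normalizeLanguages(language = 'hindi') {
  const requested = Array.isArray(language) ? language : [language];
  const cleaned = requested.map((name) => String(name).trim().toLowerCase());
  return [...new Set(cleaned)].filter((name) => SUPPORTED_LANGUAGES.has(name));
}

console.log(normalizeLanguages(['Hindi', 'english', 'klingon', 'hindi']));
// [ 'hindi', 'english' ]
```

The resulting array could then be passed as the `language` option to `cleanText`.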
---
## Contributing

1. Fork this repository and clone your fork.
2. Install dependencies:
   ```sh
   npm install
   ```
3. Add or improve abusive word lists in src/abuse_words.js.
4. Add or update tests in Test/cleanText.test.js.
5. Run tests:
   ```sh
   npm test
   ```