LLM Guard
!
LLM Guard Logo
Secure your LLM prompts with confidence
A TypeScript library for validating and securing LLM prompts. This package provides various guards to protect against common LLM vulnerabilities and misuse.





Features
- Validate LLM prompts for various security concerns
- Support for multiple validation rules:
- PII detection
- Jailbreak detection
- Profanity filtering
- Prompt injection detection
- Relevance checking
- Toxicity detection
- Batch validation support
- CLI interface
- TypeScript support
Installation
``
bash
npm install llm-guard
`
Usage
$3
`
typescript
import { LLMGuard } from 'llm-guard';
const guard = new LLMGuard({
pii: true,
jailbreak: true,
profanity: true,
promptInjection: true,
relevance: true,
toxicity: true
});
// Single prompt validation
const result = await guard.validate('Your prompt here');
console.log(result);
// Batch validation
const batchResult = await guard.validateBatch([
'First prompt',
'Second prompt'
]);
console.log(batchResult);
`
$3
`
bash
Basic usage
npx llm-guard "Your prompt here"
With specific guards enabled
npx llm-guard --pii --jailbreak "Your prompt here"
With a config file
npx llm-guard --config config.json "Your prompt here"
Batch mode
npx llm-guard --batch '["First prompt", "Second prompt"]'
Show help
npx llm-guard --help
`
Configuration
You can configure which validators to enable when creating the LLMGuard instance:
`
typescript
const guard = new LLMGuard({
pii: true, // Enable PII detection
jailbreak: true, // Enable jailbreak detection
profanity: true, // Enable profanity filtering
promptInjection: true, // Enable prompt injection detection
relevance: true, // Enable relevance checking
toxicity: true, // Enable toxicity detection
customRules: { // Add custom validation rules
// Your custom rules here
},
relevanceOptions: { // Configure relevance guard options
minLength: 10, // Minimum text length
maxLength: 5000, // Maximum text length
minWords: 3, // Minimum word count
maxWords: 1000 // Maximum word count
}
});
``
Available Guards
$3
Detects personally identifiable information like emails, phone numbers, SSNs, credit card numbers, and IP addresses.
$3
Filters profanity and offensive language, including common character substitutions (like using numbers for letters).
$3
Detects attempts to bypass AI safety measures and ethical constraints, such as "ignore previous instructions" or "pretend you are".
$3
Identifies attempts to inject malicious instructions or override system prompts, including system prompt references and memory reset attempts.
$3
Evaluates the relevance and quality of the prompt based on length, word count, filler words, and repetitive content.
$3
Detects toxic, harmful, or aggressive content, including hate speech, threats, and discriminatory language.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request on GitHub. We appreciate any help with:
- Bug fixes
- New features
- Documentation improvements
- Code quality enhancements
- Test coverage
- Performance optimizations
$3
1. Fork the repository on GitHub
2. Create a new branch for your feature or bugfix
3. Make your changes
4. Write or update tests as needed
5. Ensure all tests pass
6. Submit a Pull Request with a clear description of the changes
For more complex changes, please open an issue first to discuss the proposed changes.
Documentation
For more detailed documentation, visit our
documentation site.