Enterprise-grade anomaly detection and observability platform for LLM applications - NPM CLI wrapper
npm install @llm-dev-ops/sentinel-cliEnterprise-Grade Anomaly Detection and Observability for LLM Applications


![Build]()


LLM-Sentinel is a production-ready, real-time anomaly detection and observability platform designed specifically for Large Language Model (LLM) applications. Built in Rust for maximum performance and reliability, it provides comprehensive monitoring, statistical anomaly detection, automated alerting, and deep observability to ensure the reliability, security, and cost-effectiveness of your LLM deployments.
Status: β Production Ready - Full implementation with enterprise-grade deployment infrastructure
- Key Features
- Detection Capabilities
- Deployment Options
- Quick Start
- Architecture
- Observability
- API Reference
- Example Producers
- Configuration
- Performance
- Contributing
- Z-Score Detection: Parametric anomaly detection for normally distributed metrics with configurable thresholds (default: 3.0Ο)
- IQR Detection: Non-parametric outlier detection using interquartile range (default: 1.5x multiplier)
- MAD Detection: Robust outlier detection using median absolute deviation (default: 3.5 threshold)
- CUSUM Detection: Cumulative sum change point detection for drift and regime shifts (default: 5.0 threshold, 0.5 drift)
- Multi-Dimensional Baselines: Per-service, per-model statistical baselines with automatic updates
- Configurable Sensitivity: Tune detection sensitivity for your specific use cases
Monitor all critical LLM metrics:
- Latency Spikes: Detect unusual response times (P50, P95, P99)
- Token Usage Anomalies: Monitor prompt and completion token consumption patterns
- Cost Anomalies: Track unexpected spending patterns and budget overruns
- Error Rate Spikes: Identify service degradation and failures
- Model Drift: Detect quality degradation over time
- Usage Patterns: Identify suspicious or abnormal usage behavior
- Throughput Changes: Monitor request rate variations
- 10,000+ events/second ingestion throughput
- Sub-5ms P50 latency for anomaly detection
- Lock-free concurrent baseline updates using DashMap
- Batch processing for optimal InfluxDB writes (100-record batches)
- Zero-copy parsing for maximum efficiency
- Async/await throughout for non-blocking I/O
- Memory-efficient streaming processing
- Zero unsafe code: Memory safety guaranteed by Rust's type system
- Comprehensive error handling: Type-safe Result propagation with detailed error context
- Graceful shutdown: Proper signal handling (SIGTERM, SIGINT) with resource cleanup
- Health checks: Liveness, readiness, and startup probes for Kubernetes
- Circuit breakers: Automatic failure detection and recovery
- Exponential backoff: Intelligent retry logic for transient failures
- Connection pooling: Efficient resource management
- RabbitMQ Integration: Topic-based routing with severity levels (info, warning, critical)
- Webhook Delivery: HTTP POST with HMAC-SHA256 signatures for verification
- Alert Deduplication: Configurable 5-minute window to prevent alert storms
- Retry Logic: Exponential backoff with configurable max attempts (default: 3)
- Priority Routing: Route critical alerts to different channels
- Batch Alerting: Optional batching for high-volume scenarios
- InfluxDB v3: Time-series storage for telemetry with automatic downsampling
- Moka Cache: High-performance in-memory cache (10,000 entry capacity)
- Redis Support: Distributed caching for multi-instance deployments
- Persistent Baselines: Baseline persistence to disk for quick restarts
- Query API: REST endpoints for historical data retrieval and analysis
- TTL Management: Automatic expiration of stale data (300s default)
- 50+ Prometheus Metrics: Comprehensive instrumentation of all subsystems
- 4 Pre-built Grafana Dashboards:
- Anomaly Detection Dashboard
- System Health Dashboard
- Performance Metrics Dashboard
- Alert Overview Dashboard
- 50+ Alert Rules: Production-ready Prometheus alerting covering all failure modes
- Distributed Tracing: OpenTelemetry support for request tracing
- Structured Logging: JSON logs with configurable levels (trace, debug, info, warn, error)
- Kubernetes-Ready: Production manifests with HPA, PDB, and network policies
- Helm Chart: Parameterized chart for easy deployment and upgrades
- Docker Images: Multi-stage builds with minimal attack surface (<50MB)
- Horizontal Scaling: Support for 3-10+ replicas with auto-scaling
- StatefulSet Support: Optional for baseline persistence
- Service Mesh Compatible: Works with Istio, Linkerd, Consul Connect
- Non-Root Containers: Runs as UID 1000 with dropped capabilities
- Read-Only Filesystem: Root filesystem mounted read-only
- Network Policies: Restrict ingress/egress to required services only
- Secret Management: Support for Kubernetes secrets and external secret stores
- PII Sanitization: Automatic detection and removal of sensitive data
- Audit Logging: Complete audit trail of all anomalies and alerts
- SBOM Generation: Software Bill of Materials for vulnerability tracking
LLM-Sentinel provides four complementary detection algorithms that can be enabled individually or in combination:
#### 1. Z-Score Detection (Parametric)
Best for: Metrics that follow a normal distribution
``yaml`
detection:
zscore:
threshold: 3.0 # Standard deviations from mean
sensitivity: "medium" # low, medium, high
metrics:
- latency_ms
- total_tokens
- cost_usd
Use Cases:
- Latency spike detection
- Token usage monitoring
- Cost anomaly detection
#### 2. IQR Detection (Non-Parametric)
Best for: Metrics with skewed distributions or outliers
`yaml`
detection:
iqr:
multiplier: 1.5 # IQR multiplier for outliers
metrics:
- latency_ms
- total_tokens
Use Cases:
- Robust outlier detection
- Handling non-normal distributions
- Resistant to extreme values
#### 3. MAD Detection (Robust)
Best for: Metrics requiring high robustness to outliers
`yaml`
detection:
mad:
threshold: 3.5 # MAD threshold
metrics:
- latency_ms
Use Cases:
- Ultra-robust detection
- Minimal false positives
- Gradual baseline shifts
#### 4. CUSUM Detection (Change Point)
Best for: Detecting sustained changes and drift
`yaml`
detection:
cusum:
threshold: 5.0 # Cumulative threshold
drift: 0.5 # Drift parameter
metrics:
- latency_ms
Use Cases:
- Model performance degradation
- Service quality drift
- Gradual system changes
- Adaptive Baselines: Automatic baseline updates every 60 seconds
- Multi-Dimensional: Separate baselines per service, model, and metric
- Configurable Window: 1000-sample sliding window (configurable)
- Minimum Samples: Require 10+ samples before detection (prevents cold-start false positives)
- Persistence: Save/load baselines from disk for fast restarts
Perfect for local development and testing:
`bashStart full environment (Kafka, InfluxDB, RabbitMQ, Redis, Prometheus, Grafana)
docker-compose up -d
Includes:
- Sentinel (3 replicas)
- Kafka cluster (3 brokers)
- InfluxDB v3
- RabbitMQ with management UI
- Redis
- Prometheus
- Grafana with pre-loaded dashboards
$3
Production-grade deployment with all manifests:
`bash
Apply all manifests
kubectl apply -f k8s/Or using kustomize
kubectl apply -k k8s/Check status
kubectl get pods -l app.kubernetes.io/name=sentinel
`Includes:
- Deployment with 3 replicas
- HorizontalPodAutoscaler (3-10 replicas, CPU/memory targets)
- PodDisruptionBudget (min 2 available)
- NetworkPolicy (restricted ingress/egress)
- ServiceMonitor (Prometheus integration)
- Ingress with TLS
$3
Easiest production deployment with full parameterization:
`bash
Install with default values
helm install sentinel ./helm/sentinel \
--set secrets.influxdbToken="your-token" \
--set secrets.rabbitmqPassword="your-password"Install with custom values
helm install sentinel ./helm/sentinel -f production-values.yamlUpgrade
helm upgrade sentinel ./helm/sentinelUninstall
helm uninstall sentinel
`Helm Chart Features:
- Full configuration parameterization (335+ options)
- External secrets support (AWS Secrets Manager, Vault, etc.)
- Multiple environment profiles (dev, staging, prod)
- Init containers for dependency checking
- Automatic ConfigMap/Secret generation
- Post-install notes with helpful commands
See Helm Chart README for complete documentation.
$3
Direct installation on Linux servers:
`bash
Build release binary
cargo build --releaseInstall
sudo cp target/release/sentinel /usr/local/bin/Create config
sudo mkdir -p /etc/sentinel
sudo cp config/sentinel.yaml /etc/sentinel/Create systemd service
sudo cp deployments/systemd/sentinel.service /etc/systemd/system/
sudo systemctl enable sentinel
sudo systemctl start sentinel
`Quick Start
$3
- Rust 1.75+ (for building from source)
- Kafka 2.8+ (message queue)
- InfluxDB v3 (time-series database)
- RabbitMQ 3.8+ (optional, for alerting)
- Redis 6.0+ (optional, for distributed caching)
$3
`bash
Clone repository
git clone https://github.com/llm-devops/llm-sentinel.git
cd llm-sentinelStart infrastructure
docker-compose up -d kafka influxdb rabbitmq redisBuild project
cargo build --releaseRun tests
cargo test --workspaceStart service
export INFLUXDB_TOKEN="your-token"
export RABBITMQ_PASSWORD="your-password"
./target/release/sentinel --config config/sentinel.yaml
`$3
`bash
Create namespace
kubectl create namespace sentinelCreate secrets
kubectl create secret generic sentinel-secrets \
--from-literal=influxdb-token="your-token" \
--from-literal=rabbitmq-password="your-password" \
-n sentinelDeploy with Helm
helm install sentinel ./helm/sentinel \
--namespace sentinel \
--set secrets.influxdbToken="your-token" \
--set secrets.rabbitmqPassword="your-password"Verify deployment
kubectl get pods -n sentinel
kubectl logs -f deployment/sentinel -n sentinelAccess dashboards
kubectl port-forward svc/sentinel 8080:8080 -n sentinel
open http://localhost:8080/metrics
`$3
`bash
Check health
curl http://localhost:8080/health/live
curl http://localhost:8080/health/readyView metrics
curl http://localhost:8080/metricsQuery recent anomalies
curl http://localhost:8080/api/v1/anomalies/recent?limit=10Query telemetry
curl "http://localhost:8080/api/v1/telemetry?service=chat-api&hours=1"
`Architecture
$3
`
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β LLM Applications β
β ββββββββββββ ββββββββββββ ββββββββββββ ββββββββββββ β
β β OpenAI β β Claude β β Llama β β Custom β β
β β API β β API β β API β β LLM API β β
β ββββββ¬ββββββ ββββββ¬ββββββ ββββββ¬ββββββ ββββββ¬ββββββ β
βββββββββΌββββββββββββββΌββββββββββββββΌβββββββββββββββΌβββββββββββββββββ
β β β β
βββββββββββββββ΄ββββββββββββββ΄βββββββββββββββ
β
βΌ (Telemetry Events)
βββββββββββββββββββββββββββββββ
β Apache Kafka β
β (llm.telemetry topic) β
βββββββββββββββ¬ββββββββββββββββ
β
βΌ
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β LLM-SENTINEL β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β Ingestion Layer (sentinel-ingestion) β β
β β ββββββββββββββββ ββββββββββββββββ βββββββββββββββββ β β
β β β Kafka ββββ>β OTLP/JSON ββββ>β Validation β β β
β β β Consumer β β Parsing β β & PII Filter β β β
β β ββββββββββββββββ ββββββββββββββββ βββββββββ¬ββββββββ β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββΌββββββββββ β
β β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β Detection Engine (sentinel-detection) β β
β β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β β
β β β Baseline Manager (Multi-Dimensional) β β β
β β β Per Service Γ Model Γ Metric Statistical Baselines β β β
β β βββββββββββββββββββββββ¬βββββββββββββββββββββββββββββββββ β β
β β β β β
β β βββββββββββββ βββββββββββββ βββββββββββββ βββββββββββββ β
β β β Z-Score β β IQR β β MAD β β CUSUM ββ β
β β β Detector β β Detector β β Detector β β Detector ββ β
β β βββββββ¬ββββββ βββββββ¬ββββββ βββββββ¬ββββββ ββββββ¬βββββββ β
β β ββββββββββββββββ΄ββββββββββββββββ΄βββββββββββββββ β β
β β β (Anomalies) β β
β ββββββββββββββββββββββββββΌββββββββββββββββββββββββββββββββββββββ β
β β β
β ββββββββββββββββββββββββββΌββββββββββββββββββββββββββββββββββββββ β
β β βΌ β β
β β ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ β β
β β β InfluxDB β β RabbitMQ β β Webhook β β β
β β β Storage β β Alerting β β Delivery β β β
β β β β β β β β β β
β β β β’ Telemetry β β β’ Topic β β β’ HMAC β β β
β β β β’ Anomalies β β Routing β β Signature β β β
β β β β’ Query API β β β’ Severity β β β’ Retry β β β
β β ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ β β
β β (sentinel-storage) (sentinel-alerting) β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β REST API (sentinel-api) β β
β β ββββββββββββββ ββββββββββββββ ββββββββββββββ β β
β β β Health β β Metrics β β Query β β β
β β β /health/ β β /metrics β β /api/v1/ β β β
β β ββββββββββββββ ββββββββββββββ ββββββββββββββ β β
β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β
βΌ βΌ
ββββββββββββββββββββ ββββββββββββββββββββ
β Prometheus β β Grafana β
β (Metrics) β β (Dashboards) β
ββββββββββββββββββββ ββββββββββββββββββββ
`$3
#### sentinel-core
- Error types and handling
- Configuration models
- Telemetry event types
- Shared utilities
#### sentinel-ingestion
- Kafka consumer with group management
- OTLP/JSON parsing
- Schema validation
- PII detection and sanitization
- Configurable message handling
#### sentinel-detection
- Statistical baseline management
- Four detection algorithms (Z-Score, IQR, MAD, CUSUM)
- Multi-dimensional baseline tracking
- Lock-free concurrent updates
- Baseline persistence
#### sentinel-storage
- InfluxDB v3 client with batch writes
- In-memory cache (Moka)
- Redis distributed cache
- Query API for historical data
- Automatic TTL management
#### sentinel-alerting
- RabbitMQ topic publisher
- Webhook HTTP delivery
- Alert deduplication (5-minute window)
- Exponential backoff retry
- HMAC signature generation
#### sentinel-api
- REST API server (Axum)
- Health check endpoints
- Prometheus metrics exporter
- Query endpoints for telemetry and anomalies
- CORS support
Observability
$3
Four production-ready dashboards included in
deployments/grafana/dashboards/:#### 1. Anomaly Detection Dashboard
- Anomaly detection rate by severity (timeseries)
- Total anomaly rate gauge
- Anomalies in last 24h (stat)
- Anomaly types distribution (bar chart)
- Top services by anomalies
- Detection latency percentiles (P50, P95, P99)
#### 2. System Health Dashboard
- Service status (up/down)
- Active instance count
- Uptime tracking
- Memory usage by instance
- CPU usage by instance
- Event processing throughput
- Error rates by component
#### 3. Performance Metrics Dashboard
- Events processed per second
- Detection P95 latency
- Cache hit rate percentage
- Pipeline throughput (Kafka β Detection β Storage)
- Storage write rate
- API request rate
#### 4. Alert Overview Dashboard
- Alerts sent (last hour)
- Alerts deduplicated (last hour)
- Alert failures (last hour)
- RabbitMQ publishes by severity
- Webhook delivery success rate
- Alert distribution by type
$3
50+ metrics exported at
/metrics:Ingestion Metrics:
-
sentinel_events_received_total - Total events consumed from Kafka
- sentinel_events_processed_total - Successfully processed events
- sentinel_ingestion_errors_total - Ingestion errors by type
- sentinel_validation_failures_total - Validation failures
- sentinel_kafka_messages_consumed_total - Kafka messages consumed
- sentinel_kafka_consumption_errors_total - Kafka errorsDetection Metrics:
-
sentinel_anomalies_detected_total - Anomalies by severity and detector
- sentinel_detection_latency_seconds - Detection latency histogram
- sentinel_detection_errors_total - Detection errors
- sentinel_baseline_updates_total - Baseline update count
- sentinel_baseline_samples - Current baseline sample countsStorage Metrics:
-
sentinel_storage_writes_total - Successful storage writes
- sentinel_storage_errors_total - Storage errors
- sentinel_cache_hits_total - Cache hits
- sentinel_cache_misses_total - Cache misses
- sentinel_cache_size - Current cache sizeAlerting Metrics:
-
sentinel_alerts_sent_total - Alerts sent by channel
- sentinel_alerts_deduplicated_total - Deduplicated alerts
- sentinel_alert_failures_total - Alert delivery failures
- sentinel_rabbitmq_publishes_total - RabbitMQ publishes
- sentinel_webhook_deliveries_total - Webhook deliveries
- sentinel_webhook_failures_total - Webhook failures$3
50+ production-ready alert rules in
deployments/prometheus/alerts/sentinel-alerts.yaml:Service Health:
- SentinelServiceDown (1m down)
- SentinelHighRestartRate (frequent restarts)
- SentinelInsufficientInstances (<2 instances)
Performance:
- SentinelHighLatency (P95 >50ms)
- SentinelVeryHighLatency (P95 >100ms)
- SentinelLowThroughput (<100 events/sec)
Errors:
- SentinelHighErrorRate (>10 errors/sec)
- SentinelIngestionErrors (>5 errors/sec)
- SentinelDetectionErrors (>5 errors/sec)
- SentinelStorageErrors (>5 errors/sec)
Resources:
- SentinelHighMemoryUsage (>1.5GB)
- SentinelCriticalMemoryUsage (>1.8GB)
- SentinelHighCPUUsage (>80%)
Anomalies:
- SentinelAnomalySpike (>50 anomalies/sec)
- SentinelCriticalAnomalies (>5 critical/sec)
- SentinelNoAnomalies (0 anomalies for 6h - detection health check)
API Reference
$3
#### Liveness Probe
`bash
GET /health/liveResponse: 200 OK
{
"status": "healthy",
"timestamp": "2024-11-06T10:30:00Z"
}
`#### Readiness Probe
`bash
GET /health/readyResponse: 200 OK (when ready to accept traffic)
Response: 503 Service Unavailable (when not ready)
`$3
#### Prometheus Metrics
`bash
GET /metricsResponse: 200 OK
HELP sentinel_events_processed_total Total events processed
TYPE sentinel_events_processed_total counter
sentinel_events_processed_total{service="chat-api"} 12453
...
`$3
#### Query Recent Telemetry
`bash
GET /api/v1/telemetry?service={service}&model={model}&hours={hours}Example:
GET /api/v1/telemetry?service=chat-api&model=gpt-4&hours=24
Response: 200 OK
{
"data": [...],
"count": 1523,
"timeRange": {
"start": "2024-11-05T10:30:00Z",
"end": "2024-11-06T10:30:00Z"
}
}
`#### Query Anomalies
`bash
GET /api/v1/anomalies?severity={severity}&hours={hours}&limit={limit}Example:
GET /api/v1/anomalies?severity=critical&hours=1&limit=50
Response: 200 OK
{
"anomalies": [
{
"id": "anom-123",
"timestamp": "2024-11-06T10:25:00Z",
"service": "chat-api",
"model": "gpt-4",
"detector": "zscore",
"metric": "latency_ms",
"value": 15234.5,
"baseline_mean": 1234.5,
"baseline_stddev": 234.2,
"z_score": 59.8,
"severity": "critical"
}
],
"count": 3
}
`#### Query Recent Anomalies
`bash
GET /api/v1/anomalies/recent?limit={limit}Example:
GET /api/v1/anomalies/recent?limit=10
Response: 200 OK (last 10 anomalies)
`#### Query Baseline Statistics
`bash
GET /api/v1/baselines?service={service}&model={model}Response: 200 OK
{
"baselines": [
{
"key": "chat-api:gpt-4:latency_ms",
"samples": 1000,
"mean": 1234.5,
"stddev": 234.2,
"median": 1205.3,
"p95": 1687.2,
"p99": 2103.4
}
]
}
`Example Producers
$3
Full-featured Python example for sending telemetry to Kafka:
`bash
cd examples/pythonInstall dependencies
pip install -r requirements.txtRun with defaults (20 normal + 5 anomalous events)
python producer.py --brokers localhost:9092Run continuously
python producer.py --continuousCustom configuration
python producer.py \
--brokers kafka-0:9092,kafka-1:9092 \
--topic llm.telemetry \
--normal-events 50 \
--anomalous-events 10
`Features:
- Configurable event generation
- Simulates 4 anomaly types (high latency, high tokens, high cost, suspicious patterns)
- Kafka integration with guaranteed delivery
- Continuous mode for load testing
$3
High-performance Go example for production use:
`bash
cd examples/goBuild
go build -o producer producer.goRun with defaults
./producer -brokers localhost:9092Run continuously
./producer -continuousCustom configuration
./producer \
-brokers kafka-0:9092,kafka-1:9092 \
-topic llm.telemetry \
-normal-events 100 \
-anomalous-events 20
`Features:
- Native Go performance (10,000+ events/sec)
- Graceful shutdown (SIGINT/SIGTERM)
- Connection pooling and retries
- Minimal memory footprint (<20MB)
$3
Integrate telemetry into your LLM application:
`python
from llm_sentinel import TelemetryProducerInitialize
producer = TelemetryProducer(
brokers=["kafka:9092"],
topic="llm.telemetry"
)After each LLM API call
event = producer.create_telemetry_event(
service_name="my-chatbot",
model_name="gpt-4",
latency_ms=response_time,
prompt_tokens=completion.usage.prompt_tokens,
completion_tokens=completion.usage.completion_tokens,
cost_usd=calculated_cost,
user_id=user.id,
session_id=session.id
)producer.send_event(event)
`Configuration
$3
`yaml
Ingestion configuration
ingestion:
kafka:
brokers:
- "kafka-0:9092"
- "kafka-1:9092"
- "kafka-2:9092"
topic: "llm.telemetry"
group_id: "sentinel-consumer"
session_timeout_ms: 6000
enable_auto_commit: false
auto_offset_reset: "latest"
max_poll_records: 500 parsing:
max_text_length: 10000
enable_sanitization: true
sanitize_patterns:
- "password"
- "api_key"
- "secret"
validation:
min_latency_ms: 0.1
max_latency_ms: 300000.0 # 5 minutes
max_tokens: 100000
max_cost_usd: 100.0
enable_pii_detection: true
Detection configuration
detection:
enabled_detectors:
- "zscore"
- "iqr"
- "mad"
- "cusum" baseline:
window_size: 1000
min_samples: 10
update_interval_secs: 60
enable_persistence: true
persistence_path: "/var/lib/sentinel/baselines"
zscore:
threshold: 3.0
sensitivity: "medium" # low, medium, high
metrics:
- "latency_ms"
- "total_tokens"
- "cost_usd"
iqr:
multiplier: 1.5
metrics:
- "latency_ms"
- "total_tokens"
mad:
threshold: 3.5
metrics:
- "latency_ms"
cusum:
threshold: 5.0
drift: 0.5
metrics:
- "latency_ms"
Storage configuration
storage:
influxdb:
url: "http://influxdb:8086"
org: "sentinel"
token: "${INFLUXDB_TOKEN}"
telemetry_bucket: "telemetry"
anomaly_bucket: "anomalies"
batch_size: 100
timeout_secs: 10 cache:
max_capacity: 10000
ttl_secs: 300
tti_secs: 60
enable_metrics: true
redis:
enabled: true
url: "redis://redis:6379"
password: "${REDIS_PASSWORD}"
key_prefix: "sentinel:"
ttl_secs: 300
Alerting configuration
alerting:
rabbitmq:
url: "amqp://rabbitmq:5672"
username: "sentinel"
password: "${RABBITMQ_PASSWORD}"
exchange: "sentinel.alerts"
exchange_type: "topic"
routing_key_prefix: "alert"
persistent: true
timeout_secs: 10
retry_config:
max_attempts: 3
initial_delay_ms: 1000
backoff_multiplier: 2.0
max_delay_ms: 30000 webhook:
enabled: false
url: "https://alerts.example.com/webhook"
method: "POST"
secret: "${WEBHOOK_SECRET}"
timeout_secs: 10
max_retries: 3
retry_delay_ms: 1000
backoff_multiplier: 2.0
deduplication:
enabled: true
window_secs: 300
cleanup_interval_secs: 60
API configuration
api:
bind_addr: "0.0.0.0:8080"
enable_cors: true
cors_origins:
- "*"
timeout_secs: 30
max_body_size: 10485760 # 10MB
enable_logging: true
metrics_path: "/metrics"
`See config/sentinel.yaml for a complete annotated example.
$3
All sensitive configuration can be provided via environment variables:
`bash
export INFLUXDB_TOKEN="your-influxdb-token"
export RABBITMQ_PASSWORD="your-rabbitmq-password"
export REDIS_PASSWORD="your-redis-password"
export WEBHOOK_SECRET="your-webhook-secret"
export SENTINEL_LOG_LEVEL="info"
export SENTINEL_LOG_JSON="true"
export RUST_BACKTRACE="1"
`Performance
$3
Tested on AWS m5.xlarge (4 vCPU, 16GB RAM):
| Operation | Throughput | P50 | P95 | P99 |
|-----------|------------|-----|-----|-----|
| Ingestion | 10,500 events/sec | 2ms | 8ms | 15ms |
| Detection | 5,200 detections/sec | 3ms | 12ms | 25ms |
| Storage (batched) | 8,300 writes/sec | 5ms | 18ms | 35ms |
| API Queries | 2,800 req/sec | 8ms | 25ms | 48ms |
$3
| Configuration | Memory | CPU | Disk I/O |
|---------------|--------|-----|----------|
| Idle | 180MB | 0.1 cores | 1 MB/s |
| Light load (1k events/sec) | 320MB | 0.8 cores | 15 MB/s |
| Medium load (5k events/sec) | 520MB | 1.8 cores | 42 MB/s |
| Heavy load (10k events/sec) | 850MB | 2.9 cores | 78 MB/s |
$3
- Horizontal: Linear scaling up to 10 replicas
- Vertical: Efficient use of multi-core (up to 8 cores)
- Storage: InfluxDB handles 100k+ writes/sec with proper sizing
- Network: ~10 Mbps at 10k events/sec (depends on event size)
Development
$3
`
llm-sentinel/
βββ crates/
β βββ sentinel-core/ # Core types and error handling (1,350 lines)
β βββ sentinel-ingestion/ # Kafka consumer and OTLP parsing (1,390 lines)
β βββ sentinel-detection/ # Anomaly detection algorithms (2,319 lines)
β βββ sentinel-storage/ # InfluxDB and caching (987 lines)
β βββ sentinel-alerting/ # RabbitMQ and webhooks (1,645 lines)
β βββ sentinel-api/ # REST API server (1,452 lines)
βββ sentinel/ # Main binary (285 lines)
βββ config/ # Configuration examples
βββ deployments/ # Deployment configurations
β βββ grafana/ # 4 Grafana dashboards
β βββ prometheus/ # 50+ alert rules
βββ docs/ # Complete documentation (27 files)
βββ examples/ # Example producers
β βββ python/ # Python producer example
β βββ go/ # Go producer example
βββ helm/ # Helm chart
β βββ sentinel/ # Production-ready chart
βββ k8s/ # Kubernetes manifests (13 files)
βββ .github/workflows/ # CI/CD pipelines
βββ Cargo.toml # Rust workspace
βββ docker-compose.yaml # Local development environment
βββ Dockerfile # Multi-stage Docker build
`Total: ~9,500 lines of production Rust code
$3
`bash
All tests
cargo test --workspaceSpecific crate
cargo test -p sentinel-detectionWith output
cargo test -- --nocaptureIntegration tests only
cargo test --workspace --test '*'With coverage
cargo tarpaulin --out Html --output-dir coverage
`$3
`bash
Debug build
cargo buildRelease build (optimized)
cargo build --releaseWith specific features
cargo build --features redisDocker build
docker build -t llm-sentinel:latest .Multi-platform Docker build
docker buildx build --platform linux/amd64,linux/arm64 -t llm-sentinel:latest .
`$3
`bash
Format code
cargo fmtLint with clippy
cargo clippy -- -D warningsSecurity audit
cargo auditCheck dependencies
cargo deny check
`CI/CD Pipeline
Full GitHub Actions workflow:
$3
- Format check (rustfmt)
- Lint (clippy with all warnings as errors)
- Build (debug and release)
- Test (all workspaces)
- Coverage (tarpaulin with 80% target)
- Security audit (cargo-audit)
- Dependency check (cargo-deny)
$3
- Docker build (multi-stage)
- SBOM generation (syft)
- Image signing (cosign)
- Vulnerability scan (trivy)
- Push to registry (GHCR)
- Helm package (chart packaging)
- Deploy (to staging/production)
Contributing
We welcome contributions! Please see CONTRIBUTING.md for:
- Code of Conduct
- Development setup
- Coding standards
- Testing requirements
- Pull request process
- Security reporting
Documentation
Complete documentation available in
/docs`:- Architecture Overview
- Detection Methods
- Deployment Guide
- Integration Examples
- Performance Benchmarks
- Configuration Reference
This project is licensed under the Apache License 2.0 - see LICENSE file for details.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Complete docs
- Security: security@llm-devops.io
- [ ] Additional detection algorithms (Isolation Forest, AutoEncoder)
- [ ] Real-time dashboard (WebSocket streaming)
- [ ] Multi-tenant support
- [ ] Advanced query DSL
- [ ] Machine learning model drift detection
- [ ] Automated baseline tuning
- [ ] Support for additional message brokers (NATS, Pulsar)
- [ ] OpenTelemetry native ingestion
- [ ] Grafana plugin for custom visualizations
---
Status: β
Production Ready
Version: 0.1.0
License: Apache 2.0
Built with: Rust π¦
Last Updated: 2025-11-06