Real-time malware scanner for npm packages
npm install npm-malware-scanner


Real-time malware scanner for npm packages. Detects install scripts, shell access, obfuscated code, network access, filesystem access, and typosquatting attacks.
How many hours did you spend? Roughly 5 hours.
Did you have adequate time to work on the code submission? For an alpha version, I think so.
Did you use any AI coding tools to assist with coding? Yes, ChatGPT and Claude.
Did you leverage external resources? Yes. This project was built using industry best practices and research from security experts. Google, StackOverflow, and reference documentation were also used for research.
Supply Chain Security:
- Socket.dev Documentation - Alert types and detection strategies
- npm Security Best Practices - Understanding npm security model
- OWASP Top 10 for CI/CD - CI/CD security risks
Static Analysis Techniques:
- Babel Parser Documentation - AST parsing for JavaScript/TypeScript
- ESLint Source Code - Pattern matching and code analysis techniques
- Shannon Entropy) - Obfuscation detection using information theory
Typosquatting Research:
- Levenshtein Distance Algorithm - String similarity measurement
- Typosquatting on PyPI - Academic research on package name attacks
- npm Typosquatting Attacks - Real-world examples
npm Registry APIs:
- npm Registry API - Package metadata and download
- CouchDB Changes Feed - Real-time monitoring
Notable CVEs & Attacks:
- CVE-2021-44906 - Minimist prototype pollution
- event-stream incident - Malicious dependency injection
- ua-parser-js attack - Cryptocurrency miner in popular package
Why These Resources?
- Socket.dev - Understand the product we're building towards
- Academic papers - Proven algorithms for typosquat detection
- Real CVEs - Learn from actual attacks to build better detectors
- npm APIs - Official documentation for reliable integration
- Open source projects - Learn from battle-tested implementations (ESLint, Babel)
``bashGlobal installation
npm install -g npm-malware-scanner
Usage
$3
`bash
npm-scanner Examples
npm-scanner express 4.18.2
npm-scanner axios 1.6.0
`$3
Monitor the npm registry feed in real-time:
`bash
npm-scanner --live
`$3
The scanner automatically detects CI/CD environments and adapts output format.
GitHub Actions:
`yaml
- name: Security Scan
run: npm-scanner express 4.18.2
`Other CI Systems:
`bash
CI=true npm-scanner express 4.18.2
`See CI-CD-INTEGRATION.md for detailed integration guides.
Detection Capabilities
$3
Identifies packages with lifecycle scripts that execute arbitrary code:
- \preinstall\, \install\, \postinstall\
- \preuninstall\, \uninstall\, \postuninstall\Severity: High
$3
Detects packages making network requests:
- Node.js modules: \http\, \https\, \net\, \dgram\, \dns\
- Browser APIs: \fetch\, \XMLHttpRequest\, \WebSocket\, \EventSource\
- Popular libraries: \axios\, \node-fetch\, \got\, \superagent\, \request\Severity: Medium
$3
Identifies packages with names similar to popular packages using Levenshtein distance.Severity: High
Architecture
`
src/
├── cli.ts # CLI entry point
├── scanner.ts # Scan orchestration
├── types.ts # TypeScript interfaces
├── detectors/
│ ├── install-scripts.ts # Lifecycle script detection
│ ├── network-access.ts # Network access detection (AST + regex)
│ └── typosquat.ts # Typosquat detection
├── npm/
│ ├── registry.ts # Package fetching & extraction
│ └── feed.ts # Live feed monitoring
└── utils/
├── logger.ts # Output formatting
└── environment.ts # CI/CD detection
`Design Decisions
$3
Choice: Analyze code without execution
Rationale: Safe, fast (~500ms per package), effective for most threats
Tradeoff: Cannot detect runtime behavior or heavily obfuscated code$3
Choice: Combine AST parsing with regex patterns
Rationale: AST for accuracy, regex for obfuscated/dynamic code
Tradeoff: Slightly slower but more comprehensive$3
Choice: Compare only against top npm packages
Rationale: Fast, practical, low false positives
Tradeoff: Misses typosquats of less popular packagesExtending the Scanner
$3
Create a detector file:
`typescript
// src/detectors/my-detector.ts
import { Alert, DetectorResult } from '../types';export class MyDetector {
static async detect(packagePath: string): Promise {
const alerts: Alert[] = [];
// Your detection logic
return { alerts };
}
}
`Register in \
src/scanner.ts\:`typescript
import { MyDetector } from './detectors/my-detector';const [installScriptResult, networkAccessResult, typosquatResult, myResult] =
await Promise.all([
InstallScriptDetector.detect(packageInfo.extractedPath),
NetworkAccessDetector.detect(packageInfo.extractedPath),
TyposquatDetector.detect(packageName),
MyDetector.detect(packageInfo.extractedPath), // Add here
]);
alerts.push(...myResult.alerts);
`$3
`bash
Clone and setup
git clone https://github.com/socket-security/npm-scanner
cd npm-scanner
pnpm installBuild
pnpm buildRun tests
pnpm testTest with coverage
pnpm test:coverageTest a package
pnpm start express 4.18.2Test in CI mode
CI=true pnpm start express 4.18.2
`$3
The project includes comprehensive unit tests for all detectors:
`bash
Run all tests
pnpm testWatch mode
pnpm test:watchCoverage report
pnpm test:coverage
``Test Coverage:
- Install script detection
- Shell access detection (child_process, exec, spawn)
- Obfuscation detection (entropy analysis)
- Network access detection (http, fetch, axios, etc.)
- Filesystem access detection (fs module operations)
- Typosquat detection (Levenshtein distance)
- Edge cases and error handling
- Static analysis only - Cannot detect runtime behavior
- No dependency scanning - Only scans the target package
- Obfuscation - Heavily obfuscated code may evade detection
- False positives - Legitimate packages may trigger alerts (e.g., HTTP clients)
- Single package scan: 500ms - 2s
- Network detection: 100-500ms
- Typosquat check: ~50ms
- Live mode throughput: 1-2 packages/second
Contributions welcome! Areas of interest:
- New detectors (shell access, crypto mining, data exfiltration)
- Performance improvements
- Better obfuscation detection
- Additional CI/CD integrations
MIT