# @gulibs/safe-coder-cli

Standalone CLI tool for documentation crawling with SPA support, error detection, and code validation.

```bash
npm install @gulibs/safe-coder-cli
```
@gulibs/safe-coder-cli is an independent command-line tool that crawls documentation websites and generates structured output. It supports both static sites and Single Page Applications (SPAs) using browser automation.
This CLI is designed to work standalone or as part of the Safe Coder ecosystem, where it is invoked by the `@gulibs/safe-coder` MCP Server.
## Features

- HTTP & Browser Crawling: Supports both static HTTP crawling and browser-based rendering for SPAs
- Intelligent Content Extraction: Cleans and structures documentation content
- Parallel Processing: Multi-worker support for faster crawling
- Progress Reporting: Real-time progress updates via stderr
- JSON Output: Machine-readable JSON output for programmatic use
- Skill Generation: Generates AI-ready SKILL files from documentation
- Checkpoint Support: Resume interrupted crawls
- Proxy Support: Configure HTTP/HTTPS proxies
## Installation

```bash
npm install -g @gulibs/safe-coder-cli
```
Or using yarn:
```bash
yarn global add @gulibs/safe-coder-cli
```
Or using pnpm:
```bash
pnpm add -g @gulibs/safe-coder-cli
```
Verify the installation:

```bash
safe-coder-cli --version
safe-coder-cli --help
```
## Usage

```bash
safe-coder-cli crawl https://react.dev
```
```bash
# Limit pages and depth
safe-coder-cli crawl https://react.dev --max-pages 50 --max-depth 3
```
### JSON Output

```bash
# Output machine-readable JSON
safe-coder-cli crawl https://react.dev --output-format json

# Capture output to file
safe-coder-cli crawl https://react.dev --output-format json > output.json
```

## Command Reference
### crawl
Crawl documentation website and optionally generate skill file.
#### Options
- -c, --config - Path to configuration file
- -b, --browser - Browser type: puppeteer | playwright
- -d, --max-depth - Maximum crawl depth (default: 3)
- -p, --max-pages - Maximum number of pages to crawl (default: 50)
- -w, --workers - Number of parallel workers (default: 1)
- --spa-strategy - SPA strategy: smart | auto | manual (default: smart)
- -o, --output-dir - Output directory for skill files
- -f, --filename - Skill name for directory and file names
- --checkpoint - Enable checkpoint/resume functionality
- --resume - Resume from last checkpoint if available
- --rate-limit - Delay in milliseconds between requests (default: 500)
- --output-format - Output format: json | pretty (default: pretty)
- --include-paths - Additional path patterns to include (comma-separated)
- --exclude-paths - Path patterns to exclude (comma-separated)

### detect-errors
Detect errors and warnings in code files.
```bash
safe-coder-cli detect-errors ./src/app.ts
safe-coder-cli detect-errors ./src/app.ts --format json
```

### validate-code
Validate and optionally fix code errors.
```bash
safe-coder-cli validate-code ./src/app.ts
safe-coder-cli validate-code ./src/app.ts --output ./src/app.fixed.ts
```

## Configuration File
Create a `.doc-crawler.json` file in your project root:

```json
{
"browser": "puppeteer",
"spaStrategy": "smart",
"crawl": {
"maxDepth": 3,
"maxPages": 200,
"workers": 5,
"rateLimit": 300,
"checkpoint": {
"enabled": true,
"interval": 50
}
},
"proxy": "http://127.0.0.1:7890"
}
```

## Output Format
### Result Output

When using `--output-format json`, the CLI outputs:

```json
{
"success": true,
"data": {
"source": {
"url": "https://react.dev",
"crawledAt": "2024-01-15T10:30:00.000Z",
"pageCount": 50,
"depth": 3
},
"pages": [
{
"url": "https://react.dev/learn",
"title": "Learn React",
"content": "...",
"wordCount": 1500,
"codeBlocks": 5,
"headings": ["Getting Started", "Components"]
}
],
"metadata": {
"technology": "react.dev",
"categories": ["tutorial", "api", "guide"]
},
"statistics": {
"totalPages": 50,
"maxDepthReached": 3,
"errors": 0
},
"skill": {
"skillMd": "...",
"quality": 85
}
}
}
```
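For programmatic use, the result envelope above can be unpacked with a few lines of Node. This is a sketch against the sample schema only; the inline object stands in for a real `output.json` file, and field names are taken from the example:

```javascript
// Sketch: unpack a crawl result that follows the schema shown above.
// In real use you would load it from a file, e.g.
//   const result = JSON.parse(require("fs").readFileSync("output.json", "utf8"));
// Here a trimmed inline sample stands in for the file.
const result = {
  success: true,
  data: {
    source: { url: "https://react.dev", pageCount: 50, depth: 3 },
    statistics: { totalPages: 50, maxDepthReached: 3, errors: 0 },
    skill: { skillMd: "...", quality: 85 },
  },
};

// Fail fast on unsuccessful crawls before touching `data`.
if (!result.success) {
  throw new Error("crawl failed");
}

const { statistics, skill } = result.data;
const summary = `Crawled ${statistics.totalPages} pages with ${statistics.errors} errors; skill quality ${skill.quality}`;
console.log(summary);
// → Crawled 50 pages with 0 errors; skill quality 85
```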
### Progress Output

Progress information is output to stderr in JSON format:
```json
{"type":"progress","message":"Crawled 10/50 pages","timestamp":"...","current":10,"total":50,"percentage":20}
```

## Browser Setup
For SPA crawling, you need Chrome/Chromium installed:
### macOS

```bash
brew install --cask google-chrome
```

### Windows

```bash
winget install Google.Chrome
```

### Linux

```bash
sudo apt install google-chrome-stable
```

### Custom Chrome Path

```bash
export CHROME_PATH=/path/to/chrome
```

## Environment Variables
- CHROME_PATH - Path to Chrome executable
- HTTP_PROXY - HTTP proxy URL
- HTTPS_PROXY - HTTPS proxy URL
- LOG_LEVEL - Log level (INFO, DEBUG, ERROR)

## Integration with MCP Server
The CLI is designed to be called by the `@gulibs/safe-coder` MCP Server. The MCP Server:

1. Checks if the CLI is installed
2. Spawns CLI with appropriate parameters
3. Monitors progress via stderr
4. Parses JSON output from stdout
5. Post-processes results and generates SKILL guidance
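The stderr monitoring step above can be sketched as a line-by-line JSON parse in Node. This is a hedged illustration, not the MCP Server's actual implementation; `parseProgressLine` is a hypothetical helper, and the spawn call in the comment is only indicative:

```javascript
// Parse one newline-delimited progress record from the CLI's stderr.
// Returns the event object for progress lines, or null for anything else.
function parseProgressLine(line) {
  try {
    const event = JSON.parse(line);
    return event && event.type === "progress" ? event : null;
  } catch {
    return null; // non-JSON stderr noise is ignored
  }
}

// Demo with the documented sample progress line:
const sample =
  '{"type":"progress","message":"Crawled 10/50 pages","timestamp":"...","current":10,"total":50,"percentage":20}';
const event = parseProgressLine(sample);
console.log(`${event.message} (${event.percentage}%)`);
// → Crawled 10/50 pages (20%)

// In real use you would attach the parser to a spawned CLI process, e.g.:
//   const child = require("child_process").spawn("safe-coder-cli",
//     ["crawl", url, "--output-format", "json"]);
//   child.stderr.on("data", chunk =>
//     chunk.toString().split("\n").forEach(line => parseProgressLine(line)));
```

Parsing stdout separately from stderr keeps the machine-readable result clean while still surfacing live progress.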
## Examples

### Basic Crawl

```bash
safe-coder-cli crawl https://docs.example.com --max-pages 30
```

### Parallel Crawling

```bash
safe-coder-cli crawl https://docs.example.com --workers 8 --max-pages 200
```

### SPA Crawling

```bash
safe-coder-cli crawl https://spa-site.com --spa-strategy auto --browser playwright
```

### Skill Generation

```bash
safe-coder-cli crawl https://react.dev \
--output-dir ~/.cursor/skills \
--filename react-docs \
--max-pages 100
```

### JSON Output with jq

```bash
safe-coder-cli crawl https://docs.example.com \
--output-format json \
  --max-pages 20 > output.json

# Process with jq
cat output.json | jq '.data.statistics'
```

## Troubleshooting
### Command Not Found

After installation, if `safe-coder-cli` is not found:

```bash
# Check npm global bin path
npm config get prefix

# Add to PATH if needed (macOS/Linux)
export PATH="$(npm config get prefix)/bin:$PATH"
```

### Chrome Not Found
If you see "Chrome/Chromium not found":
1. Install Chrome (see Browser Setup above)
2. Set the CHROME_PATH environment variable
3. Or install the full puppeteer package: npm install -g puppeteer

### Permission Errors
On Linux/macOS, you may need sudo for global installation:
```bash
sudo npm install -g @gulibs/safe-coder-cli
```

Or use a version manager like nvm to avoid sudo.

## Development
```bash
# Clone repository
git clone
cd safe-coder-cli

# Install dependencies
npm install

# Build
npm run build

# Link for local testing
npm link

# Test
safe-coder-cli --version
```

## License
MIT
## Related Projects

- `@gulibs/safe-coder` - MCP Server that orchestrates this CLI