High-performance SEO analysis CLI with 220+ checks, Rust-powered duplicate detection, worker thread parallelization, and clean progress UI. Analyzes 1000+ page sites in under 2 minutes. Achieves ~85% Screaming Frog parity.
```bash
npm install seo-reporter
```

A powerful TypeScript CLI tool that crawls websites, extracts SEO metadata, detects common issues, and generates comprehensive HTML reports. Think of it as a free, open-source alternative to Screaming Frog's core SEO analysis features.

```bash
# Install dependencies
pnpm install
```
## Features
### Core Capabilities
- 🕷️ Website Crawling: Breadth-first traversal with configurable depth and concurrency
- 🔍 220+ SEO Checks: Implements 85-90% parity with Screaming Frog's core analysis
- ⚡ Fast & Efficient: Concurrent crawling with configurable request limits and memory-efficient processing
- 📊 Interactive HTML Reports: Beautiful, filterable reports with severity-based issue categorization (now with working filters, sortable columns, default severity sort, live filtered totals, a new Issues-by-Type view with drill-down, and a reorganized nav with a Content dropdown)
- 💡 Actionable Tooltips: Hover over any issue to see specific fix recommendations (25+ guidance tips). Improved (i) tooltips now render consistently across the report.
- 📱 Mobile Responsive: All reports fully responsive with touch-friendly interfaces (44px tap targets)
- 🖥️ Built-in Server: Zero-dependency static file server using Node.js built-ins (`seo-reporter serve`)
- 🔌 JSON API Routes: Complete RESTful-style JSON API for all SEO data - perfect for integrations and custom dashboards
- 🧭 Clickable Site Structure: Navigate from the site structure tree directly to per-page details
- 📤 CSV Export: Export all data to Excel-compatible CSV files for further analysis
- 🤖 Robots.txt Integration: Automatic robots.txt parsing and compliance
- 🗺️ Sitemap Analysis: Auto-detects and analyzes XML sitemaps (runs by default)

### On-Page Data Extraction

- Comprehensive On-Page Data: Titles, descriptions, headings, canonical URLs, robots directives
- Social Media Tags: Open Graph, Twitter Cards
- Internationalization: hreflang attributes with validation
- Structured Data: JSON-LD & microdata extraction
- Links: Internal/external links with anchor text, nofollow detection
- Images: Alt text, dimensions tracking
- Content Metrics: Word count, text length, HTML size, content-to-code ratio
- Performance: Response times, redirect chains
- Scripts: External JavaScript detection with async/defer tracking

### Security Analysis

- Protocol Security: HTTP vs HTTPS detection
- Mixed Content: HTTP resources on HTTPS pages
- Insecure Forms: Form actions over HTTP
- Security Headers: HSTS, CSP, X-Frame-Options, X-Content-Type-Options, Referrer-Policy (see the sketch after this list)
- Protocol-Relative URLs: Detection of // URLs
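
As a rough illustration of what a header check like this involves, here is a minimal TypeScript sketch; the header list and issue shape are simplified stand-ins, not the analyzer's actual types or rules.

```ts
// Illustrative only - the analyzer's real issue types and rules differ.
interface HeaderIssue {
  header: string;
  severity: "medium" | "low";
  message: string;
}

const EXPECTED_SECURITY_HEADERS = [
  "strict-transport-security",
  "content-security-policy",
  "x-frame-options",
  "x-content-type-options",
  "referrer-policy",
];

function checkSecurityHeaders(headers: Record<string, string | undefined>): HeaderIssue[] {
  // Header names are case-insensitive, so normalize before looking them up.
  const present = new Set(Object.keys(headers).map((name) => name.toLowerCase()));
  return EXPECTED_SECURITY_HEADERS.filter((h) => !present.has(h)).map((h) => ({
    header: h,
    severity: "medium",
    message: `Missing security header: ${h}`,
  }));
}

// Example with an axios-style response.headers object:
console.log(checkSecurityHeaders({ "content-type": "text/html", "x-frame-options": "DENY" }));
```
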
### URL Quality Checks

- URL Issues: Multiple slashes, spaces, non-ASCII characters, uppercase letters
- URL Structure: Repetitive paths, overly long URLs (>2083 chars)
- Parameters: Query params, tracking params, internal search URLs
- Fragment URLs: Detection of fragment-only links

### Content Quality

- Duplicate Detection:
  - Exact duplicates (SHA-256 hash-based)
  - Near duplicates (>90% similarity using MinHash)
  - Duplicate H1/H2 across pages
- Content Analysis:
  - Lorem ipsum detection
  - Soft 404 detection
  - Readability metrics (Flesch-Kincaid Grade, Reading Ease, ARI) - see the sketch after this list
  - Poor readability warnings (>12th grade level)
  - Thin content detection (<300 words)
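
For reference, the Flesch-Kincaid Grade Level behind the readability warning is a standard formula; the TypeScript sketch below shows the calculation with a rough syllable heuristic (the tool's own implementation may count syllables differently).

```ts
// Rough Flesch-Kincaid Grade Level; the syllable counter is a heuristic, not exact.
function countSyllables(word: string): number {
  const w = word.toLowerCase().replace(/[^a-z]/g, "");
  if (w.length <= 3) return 1;
  const vowelGroups = w.replace(/e$/, "").match(/[aeiouy]+/g); // drop a trailing silent "e"
  return Math.max(1, vowelGroups ? vowelGroups.length : 1);
}

function fleschKincaidGrade(text: string): number {
  const sentences = Math.max(1, (text.match(/[.!?]+/g) ?? []).length);
  const words = text.split(/\s+/).filter(Boolean);
  const wordCount = Math.max(1, words.length);
  const syllables = words.reduce((sum, w) => sum + countSyllables(w), 0);
  // Standard formula: 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
  return 0.39 * (wordCount / sentences) + 11.8 * (syllables / wordCount) - 15.59;
}

// A grade above ~12 is what triggers the "poor readability" warning.
console.log(fleschKincaidGrade("The quick brown fox jumps over the lazy dog."));
```
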
### On-Page SEO Checks

- Titles: Missing, duplicate, too long/short (characters + pixel width), multiple tags, outside `<head>`, identical to H1
- Meta Descriptions: Missing, duplicate, too long/short (characters + pixel width), multiple tags, outside `<head>`
- Headings: Missing H1, multiple H1, broken hierarchy, overly long (>70 chars), empty headings
- Canonical: Multiple/conflicting tags, relative URLs, fragments, outside `<head>`, invalid attributes
- Robots: Conflicting directives, noindex/nofollow detection
- Indexability Tracking: Comprehensive analysis of why pages are/aren't indexable
  - Non-200 status codes (404, 500, etc.)
  - noindex in meta robots tag
  - noindex in X-Robots-Tag header
  - Canonical pointing to different URL
  - Detailed reasons shown in Issues tab and individual page reports
- Pagination: rel="next"/rel="prev" validation, multiple pagination links, sequence errors

### Stealth Mode (Anti-Detection)

- Anti-Detection Crawling: Bypass basic bot detection systems with realistic browser simulation
- User Agent Rotation: 20+ realistic user agents from Chrome, Firefox, Safari, Edge across Windows, macOS, and Linux
- Header Randomization: Dynamic browser headers with realistic patterns and Chrome sec-ch-ua headers
- Human-Like Timing: Intelligent delays (1-8 seconds) simulating quick, normal, and slow browsing patterns (see the sketch after this list)
- Proxy Support: Rotate through multiple proxy servers with automatic failover and validation
- Session Management: Maintain consistent headers across requests for realistic browsing simulation
- Custom Configuration: Define your own user agents, proxies, and timing patterns
- Seamless Integration: Works with all existing crawl modes and analysis features
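
A simplified TypeScript sketch of the user-agent rotation and human-like delay idea is shown below; the agent pool, delay ranges, and function names are illustrative, not the tool's internals.

```ts
// Simplified stealth-style pacing - agent list, ranges, and names are illustrative.
const USER_AGENTS = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15",
  "Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0",
];

function randomUserAgent(): string {
  return USER_AGENTS[Math.floor(Math.random() * USER_AGENTS.length)];
}

// Wait a random, human-looking interval (defaults mirror the 1-8 second range described above).
function humanDelay(minMs = 1000, maxMs = 8000): Promise<void> {
  const ms = minMs + Math.random() * (maxMs - minMs);
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function fetchWithStealth(url: string): Promise<string> {
  await humanDelay();
  const res = await fetch(url, {
    headers: { "User-Agent": randomUserAgent(), Accept: "text/html" },
  });
  return res.text();
}
```
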
### Link Analysis & 404 Tracking

- 404 Tracking:
  - Dedicated 404 Pages tab with referrer tracking
  - Shows which pages link to each 404 (now normalized so /path and /path/ are treated the same)
  - Helps identify and fix broken internal links
- Link Quality:
  - Orphan pages (no internal inlinks)
  - Dead ends (no outlinks)
  - Weak anchor text ("click here", empty, too short)
  - Localhost links (127.0.0.1)
  - Missing protocol on external links (e.g., facebook.com without https://)
- Link Metrics:
  - Internal vs external link counts
  - High outlink warnings (>100 internal, >50 external)
  - Inlink count per page
  - Crawl depth distribution

### HTML Validation

- Document Structure: Missing/multiple `<head>` or `<body>` tags
- Element Positioning: Tags outside `<head>` that should be inside it
- Document Order: Incorrect `<head>`/`<body>` ordering
- Size & Complexity: Large HTML (>1MB), excessive DOM depth (>30 levels)
- Invalid Elements: Elements that shouldn't be in `<head>`

### Issue Severity Levels

- 🔴 High Severity: Missing titles/H1s, HTTP pages, mixed content, insecure forms, soft 404s, lorem ipsum, malformed HTML
- 🟡 Medium Severity: Title/description length issues, multiple H1s, images without alt, thin content, slow pages, missing security headers
- 🔵 Low Severity: Heading hierarchy, redirect chains, URL quality issues, readability warnings, informational notices

## Performance ⚡
SEO Reporter includes a Rust-powered native module for near-duplicate content detection, providing massive performance gains for large sites:
### Benchmarks
| Pages | TypeScript (O(n²)) | Rust + LSH (O(n)) | Speedup |
|-------|-------------------|-------------------|---------|
| 100 | ~10s | ~0.1s | 100x |
| 500 | ~2.5min | ~0.5s | 300x |
| 1000 | ~10min | ~1s | 600x |
| 5000 | ~4 hours | ~5s | ~3000x |
### How It Works
The Rust module uses Locality-Sensitive Hashing (LSH) with MinHash signatures:
- Generates 128-hash MinHash signatures for each page
- Groups pages into buckets using 16 bands × 8 rows
- Only compares pages that share at least one bucket (candidates)
- Reduces comparisons from O(n²) to O(n) with ~95% accuracy (see the sketch below)
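
The production path for this is the Rust module, but the banding idea can be sketched in TypeScript; the hashing below is deliberately simplistic and only meant to show how band keys turn pairwise comparisons into bucket lookups.

```ts
// Illustrative LSH banding over MinHash signatures - not the Rust module's actual code.
const NUM_HASHES = 128;
const BANDS = 16;                // 16 bands x 8 rows = 128 hash functions
const ROWS = NUM_HASHES / BANDS;

// Cheap seeded string hash, good enough for a demo.
function hashToken(token: string, seed: number): number {
  let h = seed;
  for (let i = 0; i < token.length; i++) {
    h = Math.imul(h ^ token.charCodeAt(i), 2654435761) >>> 0;
  }
  return h >>> 0;
}

// MinHash signature: for each of the 128 hash functions, keep the minimum token hash.
function minHashSignature(text: string): number[] {
  const tokens = text.toLowerCase().split(/\s+/).filter(Boolean);
  return Array.from({ length: NUM_HASHES }, (_, seed) =>
    tokens.reduce((min, t) => Math.min(min, hashToken(t, seed + 1)), Number.MAX_SAFE_INTEGER)
  );
}

// Pages whose signatures agree on every row of at least one band share a bucket
// and become candidate pairs; everything else is never compared.
function candidatePairs(pages: { url: string; text: string }[]): Array<[string, string]> {
  const buckets = new Map<string, string[]>();
  for (const page of pages) {
    const sig = minHashSignature(page.text);
    for (let b = 0; b < BANDS; b++) {
      const key = `${b}:${sig.slice(b * ROWS, (b + 1) * ROWS).join(",")}`;
      const bucket = buckets.get(key) ?? [];
      bucket.push(page.url);
      buckets.set(key, bucket);
    }
  }
  const seen = new Set<string>();
  const pairs: Array<[string, string]> = [];
  for (const urls of buckets.values()) {
    for (let i = 0; i < urls.length; i++) {
      for (let j = i + 1; j < urls.length; j++) {
        const id = `${urls[i]}|${urls[j]}`;
        if (!seen.has(id)) {
          seen.add(id);
          pairs.push([urls[i], urls[j]]);
        }
      }
    }
  }
  return pairs;
}
```
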
### Automatic Fallback
If the Rust module fails to load (unsupported platform or not built), the tool automatically falls back to the pure TypeScript implementation, ensuring compatibility on all platforms.
⚠️ Rust Warning: When the Rust module is unavailable, the CLI displays a clear warning:

```
⚠️  Rust native module not available - using TypeScript fallback for near-duplicate detection
    Note: Near-duplicate detection will be slower. Run npm rebuild to build the Rust module.
```

This helps users understand why near-duplicate detection may be slower and provides actionable instructions to enable the faster implementation.
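
A fallback like this is typically wired as a guarded require of the native binding. The sketch below uses a placeholder module path and a trivial stand-in fallback, so treat it as a pattern rather than the package's real loader.

```ts
// Pattern sketch: prefer the native binding, fall back to TypeScript. Paths and names are placeholders.
import { createRequire } from "node:module";

type NearDuplicateFn = (docs: string[]) => Array<[number, number]>;

function loadNearDuplicateDetector(): NearDuplicateFn {
  const require = createRequire(import.meta.url);
  try {
    // Hypothetical binding path - the real package resolves a platform-specific .node file.
    const native = require("./native/seo-reporter-rust.node");
    return native.findNearDuplicates as NearDuplicateFn;
  } catch {
    console.warn(
      "⚠️  Rust native module not available - using TypeScript fallback for near-duplicate detection"
    );
    // Stand-in O(n²) fallback: shape only, real similarity scoring omitted.
    return (docs) => {
      const pairs: Array<[number, number]> = [];
      for (let i = 0; i < docs.length; i++) {
        for (let j = i + 1; j < docs.length; j++) {
          if (docs[i] === docs[j]) pairs.push([i, j]);
        }
      }
      return pairs;
    };
  }
}
```
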
### Supported Platforms
Pre-built Rust binaries are included for:
- macOS (Intel x64 & Apple Silicon ARM64)
- Linux (x64 & ARM64, glibc & musl)
- Windows (x64)
## Installation
### Prerequisites
- Node.js 18 or higher
- pnpm (recommended) or npm
- Rust 1.70+ (optional, only needed if building from source; pre-built binaries included)
### Install Dependencies
```bash
pnpm install
```

### Optional: Install Rust
For maximum performance, install Rust to enable the native module:
```bash
# Automatic Rust installation (Windows, macOS, Linux)
pnpm setup:rust
```

This will:
- Download and install rustup (Rust toolchain installer)
- Install the latest stable Rust toolchain
- Set up the environment for building the native module
- Verify the installation
### Build the Project
```bash
pnpm build
```

### Building the Rust Module
The Rust module provides 100-1000x faster near-duplicate detection. The build process automatically handles Rust environment setup:
```bash
# Install Rust automatically (Windows, macOS, Linux)
pnpm setup:rust

# Full build (Rust + TypeScript)
pnpm build

# Rust module only
pnpm build:rust-only

# TypeScript only (fallback if Rust unavailable)
pnpm build:ts-only
```

Note: The build scripts automatically source the Rust environment (`$HOME/.cargo/env`) if available. Pre-built binaries are included for most platforms, but you can rebuild if needed.

### Running Locally
```bash
# Development mode (no build required)
pnpm dev --url https://example.com

# Production mode (requires build)
pnpm start --url https://example.com
```

### Global Installation
```bash
pnpm install -g .
seo-reporter --url https://example.com
```

## Usage

### Quick Start
```bash
# Crawl and analyze a website
seo-reporter crawl --url https://example.com

# Or use the legacy format (still supported)
seo-reporter --url https://example.com

# Start the report server (no URL needed)
seo-reporter serve
```

### Command-Line Options
```bash
# Crawl command
seo-reporter crawl \
  --url https://example.com \                     # Required: Target URL to crawl
  --depth 3 \                                     # Optional: Max crawl depth (default: 3)
  --max-pages 1000 \                              # Optional: Max pages to crawl (default: 1000)
  --concurrency 10 \                              # Optional: Concurrent requests (default: 10)
  --output ./seo-report \                         # Optional: Output directory (default: ./seo-report)
  --timeout 10000 \                               # Optional: Request timeout in ms (default: 10000)
  --user-agent "CustomBot/1.0" \                  # Optional: Custom user agent
  --export-csv \                                  # Optional: Export results to CSV files
  --respect-robots \                              # Respect robots.txt (default: true)
  --ignore-robots \                               # Ignore robots.txt rules
  --crawl-mode both \                             # Optional: Crawl mode - crawl|sitemap|both (default: both)
  --sitemap-url https://example.com/sitemap.xml \ # Custom sitemap URL
  --validate-schema \                             # Validate JSON-LD schema.org data
  --stealth \                                     # Enable stealth mode with randomized headers and timing
  --stealth-user-agents "Agent1,Agent2" \         # Custom user agents for stealth mode
  --stealth-min-delay 1000 \                      # Minimum delay between requests in stealth mode (ms)
  --stealth-max-delay 5000 \                      # Maximum delay between requests in stealth mode (ms)
  --stealth-proxies "proxy1:8080,proxy2:3128"     # Proxy rotation for stealth mode

# Serve command
seo-reporter serve \
  --port 8080 \    # Optional: Port to listen on (default: 8080)
  ./seo-report     # Optional: Directory to serve (default: ./seo-report)
```

### Crawl Modes
The `--crawl-mode` option controls how the tool discovers pages:

- `crawl` - Follow links only (traditional crawling)
- `sitemap` - Crawl only URLs found in sitemap(s)
- `both` - Crawl sitemap URLs + follow links (default, discovers maximum pages)

Example:
```bash
# Only crawl URLs from sitemap
seo-reporter --url https://example.com --crawl-mode sitemap

# Traditional link-based crawling only
seo-reporter --url https://example.com --crawl-mode crawl

# Both (default)
seo-reporter --url https://example.com --crawl-mode both
```

### Viewing Reports
After generating a report, you can view it in two ways:
Option 1: Open directly in browser

```bash
open seo-report/index.html
```

Option 2: Start a local server (recommended)

```bash
# Using the built-in server
seo-reporter serve seo-report

# Or with a custom port
seo-reporter serve seo-report --port 3000

# Using npm script
pnpm serve   # Serves ./seo-report on port 8080
```

The built-in server uses Node.js's native `http` module (zero dependencies, works everywhere).
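
To illustrate the zero-dependency approach, here is a minimal static server built on `node:http`; it is a sketch of the same idea, not the serve command's actual code (the MIME table and path handling are simplified).

```ts
// Minimal static file server in the spirit of `seo-reporter serve` - illustrative only.
import { createServer } from "node:http";
import { readFile } from "node:fs/promises";
import { extname, join, normalize } from "node:path";

const root = process.argv[2] ?? "./seo-report";
const port = Number(process.env.PORT ?? 8080);

const MIME: Record<string, string> = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".json": "application/json",
  ".css": "text/css",
  ".csv": "text/csv",
};

createServer(async (req, res) => {
  // Map "/" to index.html and strip leading "../" so requests stay inside the report directory.
  const urlPath = decodeURIComponent((req.url ?? "/").split("?")[0]);
  const relative = normalize(urlPath).replace(/^([/\\]|\.\.[/\\])+/, "");
  const filePath = join(root, relative === "" ? "index.html" : relative);
  try {
    const body = await readFile(filePath);
    res.writeHead(200, { "Content-Type": MIME[extname(filePath)] ?? "application/octet-stream" });
    res.end(body);
  } catch {
    res.writeHead(404, { "Content-Type": "text/plain" });
    res.end("Not found");
  }
}).listen(port, () => console.log(`Serving ${root} at http://localhost:${port}`));
```
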
### Progress Output

When running the tool, you'll see detailed progress for each phase:
```bash
$ seo-reporter --url https://example.com --max-pages 100

🔍 SEO Reporter

Configuration:
  URL: https://example.com/
  Max Depth: 3
  Max Pages: 100
  Concurrency: 10
  Output: ./seo-report

⠹ Crawling website... 🟢 25/100 pages
⠸ Crawling website... 🟢 50/100 pages
⠼ Crawling website... 🟢 100/100 pages
✔ Crawled 100 pages in 15.2s

⠹ Parsing SEO metadata... 25/100 pages
⠸ Parsing SEO metadata... 50/100 pages
⠼ Parsing SEO metadata... 100/100 pages
✔ Parsed metadata from 100 pages in 3.4s

⠹ Analyzing... Per-page analysis (25/100)
⠸ Analyzing... Per-page analysis (50/100)
⠼ Analyzing... Per-page analysis (100/100)
⠴ Analyzing... Link quality analysis
⠦ Analyzing... Content quality checks
⠧ Analyzing... Finding duplicate titles/descriptions
⠇ Analyzing... Finding exact duplicate content
⠏ Analyzing... Finding near-duplicate content (Rust + LSH)
⠋ Analyzing... Finding duplicate headings
✔ Analysis complete in 2.1s
✔ Sitemap analyzed (95 URLs in sitemap)

📋 Issues Found:
  ⚠️ 5 pages with missing meta descriptions
  ⚠️ 3 pages with duplicate titles
  ...
```

Note: The progress counters (e.g., 25/100) show real-time progress during crawling, parsing, and analysis phases, making it easy to estimate remaining time.

### Example Commands
Crawl a small site with depth 2:
```bash
seo-reporter --url https://myblog.com --depth 2 --max-pages 100
```

Fast crawl with high concurrency:

```bash
seo-reporter --url https://example.com --concurrency 20 --depth 2
```

Crawl and save to custom directory:

```bash
seo-reporter --url https://example.com --output ./reports/example-audit
```

Crawl with CSV export:

```bash
seo-reporter --url https://example.com --export-csv
```

Stealth mode crawling:

```bash
# Basic stealth mode
seo-reporter --url https://example.com --stealth

# Stealth with custom timing
seo-reporter --url https://example.com --stealth --stealth-min-delay 2000 --stealth-max-delay 8000

# Stealth with custom user agents and proxies
seo-reporter --url https://example.com --stealth \
  --stealth-user-agents "Mozilla/5.0 (Custom Bot),Another Custom Agent" \
  --stealth-proxies "proxy1.example.com:8080,proxy2.example.com:3128"
```

## SEO Issues Detected
The tool checks for the following SEO issues:
### High Severity (Errors)
- ❌ Missing Title Tags: Pages without a `<title>` tag
- ❌ Broken Links: Pages returning 404 status codes
- ❌ Conflicting Robots Directives: Multiple robots tags with contradictory instructions (e.g., "index" and "noindex")
- ❌ Multiple Canonical Tags: Pages with conflicting canonical URLs
- ❌ Malformed JSON-LD: Structured data scripts with JSON parsing errors

### Medium Severity (Warnings)

- ⚠️ Missing Meta Descriptions: Pages without meta description tags
- ⚠️ Duplicate Titles: Multiple pages sharing the same title text
- ⚠️ Duplicate Descriptions: Multiple pages sharing the same meta description
- ⚠️ Title Too Long: Titles over 60 characters (may be truncated in search results)
- ⚠️ Title Too Short: Titles under 20 characters (may not be descriptive enough)
- ⚠️ Description Too Long: Meta descriptions over 160 characters (may be truncated)
- ⚠️ Description Too Short: Meta descriptions under 50 characters (may not be informative enough)
- ⚠️ Missing H1 Tags: Pages without an H1 heading
- ⚠️ Multiple H1 Tags: Pages with more than one H1 heading
- ⚠️ Improper Heading Hierarchy: Heading levels that skip numbers (e.g., H1 to H3)
- ⚠️ Images Without Alt Text: Images missing accessibility alt attributes
- ⚠️ Thin Content: Pages with less than 300 words
- ⚠️ Slow Page Load: Pages with response times over 3 seconds

### Low Severity (Informational)

- ℹ️ Noindex Pages: Pages set to noindex (verify if intentional)
- ℹ️ Redirect Chains: Pages with redirect chains detected
- ℹ️ Multiple Title/Description Tags: Single page with duplicate meta tags

### Additional Analysis

- All headings (H1-H6) extracted and analyzed for proper hierarchy
- Internal vs external link analysis
- Image alt text coverage
- Word count and content density metrics
- Open Graph and Twitter Card metadata presence
- hreflang implementation
- JSON-LD and microdata structured data detection

## Output Reports
The tool generates comprehensive reports in the specified output directory:
### HTML Reports
- index.html: Interactive summary report with:
  - Tabbed Interface: Overview, All Pages, Site Structure, Links, Content, Performance, Scripts, Sitemap, Issues, and API tabs
  - Sortable Tables: Click column headers to sort data ascending/descending
  - Filterable Content: Search boxes to quickly find specific pages, links, or issues
  - Visual Statistics: Color-coded cards showing issue counts and severity
  - All Data: Links analysis (internal/external), headings, images, performance metrics
- page-viewer.html: Dynamic page detail viewer that loads data from JSON files
  - Displays complete page metadata, issues, headings, links, and images
  - Loads data on-demand from JSON API routes
  - Accessed via `page-viewer.html?url=<page-url>`

Reports are fully self-contained with inline CSS and JavaScript - no external dependencies.
### JSON API Routes
All SEO data is available as JSON files for programmatic access, integrations, or custom dashboards:
#### Individual Page Data
- `json/pages/{filename}.json`: Complete page metadata including:
  - Title, meta description, canonical URL, robots directives
  - All headings (H1-H6), links, and images
  - Content metrics (word count, HTML size, readability scores)
  - Performance metrics (response time, redirects)
  - Security analysis (HTTPS, headers, mixed content)
  - URL quality metrics
  - Structured data (JSON-LD, microdata)
  - All detected issues with severity levels
- `json/issues/{filename}.json`: Page-specific issues with severity counts

#### Aggregate Data Endpoints
- `json/all-pages.json`: Summary of all pages with key metrics
- `json/all-issues.json`: All issues across all pages
- `json/issues-summary.json`: Issues statistics by severity and type
- `json/links.json`: All internal and external links
- `json/images.json`: All images with alt text status
- `json/headings.json`: All headings with levels
- `json/performance.json`: Performance metrics for all pages
- `json/external-scripts.json`: External JavaScript usage analysis
- `json/404-pages.json`: 404 pages with referrer tracking
- `json/sitemap-info.json`: Sitemap analysis data
- `json/site-structure.json`: Site structure tree
- `json/url-index.json`: URL to filename mapping for easy lookups

#### Using the JSON API
```bash
# Generate report
seo-reporter --url https://example.com

# Access JSON data programmatically
curl http://localhost:8000/seo-report/json/all-issues.json
curl http://localhost:8000/seo-report/json/pages/index.json
curl http://localhost:8000/seo-report/json/issues-summary.json
```

```js
// Or use in your application
fetch('./seo-report/json/all-pages.json')
  .then(res => res.json())
  .then(data => console.log(data.pages));
```

#### In-Report API Tab
Open the API tab in `index.html` for an at-a-glance list of endpoints, example curl/JS usage, and tips on mapping URLs to filenames via `json/url-index.json`. The tab now reliably renders with the updated tab switching logic.

The JSON API is perfect for:

- CI/CD pipeline integrations (see the sketch after this list)
- Custom dashboards and visualizations
- Automated monitoring and alerts
- Data analysis and reporting scripts
- Integration with other SEO tools
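
As an example of a CI gate, the script below fails a build when high-severity issues are present. It assumes `json/issues-summary.json` exposes per-severity counts, so check the generated file for the actual field names before relying on it.

```ts
// CI gate sketch: fail the build when the crawl reports high-severity issues.
// Assumes json/issues-summary.json exposes per-severity counts - verify the real field names.
import { readFile } from "node:fs/promises";

interface IssuesSummary {
  high?: number;
  medium?: number;
  low?: number;
}

const summaryPath = process.argv[2] ?? "./seo-report/json/issues-summary.json";
const summary: IssuesSummary = JSON.parse(await readFile(summaryPath, "utf8"));

const high = summary.high ?? 0;
console.log(`High: ${high}  Medium: ${summary.medium ?? 0}  Low: ${summary.low ?? 0}`);

if (high > 0) {
  console.error(`${high} high-severity SEO issues found - failing the build.`);
  process.exit(1);
}
```
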
#### Redirect Handling
- Same-domain redirects are followed and analyzed. After redirects within the same domain, links are resolved against the final URL to ensure correct internal/external classification (see the sketch below).
- Cross-domain redirects are not analyzed or crawled. The redirect chain is recorded for the original URL, but the destination page's content and links are not fetched or followed.
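
The key step is resolving extracted hrefs against the post-redirect URL; a tiny sketch with the WHATWG URL API (not the crawler's actual code) shows why this matters.

```ts
// Resolve extracted hrefs against the final (post-redirect) URL before classifying them.
function classifyLink(href: string, finalUrl: string, siteHost: string) {
  const resolved = new URL(href, finalUrl); // relative hrefs resolve against the final URL
  return {
    url: resolved.toString(),
    isInternal: resolved.hostname === siteHost,
  };
}

// A page requested at https://example.com/docs that redirected to https://example.com/docs/
// resolves "./setup" to https://example.com/docs/setup, not https://example.com/setup.
console.log(classifyLink("./setup", "https://example.com/docs/", "example.com"));
```
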
#### Large Site Performance (10k+ pages)
- For large datasets, reports now use chunked JSONP files and a small client runtime to progressively render big tables (see the sketch after this list).
- Tables support pagination, sorting, filtering, and a page-size selector (25/50/100/250/500).
- Data files are written to `seo-report/data/…/*.js` and work offline via `file://` (no fetch).
- Sorting or filtering may trigger background loading of remaining chunks for accuracy.
- Small sites still render inline immediately; large sites render almost instantly and stream in data.
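
The exact chunk format is internal to the report, but the JSONP pattern itself is simple: each data file is a `.js` script that calls a global callback, so it can be loaded with a `<script>` tag even from `file://`. A hedged TypeScript sketch of that pattern (the file name and callback name are made up):

```ts
// Illustration of the JSONP-style pattern only - not the report's actual chunk format.
// A chunk file such as data/pages-chunk-1.js might contain something like:
//   window.__seoReportChunk({ chunk: 1, rows: [/* row objects */] });

type ChunkPayload = { chunk: number; rows: unknown[] };

declare global {
  interface Window {
    __seoReportChunk?: (payload: ChunkPayload) => void;
  }
}

// Loading chunks one at a time keeps this sketch simple; the real client runtime is more involved.
function loadChunk(src: string): Promise<ChunkPayload> {
  return new Promise((resolve, reject) => {
    window.__seoReportChunk = (payload) => resolve(payload); // called by the chunk script itself
    const script = document.createElement("script");
    script.src = src;                                        // plain <script>, so file:// works (no fetch)
    script.onerror = () => reject(new Error(`Failed to load ${src}`));
    document.head.appendChild(script);
  });
}

// Usage: loadChunk("data/pages-chunk-1.js").then((chunk) => console.log(chunk.rows.length));

export {};
```
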
### CSV Export

When using `--export-csv`, the tool generates Excel-compatible CSV files in the `csv/` subdirectory:

- all-pages.csv: Complete page data with all metrics
- links.csv: All links from all pages (internal/external, with anchor text)
- images.csv: All images from all pages (with alt text status)
- headings.csv: All headings from all pages (with levels)
- issues.csv: All issues by page with severity levels

CSV files are RFC 4180 compliant and can be opened in Excel, Google Sheets, or any spreadsheet application.
#### CSV Columns
- all-pages.csv: url, status, title, titleLength, metaDescription, descriptionLength, h1Count, wordCount, internalLinks, externalLinks, images, imagesWithoutAlt, responseTime, redirects, canonicalUrl, robotsDirectives, issuesCount, issues
- links.csv: pageUrl, linkUrl, anchorText, rel, isInternal, isNofollow, status
- images.csv: pageUrl, imageSrc, altText, hasAlt, fileSize
- headings.csv: pageUrl, level, text
- issues.csv: pageUrl, issue, severity

## Architecture
The project is organized into modular components:
```
src/
├── cli.ts          # CLI entry point with Commander
├── crawler.ts      # Website crawling with performance tracking
├── parser.ts       # Comprehensive HTML metadata extraction
├── analyzer.ts     # Advanced SEO issue detection and categorization
├── reporter.ts     # HTML report generation with Handlebars
├── exporter.ts     # CSV export functionality (NEW)
├── types.ts        # TypeScript type definitions
└── utils/
    └── urlUtils.ts # URL normalization and filtering

templates/
├── summary.hbs     # Interactive tabbed summary with sortable tables (NEW)
└── page.hbs        # Enhanced page detail template with all metrics (NEW)
```

### Design Principles
1. Separation of Concerns: Each module has a single, well-defined responsibility
2. Memory Efficiency: Pages are parsed immediately after fetching; only metadata is stored
3. Error Resilience: Network and parsing errors don't stop the entire crawl
4. Extensibility: Modular design allows easy addition of features like JS rendering or new output formats
## Technology Stack
- TypeScript: Type-safe development
- Axios: HTTP client for page fetching
- htmlparser2 + css-select: Fast DOM-lite HTML parsing (low memory, high throughput); see the sketch after this list
- Commander: CLI framework
- Handlebars: HTML templating
- p-limit: Concurrency control
- Chalk & Ora: Beautiful CLI output
## SEO Best Practices
This tool is based on SEO best practices from:
- Google's Search Central documentation
- Industry-standard character limits for titles (60 chars) and descriptions (160 chars)
- Common SEO audit methodologies used by tools like Screaming Frog, Ahrefs, and SEMrush
### Key Recommendations
- Unique Titles & Descriptions: Every page should have unique, descriptive metadata
- Optimal Length: Titles should be 20-60 characters, descriptions 50-160 characters
- Canonical Tags: Use self-referential canonicals to avoid duplicate content issues
- Robots Directives: Avoid conflicting directives; verify noindex pages are intentional
- Structured Data: Ensure JSON-LD is valid JSON and properly formatted
- hreflang: For multilingual sites, implement reciprocal hreflang tags
## Future Enhancements
Possible future enhancements:
- 🔄 JavaScript Rendering: Support for SPAs using Puppeteer/Playwright
- 🤖 robots.txt Compliance: Automatic robots.txt parsing and adherence
- 🔗 Advanced Link Checking: Actually validate external links (not just detect 404s)
- 📊 Progress Tracking: Real-time crawl progress with ETA
- 🎨 Custom Report Themes: User-configurable report styling
- 🔌 Plugin System: Allow custom analyzers and reporters
- ☁️ Cloud Integration: Deploy as a web service or integrate with CI/CD pipelines
- 📈 Historical Tracking: Compare crawls over time to track improvements
- 📋 Advanced Schema Validation: Validate JSON-LD against schema.org types
- 📱 Mobile vs Desktop: Compare mobile and desktop rendering
## Comparison to Screaming Frog
This tool now implements 85-90% parity with Screaming Frog's core SEO analysis features (excluding external API dependencies):
| Feature | This Tool | Screaming Frog |
|---------|-----------|----------------|
| **Core Analysis** | | |
| Page crawling | ✅ | ✅ |
| Title/Description analysis | ✅ (+ pixel width) | ✅ |
| Heading extraction (H1-H6) | ✅ (+ duplicates) | ✅ |
| Image alt text analysis | ✅ | ✅ |
| Internal/External links | ✅ | ✅ |
| Response times & redirects | ✅ | ✅ |
| Content metrics | ✅ (+ readability) | ✅ |
| Canonical URL analysis | ✅ (detailed) | ✅ |
| Robots directives | ✅ | ✅ |
| hreflang validation | ✅ (partial) | ✅ |
| **Advanced Analysis** | | |
| Security analysis | ✅ (HTTPS, headers, mixed content) | ✅ |
| URL quality checks | ✅ (15+ checks) | ✅ |
| Duplicate content detection | ✅ (exact + near) | ✅ |
| Orphan page detection | ✅ | ✅ |
| Weak anchor text | ✅ | ✅ |
| HTML validation | ✅ (structure, DOM depth) | ✅ |
| Pagination analysis | ✅ (partial) | ✅ |
| Soft 404 detection | ✅ | ✅ |
| Lorem ipsum detection | ✅ | ✅ |
| **Export & Reporting** | | |
| CSV export | ✅ (5+ files) | ✅ |
| Interactive HTML reports | ✅ | ❌ (static) |
| Severity-based filtering | ✅ | ✅ |
| **Additional Features** | | |
| Free & open source | ✅ | ❌ (freemium) |
| Command-line interface | ✅ | ✅ (paid) |
| Readability metrics | ✅ (3 formulas) | ❌ |
| Content-to-code ratio | ✅ | ✅ |
| JavaScript rendering | ❌ | ✅ |
| robots.txt validation | ✅ | ✅ |
| Sitemap analysis | ✅ | ✅ |
| PageSpeed/Lighthouse | ❌ | ✅ (paid) |
| Google Search Console | ❌ | ✅ (paid) |
| Google Analytics | ❌ | ✅ (paid) |
| External link checking | ❌ | ✅ |
Summary: This tool implements 220+ SEO checks covering on-page SEO, content quality, security, URL quality, link analysis, schema validation, robots.txt compliance, and sitemap analysis. It excels at static HTML analysis but doesn't include JavaScript rendering or external API integrations (PageSpeed, GSC, GA). See docs/SCREAMING_FROG_PARITY.md for details.

### JavaScript Rendering Limitation
⚠️ Important: This crawler analyzes static HTML only (like Screaming Frog's default mode). It does not execute JavaScript.
Impact on External Scripts Detection:
- ✅ Detects scripts in the initial HTML (