# Spacecat Shared - HTML Analyzer

Analyze HTML content visibility for AI crawlers and citations. Compare what humans see on websites versus what AI models (ChatGPT, Perplexity, etc.) can read when crawling pages for citations.

## Installation

```bash
npm install @adobe/spacecat-shared-html-analyzer
```

## Usage

```javascript
import {
  analyzeTextComparison,
  calculateStats,
  calculateBothScenarioStats,
} from '@adobe/spacecat-shared-html-analyzer';

// Compare initial HTML (what crawlers see) vs rendered HTML (what users see)
const originalHtml = '<h1>Title</h1>';
const currentHtml = '<h1>Title</h1><p>Dynamic content loaded by JS</p>';

// Full text analysis (original Chrome extension logic)
const analysis = await analyzeTextComparison(originalHtml, currentHtml);
console.log(analysis.textRetention); // e.g. 0.5 (50% text retention)
console.log(analysis.wordDiff); // Detailed word differences

// Basic comparison statistics
const stats = await calculateStats(originalHtml, currentHtml);
console.log(stats.citationReadability); // e.g. 50 (50% of content visible to AI)
console.log(stats.contentIncreaseRatio); // e.g. 2.3 (2.3x more content in rendered)

// Both scenarios (with/without nav filtering)
const bothStats = await calculateBothScenarioStats(originalHtml, currentHtml);
console.log(bothStats.withNavFooterIgnored.contentGain); // e.g. "2.3x"
console.log(bothStats.withoutNavFooterIgnored.missingWords); // Number of missing words
```
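To give a feel for what statistics like `citationReadability` and `contentIncreaseRatio` express, here is a minimal, self-contained sketch. It does not reproduce the package's actual algorithm: the helper names `words` and `sketchStats`, and the definitions of both metrics, are assumptions made for illustration only.

```javascript
// Hypothetical helper: split plain text into lowercase words.
function words(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

// Assumed definitions, for illustration only:
//   readability   = percentage of rendered words that also occur in the initial text
//   increaseRatio = rendered word count divided by initial word count
function sketchStats(initialText, renderedText) {
  const initial = words(initialText);
  const rendered = words(renderedText);
  const initialSet = new Set(initial);
  const visible = rendered.filter((w) => initialSet.has(w)).length;
  return {
    readability: rendered.length ? Math.round((visible / rendered.length) * 100) : 100,
    increaseRatio: initial.length ? rendered.length / initial.length : 0,
  };
}

const result = sketchStats('Title', 'Title Dynamic content loaded by JS');
console.log(result.readability); // 17 (1 of 6 rendered words is in the initial text)
console.log(result.increaseRatio); // 6
```

The values returned by the real functions may be computed differently, so treat the numbers above as a property of this sketch rather than of the package.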

## Environment Support

This package works in both Node.js and browser environments (including Chrome extensions):

- Node.js: Uses Cheerio for robust HTML parsing
- Browser/Chrome Extensions: Uses native DOMParser with automatic fallback
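The environment split described above could be detected roughly as follows. This is a sketch under assumptions, not the package's internal code: `pickParser` and its return values are hypothetical names, and the check simply looks for a `DOMParser` constructor on the global object.

```javascript
// Hypothetical helper: decide which parsing strategy fits the current runtime.
// Browsers and Chrome extension pages expose a global DOMParser constructor;
// plain Node.js does not, so a library such as Cheerio would be used there.
function pickParser(globalObj = globalThis) {
  return typeof globalObj.DOMParser === 'function' ? 'domparser' : 'cheerio';
}

console.log(pickParser());
```

Passing the global object as a parameter keeps the check easy to exercise in tests, for example `pickParser({ DOMParser: class {} })`.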

## API Reference


#### analyzeTextComparison(initHtml, finHtml, ignoreNavFooter)

Comprehensive text analysis between two HTML versions (original Chrome extension logic).

Parameters:

- `initHtml` (string): HTML as seen by crawlers/AI
- `finHtml` (string): HTML as seen by users (fully loaded)
- `ignoreNavFooter` (boolean, default: true): Remove nav/footer elements

Returns: Promise
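To illustrate the effect of the `ignoreNavFooter` flag, the sketch below extracts visible text with and without `<nav>` and `<footer>` blocks. It is a deliberately simplified stand-in: `visibleText` is a hypothetical helper, and it uses regular expressions where the package is documented to use a real parser (Cheerio or DOMParser), so it will mishandle nested or malformed markup.

```javascript
// Hypothetical helper, for illustration only: return the text of an HTML string,
// optionally dropping <nav> and <footer> blocks first.
function visibleText(html, ignoreNavFooter = true) {
  let source = html;
  if (ignoreNavFooter) {
    // Remove each <nav>...</nav> and <footer>...</footer> block (non-nested only).
    source = source.replace(/<(nav|footer)\b[^>]*>[\s\S]*?<\/\1>/gi, ' ');
  }
  // Strip the remaining tags and collapse whitespace.
  return source.replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' ').trim();
}

const page = '<nav>Home About</nav><h1>Title</h1><footer>Contact</footer>';
console.log(visibleText(page)); // "Title"
console.log(visibleText(page, false)); // "Home About Title Contact"
```

With the flag enabled, navigation and footer boilerplate that is shared across pages does not count toward the comparison, which is presumably why it defaults to `true`.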
