This JavaScript library simplifies the extraction of HTML Meta and OpenGraph tags from HTML content or URLs.





@jcottam/html-metadata is a lightweight, TypeScript-first JavaScript library for extracting HTML meta tags, Open Graph tags, and other metadata from HTML content or URLs. Perfect for social media sharing, SEO analysis, and web scraping applications.

Compatibility: works with Node.js (CommonJS) and modern browsers (ES6+).

- 🚀 Fast & Lightweight - Built on Cheerio for optimal performance
- 📱 Open Graph Support - Extract all Open Graph meta tags for social media
- 🎯 TypeScript Ready - Full type definitions and IntelliSense support
- 🌐 URL & HTML Support - Extract from URLs or HTML strings directly
- 🔧 Configurable - Customizable extraction with filtering and timeout options
- 🛡️ Error Resilient - Graceful handling of malformed HTML and network errors
- 📦 Minimal Dependencies - Depends only on Cheerio for HTML parsing

Install from npm:

```sh
npm install @jcottam/html-metadata
```

Import with ES modules or TypeScript:

```typescript
import { extractFromUrl, extractFromHTML } from "@jcottam/html-metadata"
```

Or with CommonJS:

```javascript
const { extractFromUrl, extractFromHTML } = require("@jcottam/html-metadata")
```

Extract metadata from a URL:

```typescript
import { extractFromUrl } from "@jcottam/html-metadata"

// Basic usage
const metadata = await extractFromUrl("https://www.retool.com")
console.log(metadata)
// Output: { lang: "en", title: "Retool", "og:title": "...", "og:description": "...", ... }

// With options
const options = {
  timeout: 5000, // 5 second timeout
  metaTags: ["og:title", "og:description", "og:image"], // Only extract specific tags
}
const filteredMetadata = await extractFromUrl("https://example.com", options)
```
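
A common next step is turning the extracted object into a link preview. The sketch below assumes the result is a flat object keyed by tag name, as in the output shown above. `toLinkPreview`, `LinkPreview`, and the local `Metadata` type are hypothetical names for illustration, not part of the library, and the fallback order (Open Graph first, then plain tags) is one reasonable choice rather than a rule.

```typescript
// Local stand-in for the library's result type, so the sketch runs on its own.
type Metadata = { [key: string]: string | undefined }

// Hypothetical preview shape; not exported by @jcottam/html-metadata.
type LinkPreview = { title: string; description: string; image: string | undefined }

// Prefer Open Graph values and fall back to plain tags when they are absent.
// Assumes a <meta name="description"> tag is stored under the "description" key.
function toLinkPreview(metadata: Metadata | null, url: string): LinkPreview {
  if (metadata === null) {
    // Extraction failed: show the bare URL rather than nothing.
    return { title: url, description: "", image: undefined }
  }
  return {
    title: metadata["og:title"] ?? metadata["title"] ?? url,
    description: metadata["og:description"] ?? metadata["description"] ?? "",
    image: metadata["og:image"],
  }
}

const preview = toLinkPreview(
  { title: "Retool", "og:title": "Retool | Internal tools" },
  "https://www.retool.com"
)
console.log(preview.title) // "Retool | Internal tools"
```

Passing `null` through the same function keeps the calling code free of special cases when a fetch fails.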
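
Extract metadata from an HTML string:

```typescript
import { extractFromHTML } from "@jcottam/html-metadata"

const html = `
<html lang="en">
  <head>
    <title>My Website</title>
    <meta property="og:title" content="My Amazing Website" />
    <meta property="og:description" content="This is a brief description" />
    <meta property="og:image" content="https://example.com/image.jpg" />
    <link rel="icon" href="/favicon.ico" />
  </head>
</html>`

const metadata = extractFromHTML(html)
console.log(metadata)
// Output: {
//   lang: "en",
//   title: "My Website",
//   "og:title": "My Amazing Website",
//   "og:description": "This is a brief description",
//   "og:image": "https://example.com/image.jpg",
//   favicon: "/favicon.ico"
// }
```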
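
The `favicon` value in the output above is a relative path. As a rough illustration of what resolving such a link against a base URL involves, the sketch below uses the standard `URL` constructor. `resolveLink` is a hypothetical helper, and this is not necessarily how the library implements its `baseUrl` option.

```typescript
// Hypothetical helper: resolve a possibly relative link against a base URL.
// Returns the input unchanged when the pair cannot be parsed as a URL.
function resolveLink(link: string, baseUrl: string): string {
  try {
    return new URL(link, baseUrl).href
  } catch {
    return link
  }
}

console.log(resolveLink("/favicon.ico", "https://example.com"))
// "https://example.com/favicon.ico"
console.log(resolveLink("https://cdn.example.com/icon.png", "https://example.com"))
// "https://cdn.example.com/icon.png"
```

Links that are already absolute pass through untouched, so the same call can be applied to every extracted URL.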
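
Pass `baseUrl` to resolve relative links, such as the favicon, to absolute URLs:

```typescript
import { extractFromHTML } from "@jcottam/html-metadata"

const html = '<link rel="icon" href="/favicon.ico" />'
const options = { baseUrl: "https://example.com" }
const metadata = extractFromHTML(html, options)
console.log(metadata.favicon) // "https://example.com/favicon.ico"
```

## API Reference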

### Functions

#### `extractFromHTML(html: string, options?: Options): ExtractedData`

Extracts metadata from an HTML string.

Parameters:

- `html` (string): The HTML content to parse
- `options` (Options, optional): Configuration options

Returns: `ExtractedData`, an object containing the extracted metadata.

#### `extractFromUrl(url: string, options?: Options): Promise<ExtractedData | null>`

Extracts metadata from a URL by fetching the HTML content.

Parameters:

- `url` (string): The URL to fetch and extract metadata from
- `options` (Options, optional): Configuration options

Returns: `Promise<ExtractedData | null>`, a promise that resolves to the extracted metadata, or to `null` if extraction fails.
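
Because `extractFromUrl` resolves to `null` on failure, callers often want one place that deals with that case. The sketch below is one way to do it. `orFallback` and the local `Metadata` type are hypothetical, and the `fetchMetadata` parameter stands in for a call such as `() => extractFromUrl(url)` so the example runs without network access.

```typescript
// Local stand-in for the library's result type, so the sketch runs on its own.
type Metadata = { [key: string]: string | undefined }

// Hypothetical helper: await an extraction and substitute a fallback on null.
async function orFallback(
  fetchMetadata: () => Promise<Metadata | null>,
  fallback: Metadata
): Promise<Metadata> {
  const result = await fetchMetadata()
  return result ?? fallback
}

// Stub that imitates a failed extraction.
const failing = async (): Promise<Metadata | null> => null

orFallback(failing, { title: "Untitled page" }).then((metadata) => {
  console.log(metadata["title"]) // "Untitled page"
})
```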

### Types

#### `Options`

```typescript
type Options = {
  /** Base URL for resolving relative links (e.g., favicon, apple-touch-icon) */
  baseUrl?: string
  /** Fetch timeout in milliseconds for URL extraction */
  timeout?: number
  /** Specific meta tags to extract. If not provided, all meta tags will be extracted */
  metaTags?: string[]
}
```

#### `ExtractedData`

```typescript
type ExtractedData = {
  /** Language attribute from the HTML tag */
  lang?: string
  /** Page title from the title tag */
  title?: string
  /** Favicon URL */
  favicon?: string
  /** Apple touch icon URL */
  "apple-touch-icon"?: string
  /** Open Graph and other meta tag properties */
  [key: string]: string | undefined
}
```
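
Since every meta tag lands in the same flat object, it can help to pull one family of keys out by prefix. A minimal sketch, assuming keys are named exactly as in the markup (`og:title`, `twitter:card`, and so on). `pickByPrefix` is a hypothetical helper, and the local `ExtractedData` type is a simplified copy of the definition above.

```typescript
// Simplified local copy of the documented type, so the sketch runs on its own.
type ExtractedData = { [key: string]: string | undefined }

// Hypothetical helper: collect every defined entry whose key starts with
// the given prefix, returning the entries with the prefix removed.
function pickByPrefix(data: ExtractedData, prefix: string): Record<string, string> {
  const picked: Record<string, string> = {}
  for (const [key, value] of Object.entries(data)) {
    if (key.startsWith(prefix) && value !== undefined) {
      picked[key.slice(prefix.length)] = value
    }
  }
  return picked
}

const data: ExtractedData = {
  title: "My Website",
  "og:title": "My Amazing Website",
  "og:image": "https://example.com/image.jpg",
}
console.log(pickByPrefix(data, "og:"))
// { title: "My Amazing Website", image: "https://example.com/image.jpg" }
```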

### Example Output

```json
{
  "lang": "en",
  "title": "Retool | The fastest way to build internal software.",
  "og:type": "website",
  "og:url": "https://retool.com/",
  "og:title": "Retool | The fastest way to build internal software.",
  "og:description": "Retool is the fastest way to build internal software. Use Retool's building blocks to build apps and workflow automations that connect to your databases and APIs, instantly.",
  "og:image": "https://d3399nw8s4ngfo.cloudfront.net/og-image-default.webp",
  "favicon": "/favicon.png",
  "apple-touch-icon": "/apple-touch-icon.png"
}
```

## Browser Usage & CORS

When using `extractFromUrl` in browsers, you may encounter CORS restrictions. To bypass CORS:

1. Server-side usage: Run `extractFromUrl` on a server
2. Proxy services: Use a CORS proxy like AllOrigins
3. Browser extensions: Use CORS-disabling browser extensions for development

## Error Handling
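
Network failures resolve to `null` instead of throwing, and malformed HTML is parsed on a best-effort basis:

```typescript
import { extractFromUrl, extractFromHTML } from "@jcottam/html-metadata"

// Network errors return null
const result = await extractFromUrl("https://invalid-url.com")
if (result === null) {
  console.log("Failed to fetch or parse the URL")
}

// Malformed HTML is handled gracefully (illustrative input with unclosed tags)
const metadata = extractFromHTML(
  '<html><head><meta property="og:title" content="Test"><body><div>Unclosed div'
)
console.log(metadata["og:title"]) // "Test"
```

## Supported Meta Tags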

The library extracts the following types of metadata:

- HTML attributes: `lang` from the `<html>` tag
- Title: Content from the `<title>` tag
- Favicon: `href` from `<link rel="icon">` tags
- Apple Touch Icon: `href` from `<link rel="apple-touch-icon">` tags
- Meta tags: All `<meta>` tags with `name` or `property` attributes
- Open Graph: All `og:*` properties
- Twitter Cards: All `twitter:*` properties
- Custom meta tags: Any custom meta tags you define

## Development

### Prerequisites

- Node.js 18+
- npm

### Setup

```bash
git clone https://github.com/jcottam/html-metadata.git
cd html-metadata
npm install
```

### Scripts

```bash
npm run build # Build the library
npm test # Run tests
npm run release # Release new version (manual)
```

### Dependency Updates and Releases

This project uses automated dependency management and releases:

- Renovate Bot: Automatically updates dependencies and creates pull requests
- GitHub Actions: Automatically releases new versions when changes are pushed to main
- Manual Release: Use `npm run release` for immediate releases or specific version bumps

### Testing

The project uses Vitest for testing. Run tests with:

```bash
npm test
```

## Dependencies

- Cheerio: Fast, flexible HTML parsing
- Vitest: Next-generation testing framework
- Rollup: Module bundler for multiple formats

## Contributing

We welcome contributions! Please follow these guidelines:

1. Fork the repository and create a feature branch
2. Make changes and ensure tests pass (`npm test`)

- Follow TypeScript best practices
- Add JSDoc comments for new functions
- Ensure all tests pass
- Update README for new features
- Use conventional commit messages

MIT License - see LICENSE.md for details.