Showing 1-19 of 19 packages
Parse WARC (Web ARChive) files as a Node.js stream
Parse And Write Web Archive Records (WARC) Files
Streaming web archive (WARC) file support for modern browsers and Node.
Parse And Write Web Archive Records (WARC) Files
Archive any webpage to Markdown, metadata, media, and WARC
JavaScript module and CLI tool for working with web archive data using the WACZ format specification.
This library has been factored out of [ArchiveWeb.page](https://webrecorder/archiveweb.page) and represents the core service worker implementation necessary for high-fidelity web archiving.
Library server and an archivist browser controller.
Node.js client for the CommonCrawl Index API
A Web Component to render CDX Summary JSON files
🍨 High-fidelity, browser-based, single-page web archiving library and CLI.
HTTP related plugins for sugarcube.
A fork of @harvard-lil/scoop that is optimized for running on AWS Lambda
A parser and generator for (Internet Archive) CDX files.
Allow proper decompression of concatenated gzip files
This library has been factored out of [ArchiveWeb.page](https://webrecorder/archiveweb.page) and represents the core service worker implementation necessary for high-fidelity web archiving.
A passive web scanner
Module for reading WARC files in a streaming fashion
Read WARC file records as a pull-stream