Showing 1-20 of 25 packages
Very straightforward, event driven web crawler. Features a flexible queue interface and a basic cache mechanism with extensible backend.
TypeScript definitions for simplecrawler
MongoDB FetchQueue Implementation for Simplecrawler
MongoDB queue implementation for simplecrawler
A web crawler forked from simplecrawler.
SQLite FetchQueue Implementation for Simplecrawler
Very straightforward web crawler. Uses EventEmitter. Based on Simplecrawler but using a distributed queue system.
Finds broken links and resources on websites
A web crawler forked from simplecrawler.
Very straightforward, event driven web crawler. Features a flexible queue interface and a basic cache mechanism with extensible backend.
Implements https://github.com/simplecrawler to get all HTML links within the domain
Very straightforward, event driven web crawler. Features a flexible queue interface and a basic cache mechanism with extensible backend.
Forked version of https://github.com/ciffard/node-simplecrawler.
MongoDB queue for Node Simple Crawler
TypeScript definitions for sitemap-generator
Very straightforward, event driven web crawler. Features a flexible queue interface and a basic cache mechanism with extensible backend.
Very straightforward web crawler. Uses EventEmitter. Generates queue statistics and has a basic cache mechanism with extensible backend. This version is forked in order to add the ability to filter by referrer URL.
Node Web Crawler is a web spider written with Node.js. It gives you the full power of jQuery on the server to parse a large number of pages as they are downloaded, asynchronously. Scraping should be simple and fun!
Follows and collects breadcrumbs across the web
Creates a snapshot of a URL with HTML suffix removed in favour of directories