Configurable website scraper in TypeScript.
npm install website-scrap-engine

* Main thread
  * download resources in a queue
  * process after download
  * save binary resources to disk
  * send other resources to worker thread
  * enqueue non-duplicate resources received from worker thread
* Worker thread
  * receive downloaded resource from main thread
  * process after download
  * parse HTML, CSS, etc.
  * collect referenced resources
  * process and filter referenced resources before download
  * send referenced resources to main thread
  * save resources to disk
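
The flow above can be illustrated with a small single-threaded sketch. It is not the website-scrap-engine API: every name here (`Resource`, `fakeSite`, `collectReferences`, `crawl`) is hypothetical, the "network" is an in-memory map, and the worker thread is stood in for by a plain function call. It only shows how a download queue, reference collection, and duplicate filtering fit together.

```typescript
// Hypothetical types for illustration; not part of website-scrap-engine.
type ResourceType = "html" | "css" | "binary";

interface Resource {
  url: string;
  type: ResourceType;
  body: string;
}

// Assumption: a fixed in-memory site replaces real HTTP downloads.
const fakeSite: Record<string, Resource> = {
  "/index.html": {
    url: "/index.html",
    type: "html",
    body: '<link href="/style.css"><img src="/logo.png"><a href="/index.html">',
  },
  "/style.css": {
    url: "/style.css",
    type: "css",
    body: "body { background: url(/logo.png); }",
  },
  "/logo.png": { url: "/logo.png", type: "binary", body: "PNG" },
};

// Worker-side step: parse a text resource and collect the URLs it references.
// The regular expressions are deliberately naive stand-ins for real parsers.
function collectReferences(res: Resource): string[] {
  const pattern =
    res.type === "html" ? /(?:href|src)="([^"]+)"/g : /url\(([^)]+)\)/g;
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(res.body)) !== null) {
    found.push(m[1]);
  }
  return found;
}

// Main-thread step: drain the download queue, hand non-binary resources to
// the "worker", and enqueue only references that have not been seen before.
function crawl(start: string): { saved: string[]; downloads: number } {
  const seen = new Set<string>();
  seen.add(start);
  const queue: string[] = [start];
  const saved: string[] = [];
  let downloads = 0;

  while (queue.length > 0) {
    const url = queue.shift() as string;
    const res = fakeSite[url];
    if (!res) continue; // unknown URL: nothing to download
    downloads++;

    if (res.type !== "binary") {
      // In a real setup this would be a message to a worker thread.
      for (const ref of collectReferences(res)) {
        if (!seen.has(ref)) {
          seen.add(ref);
          queue.push(ref);
        }
      }
    }
    // Stand-in for writing to disk (main thread for binary, worker otherwise).
    saved.push(url);
  }
  return { saved, downloads };
}

const result = crawl("/index.html");
console.log(result.saved.join(", "));
```

In this sketch `/logo.png` is referenced by both the HTML page and the stylesheet, and `/index.html` links to itself, yet each resource is downloaded only once because the `seen` set filters duplicates before they reach the queue.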