A simple web scraper with a friendly API.
```
npm install tiny-scraper
```
## API
### createRouter()
Returns a router that parses the pages you specify.
```javascript
const { createRouter } = require('tiny-scraper');
const router = createRouter();
```
### router.match(baseUri)
Matches a site's base URI and returns a function that registers route handlers for URLs within that site. See the path-to-regexp documentation for the route expression format.
#### Parameters
* `baseUri`: the base URI of the site to match.

```javascript
const matchGithub = router.match('https://github.com');

matchGithub(
  '/zhangmq/tiny-scraper', // route expression
  function* (req, res, params, query) {
    // storage is just for demo; you can implement it yourself.
    yield storage(res.data);
    return [/* parsed urls */];
  }
);
```
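The handler above returns the URLs parsed from the page. As a minimal sketch of how that list might be built, the hypothetical helper `extractLinks` below (not part of tiny-scraper) collects same-site links from an HTML string with a regular expression. It assumes `res.data` is an HTML string and that the scraper accepts absolute URLs; neither is confirmed by the text above.

```javascript
// Hypothetical helper, not part of tiny-scraper:
// collect unique same-site links from an HTML string.
function extractLinks(html, baseUri) {
  const origin = new URL(baseUri).origin;
  const links = new Set();
  const hrefPattern = /href\s*=\s*["']([^"']+)["']/gi;
  let match;
  while ((match = hrefPattern.exec(html)) !== null) {
    let url;
    try {
      // Resolve relative hrefs against the site's base URI.
      url = new URL(match[1], baseUri);
    } catch (err) {
      continue; // skip malformed hrefs
    }
    url.hash = ''; // drop fragments so the same page is not queued twice
    if (url.origin === origin) {
      links.add(url.href);
    }
  }
  return [...links];
}
```

Inside a handler, this could be used as `return extractLinks(res.data, 'https://github.com');`. A regular expression is enough for a demo, but a real HTML parser is more robust on messy markup.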
### createScraper(options)
Creates a scraper.
#### Parameters
* `options`: an object containing the configuration fields.
  * `maxRequest`: maximum number of requests running in parallel.
  * `requestDuration`: minimum duration of each request. If a request completes earlier, the scraper waits until the specified duration has elapsed.
  * `router`: the router you implemented.
  * `downloader`: function used to request a page, with the signature `config => responsePromise`. Example: `axios.request`.
```javascript
const axios = require('axios');
const { createScraper } = require('tiny-scraper');

const scraper = createScraper({
  maxRequest: 1,
  requestDuration: 2000,
  router,
  downloader: axios.request
});

scraper.tasks$([/* seed tasks */]);
```
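To illustrate the `requestDuration` behaviour described above, here is a minimal sketch of one way a minimum request duration can be enforced. The helper `withMinDuration` is hypothetical and is not taken from tiny-scraper's source; it assumes the duration is given in milliseconds and that the request is started by a function returning a promise.

```javascript
// Hypothetical helper, not tiny-scraper's implementation:
// resolve with the task's result, but never sooner than minMs milliseconds.
function withMinDuration(task, minMs) {
  const delay = new Promise((resolve) => setTimeout(resolve, minMs));
  // Promise.all waits for both the task and the timer;
  // if the task rejects, the rejection propagates immediately.
  return Promise.all([task(), delay]).then(([result]) => result);
}
```

For example, `withMinDuration(() => axios.request(config), 2000)` would resolve with the response no earlier than two seconds after the call, which keeps a fast site from being hit too often.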