SQLite FetchQueue Implementation for Simplecrawler
npm install sqlite-simplecrawler-queue
npm install git+https://github.com/LeMoussel/SQLite-simplecrawler-queue#master
`
Install from npm
`
npm install --save SQLite-simplecrawler-queue
`
Usage
All you need is the database information such as database file
`js
try {
const sqliteDatabaseName = 'crawlsite.sqlite3'
// Drop Database if exist
SQLiteFetchQueue.dropDatabase(sqliteDatabaseName)
// Connect to a disk file database, you pass the path to the database file.
const crawlerQueue = new SQLiteFetchQueue(sqliteDatabaseName)
// Initialization of the database
crawlerQueue.init()
// Initializing simplecrawler
const crawler = new Crawler('http://example.com')
crawler.maxDepth = 3
crawler.allowInitialDomainChange = false
crawler.filterByDomain = true
crawler.queue = crawlerQueue
crawler.start()
} catch (err) {
console.error(err)
}
`
Test
npm test. Check test folder for extra usages.
Additional utilities
- Drop the queue using dropQueue method.
`js
// Connect to a disk file database, you pass the path to the database file.
const crawlerQueue = new SQLiteFetchQueue('sqliteDatabaseName', 'queue')
// Drop 'queue' table
crawlerQueue.dropQueue
// Initialization of the database
crawlerQueue.init()
`
- Drop the database using SQLiteFetchQueue.dropDatabase static method.
`js
// Drop Database if exist
SQLiteFetchQueue.dropDatabase('sqliteDatabaseName')
// Connect to a disk file database, you pass the path to the database file.
const crawlerQueue = new SQLiteFetchQueue('sqliteDatabaseName', 'queue')
// Initialization of the database
crawlerQueue.init()
`
- Export the flexible queue system to disk in a JSON file.
`js
// Flexible queue system which can be frozen to disk
crawlerQueue.freeze('./test/www.test.com.sqlite3.json', (err, result) => {
if (err) {
console.error(err)
}
console.log(Number of rows saved to JSON File: ${result})
})
`
- Import from a frozen JSON file on disk.
`js
// Flexible queue system which can be defrosted from disk
crawlerQueue.defrost('./test/www.test.com.sqlite3.json', (err, result) => {
if (err) {
console.error(err)
process.exit(1)
}
console.log(Number of rows inserted: ${result})
})
``