bitfield database with rank+select over persistent storage
npm install bitfield-dbbitfield database with rank+select over persistent storage
based on [roaring+run][] and [fenwick trees][] with 0 or 1 based numbering as in
the [louds][] paper
You might want to use a database like this when you have data like an inbox
and you'd like to sort messages into different categories like "read", "unread",
"spam", and "archive" and also where you might want to calculate totals for each
category (rank) and skip to the next or previous message in each category
(implemented internally using select+rank).
Pure javascript implementation and works using the [random-access][] suite of
adaptors. To use this module in the browser, you can use [random-access-web][].
[roaring+run]: https://arxiv.org/pdf/1603.06549.pdf
[fenwick trees]: https://en.wikipedia.org/wiki/Fenwick_tree
[louds]: https://www.cs.le.ac.uk/people/ond1/XMLcomp/confersWEA06_LOUDS.pdf
[random-access]: https://github.com/random-access-storage/random-access-storage
[random-access-web]: https://github.com/random-access-storage/random-access-web
`` js
var Bitfield = require('bitfield-db')
var raf = require('random-access-file')
var bf = new Bitfield(raf('/tmp/bitfield.db'))
;[5,10,15,20,23,24,26,27].forEach(x => bf.add(x))
bf.flush(function (err) {
if (err) return console.error(err)
bf.rank(24, function (err, res) {
if (err) console.error(err)
else console.log(rank(24) = ${res})select(5) = ${res}
})
bf.select(5, function (err, res) {
if (err) console.error(err)
else console.log()predecessor(24) = ${res}
})
bf.predecessor(24, function (err, res) {
if (err) console.error(err)
else console.log()successor(24) = ${res}
})
bf.successor(24, function (err, res) {
if (err) console.error(err)
else console.log()`
})
})
output:
``
rank(24) = 5
select(5) = 24
successor(24) = 26
predecessor(24) = 23
This library supports both one-based and zero-based numbering schemes.
All the methods with a trailing 1 in their name use one-based numbering.
All the methods with a trailing 0 in their name use zero-based numbering.
One-based numbering is the default. Under this scheme, rank1(x) counts the< x
number of elements contained in the set . When an element is in the set, it1
is a "one" because the entry for its bitfield (or array or run) is "set" as a.
The methods rank(), select(), predecessor(), and successor() are aliasesrank1()
for the corresponding methods: , select1(), predecessor1(), andsuccessor1().
Under zero-based numbering, rank0(x) counts the number of elements< x
(non-negative integers) NOT contined in the set which are . When an element0
is not contained in the set, it is a "zero" because the entry for its bitfield
(or array or run) is "unset" as a .
``
var Bitfield = require('bitfield-db')
var MultiBitfield = require('bitfield-db/multi')
Create a new bitfield-db instance bf from a [random-access][] store instance
and optionally:
opts.bits - maximum size of numbers in the set (2*bits). default: 32opts.prefix
* - string prefix to prepend to all keys stored in the databaseopts.db
* - [tinybox][]-compatible instance to store data. Otherwise a
database is created automatically.
Due to limitation of javascript's builtin number type, setting bits more than
50 may cause errors. Operations will be faster and take less storage with a
smaller bit size.
If you use opts.prefix, make sure to properly delimit your records by passing
a trailing delimiter and disallowing the delimiter user input.
[tinybox]: https://github.com/hyperdivision/tinybox
Add x to the set in the current batch.
Only positive integers including zero up to the precision set as opts.bits arex
allowed for .
You will need to bf.flush(cb) the database before queries will reflect this
action.
Delete x from the set in the current batch.
Only positive integers including zero up to the precision set as opts.bits arex
allowed for .
You will need to bf.flush(cb) the database before queries will reflect this
action.
Commit the current batch to the database. cb(err) is called when the operation
completes.
When opts.sync === false, only queue the underlying database writes, do not
flush the underlying database. Use this option if you need fine-grained control
over flushing, for example if you want to store multiple bitfields or other
pieces of data on the same database.
Test whether x is in the set. cb(err, exists) is called with a booleanexists or an error err.
Get the number of elements i in the set that are less than x ascb(err, i).
This corresponds to searching for the number of 1s < x in the bitfield, sorank1
the alias is called .
aliases:
* bf.rank1(x, cb)
Get the number of non-negative integer elements i NOT in the set that are lessx
than as cb(err, i).
This corresponds to searching for the number of 0s < x in the bitfield, sorank0
this method is called .
Find a value of x where rank(x) = i as cb(err, x).
If there is no element x in the set where rank(x) = i, you will receive the-1
value .
aliases:
* bf.select1(i, cb)
Find a value of x where rank0(x) = i as cb(err, x).
If there is no element x in the set where rank0(x) = i, you will receive the-1
value . Note that zeros extend unbounded past the largest value in the set
(the last 1 in the bitfield).
Return the greatest element y in the set where y < x as cb(err, y).
aliases:
* bf.predecessor1(x, cb)
* bf.pred(x, cb)
* bf.pred1(x, cb)
* bf.prev(x, cb)
* bf.prev1(x, cb)
Return the greatest non-negative integer element y NOT in the sety < x
where as cb(err, y).
aliases:
* bf.pred0(x, cb)
* bf.prev0(x, cb)
Return the smallest element y in the set where y > x as cb(err, y).
aliases:
* bf.successor1(x, cb)
* bf.succ(x, cb)
* bf.succ1(x, cb)
* bf.next(x, cb)
* bf.next1(x, cb)
Return the smallest non-negative integer element y NOT in the sety > x
where as cb(err, y).
aliases:
* bf.succ0(x, cb)
* bf.next0(x, cb)
Create a new multi instance to open multiple bitfields on the same databaseflush()
under different prefixes with bound flushes. This way you can move membership
between different sets on the same underlying database .
Open a bitfield database stored under prefix.
You should take care to delimit your prefixes to prevent clashes, for example by
using a trailing delimiter such as ! and disallowing ! from the prefix area
if it comes from user input.
Close a bitfield under prefix`. Flushes will no longer include writes from this
database.
Flush all of the open databases in a single underlying database flush.
license zero parity
and apache 2.0
(contributions)