A fast layout algorithm for beeswarm plots
`npm install accurate-beeswarm-plot`

This module calculates a two-dimensional beeswarm arrangement
of a one-dimensional dataset. For example, suppose you have this
collection of data points:
!A one-dimensional scatter plot
The module calculates y-positions avoiding overlap so that the dataset can
be viewed more clearly:
The main design goals of the module are to:
- represent values precisely (unlike some force-directed beeswarm layouts)
- pack the points together tightly without overlap
- run quickly
The module does not plot the points; you can use any plotting library to do this.
First, run `npm install accurate-beeswarm-plot` and import the module:
```javascript
import { AccurateBeeswarm } from 'accurate-beeswarm-plot';
```
The latest release has a non-module version of the script,
`accurate-beeswarm-plot.nomodule.js`, that does not require the `import`
statement.
It is assumed that the data (x) axis of the plot will be horizontal. For
a vertical beeswarm, simply swap x and y axes when plotting.
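As a rough illustration of the axis swap, the sketch below turns layout points into SVG `<circle>` markup. The names `toSvgCircles`, `vertical`, and `offset` are hypothetical and not part of the module; the sketch only assumes that each point has numeric `x` and `y` fields, like the objects returned by `calculateYPositions()`.

```javascript
// Hypothetical helper (not part of accurate-beeswarm-plot):
// turns layout points into SVG <circle> markup.
// `offset` shifts the swarm away from the edge so that negative
// layout positions remain visible.
function toSvgCircles(points, radius, { vertical = false, offset = 0 } = {}) {
  return points
    .map(({ x, y }) => {
      // Horizontal: data value on the x axis, layout position on the y axis.
      // Vertical: the two are swapped.
      const [cx, cy] = vertical ? [y + offset, x] : [x, y + offset];
      return `<circle cx="${cx}" cy="${cy}" r="${radius}"/>`;
    })
    .join("\n");
}

const points = [{ x: 2, y: 0 }];
console.log(toSvgCircles(points, 5, { offset: 50 }));
console.log(toSvgCircles(points, 5, { vertical: true, offset: 50 }));
```

In a real plot the x values would usually pass through a scale first; that step is left out here to keep the swap itself visible.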
To calculate a beeswarm arrangement, you need an array of items `data`
and a function `fn` that takes an element of `data` and returns its x position.
Construct an `AccurateBeeswarm` object using `new AccurateBeeswarm(data, radius, fn)`.
The `calculateYPositions()` method returns an array of objects; each object contains
the fields `datum` (an element of `data`), `x` (the x position given by `fn(datum)`), and `y`
(the computed y position).
```javascript
let data = [{value: 2, name: "A"}, {value: 3, name: "B"}];
let radius = 5;
let fn = d => d.value;
let result = new AccurateBeeswarm(data, radius, fn)
    .calculateYPositions();
```
This gives a result of
```javascript
[
  {"datum":{"value":2,"name":"A"},"x":2,"y":0},
  {"datum":{"value":3,"name":"B"},"x":3,"y":9.9498743710662}
]
```
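The y value for B follows from requiring the two circles to just touch: their centres must be 2 × radius = 10 apart, and their x values differ by 1, so y = √(10² − 1²) = √99 ≈ 9.95. A small check along these lines can confirm that a layout has no overlaps; `minDistance` is a hypothetical helper written for this example, not part of the module.

```javascript
// Hypothetical helper: smallest centre-to-centre distance in a layout.
function minDistance(points) {
  let min = Infinity;
  for (let i = 0; i < points.length; i++) {
    for (let j = i + 1; j < points.length; j++) {
      const dx = points[i].x - points[j].x;
      const dy = points[i].y - points[j].y;
      min = Math.min(min, Math.hypot(dx, dy));
    }
  }
  return min;
}

// The x and y values from the result shown above.
const layout = [
  { x: 2, y: 0 },
  { x: 3, y: 9.9498743710662 },
];
const radius = 5;

// Circles of radius 5 do not overlap if their centres are at least 10 apart
// (with a small tolerance for floating-point rounding).
console.log(minDistance(layout) >= 2 * radius - 1e-9);
```

The pairwise loop is quadratic in the number of points, which is fine for a spot check but not for very large datasets.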
A one-sided layout with only positive y-values can be obtained by calling
the `oneSided` method before `calculateYPositions`.
```javascript
let result = new AccurateBeeswarm(data, radius, fn)
    .oneSided()
    .calculateYPositions();
```
An alternative arrangement with ties broken randomly rather than preferring
points with low values (this sometimes reduces the "honeycomb" appearance):
```javascript
let result = new AccurateBeeswarm(data, radius, fn)
    .withTiesBrokenRandomly()
    .calculateYPositions();
```
!A beeswarm plot using random tie-breaking
You can also break ties by preferring items that occur early in the data array.
```javascript
let result = new AccurateBeeswarm(data, radius, fn)
    .withTiesBrokenByArrayOrder()
    .calculateYPositions();
```
You can view an example using the Layer Cake framework
on the Svelte REPL.
The algorithm places data points one by one. At each step, a point that can be
placed as close to the y=0 line as possible is chosen and placed. By default,
ties are broken by choosing points with low x values; if `withTiesBrokenRandomly()`
is used, ties are broken using a random tie breaker which is given to each point
before running the algorithm.
The algorithm uses a priority queue to quickly select the next point to be placed
at each step.
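The greedy strategy described above can be sketched in a few lines. This is a simplified illustration, not the module's actual code: it rescans every unplaced point at each step instead of using a priority queue, so it is much slower, and the name `greedyBeeswarm` and its `(xs, radius)` signature are assumptions made for this example. Ties are broken by low x value, and a tie between a positive and a negative y is resolved in favour of the positive one (also an assumption).

```javascript
// Simplified greedy beeswarm sketch; NOT the library's implementation.
function greedyBeeswarm(xs, radius) {
  const d = 2 * radius;   // minimum centre-to-centre distance
  const eps = 1e-9;       // tolerance for floating-point rounding
  const placed = [];      // points that already have a y position
  const remaining = xs.map((x, index) => ({ x, index }));

  // True if (x, y) is at least d away from every placed point.
  const fits = (x, y) =>
    placed.every(p => (p.x - x) ** 2 + (p.y - y) ** 2 >= d * d - eps);

  // Position closest to y = 0 at which a point with this x fits.
  // The best position is either 0 or one that just touches a placed point.
  const bestY = x => {
    const candidates = [0];
    for (const p of placed) {
      const dx = Math.abs(p.x - x);
      if (dx < d) {
        const dy = Math.sqrt(d * d - dx * dx);
        candidates.push(p.y + dy, p.y - dy);
      }
    }
    return candidates
      .filter(y => fits(x, y))
      .reduce((a, b) => (Math.abs(b) < Math.abs(a) ? b : a));
  };

  while (remaining.length > 0) {
    // Choose the unplaced point that can sit closest to y = 0.
    let best = null;
    for (const pt of remaining) {
      const y = bestY(pt.x);
      const closer = best === null || Math.abs(y) < Math.abs(best.y);
      const tie = best !== null && Math.abs(y) === Math.abs(best.y);
      if (closer || (tie && pt.x < best.x)) {
        best = { ...pt, y };
      }
    }
    placed.push(best);
    remaining.splice(remaining.findIndex(pt => pt.index === best.index), 1);
  }
  // Return points in their original input order.
  return placed.sort((a, b) => a.index - b.index);
}

console.log(greedyBeeswarm([2, 3], 5));
```

For the two-point example from earlier, the first point lands at y = 0 and the second just touches it. The rescanning loop makes this sketch roughly cubic in the number of points, which is the cost a priority queue is meant to avoid.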
If you have more than about 1000 data points, it's helpful to run the algorithm
on the server side to avoid a long-running computation in the browser.
d3-beeswarm is another library
with similar goals. The figure below uses d3-beeswarm to calculate the layout.
To the best of my understanding, d3-beeswarm places the points in the order given,
rather than using a dynamic strategy at each step to choose which point to place.
!A beeswarm plot using d3-beeswarm
Beeswarms can also be produced using force layout, for example using
d3-force, as in the example below and
this Layer Cake example.
A disadvantage of this approach is that it tends to represent values imprecisely.
In the figure below, orange points have x-values that are incorrect by at least 0.5.
!A beeswarm plot using force layout
The beeswarm package for R implements
a number of beeswarm strategies.
And of course, you could just use a histogram!
Issues
and discussions
are on GitHub.